3260 papers • 126 benchmarks • 313 datasets
Small Object Detection is a computer vision task that involves detecting and localizing small objects in images or videos. This task is challenging due to the small size and low resolution of the objects, as well as other factors such as occlusion, background clutter, and variations in lighting conditions. (Image credit: Feature-Fused SSD)
VoVNet also outperforms the widely used ResNet backbone, with faster speed and better energy efficiency, and significantly improves small object detection performance over both DenseNet and ResNet.
This work proposed an architecture with three components, ESRGAN, EEN, and a detection network, trained in an end-to-end manner in which the detector loss is backpropagated into the EESRGAN to improve detection performance.
An open-source framework called Slicing Aided Hyper Inference (SAHI) is proposed that provides a generic slicing aided inference and fine-tuning pipeline for small object detection, and is integrated with Detectron2, MMDetection and YOLOv5 models.
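The slicing step behind such a pipeline can be sketched in plain Python. The window layout below (slice size, overlap ratio, last tile aligned to the image edge) is an illustrative assumption, not SAHI's exact implementation:

```python
def slice_starts(size, slice_size, overlap):
    """Start offsets of overlapping slices along one axis."""
    if slice_size >= size:
        return [0]
    step = max(1, int(slice_size * (1 - overlap)))
    starts = list(range(0, size - slice_size, step))
    if starts[-1] != size - slice_size:
        starts.append(size - slice_size)  # align the last slice to the edge
    return starts

def slice_windows(width, height, slice_w=512, slice_h=512, overlap=0.2):
    """All (x1, y1, x2, y2) slice windows covering the image."""
    return [(x, y, x + slice_w, y + slice_h)
            for y in slice_starts(height, slice_h, overlap)
            for x in slice_starts(width, slice_w, overlap)]

# Run the detector on each crop, shift each predicted box by its window's
# (x1, y1) offset, then merge the shifted predictions with NMS.
windows = slice_windows(1024, 768)
```

Because each small object occupies a much larger fraction of a 512-pixel crop than of the full frame, the detector sees it at an effectively higher resolution.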
This work analyzes the current state-of-the-art model, Mask-RCNN, on a challenging dataset, MS COCO, and shows that the overlap between small ground-truth objects and the predicted anchors is much lower than the expected IoU threshold.
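The mismatch is easy to reproduce arithmetically: even a small ground-truth box perfectly centered inside a typical anchor overlaps it far below the usual 0.5 IoU matching threshold. The box sizes below are illustrative assumptions:

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

anchor = (0, 0, 32, 32)   # a 32x32 anchor
gt = (11, 11, 21, 21)     # a 10x10 small object, centered in the anchor
print(iou(anchor, gt))    # ~0.098, well below a 0.5 matching threshold
```

So a small object can go entirely unmatched at training time even when an anchor sits directly on top of it.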
A method called pixel-level balancing (PLB) takes the number of pixels contained in the detection box as an impact factor characterizing the size of the inspected object, and uses this factor to improve the accuracy of small object detection.
A new dataset obtained from a real CCTV camera installed at a university, together with generated synthetic images, is presented, yielding a weapon detection model trained in two stages that runs in quasi real time on CCTV footage and improves the state of the art in weapon detection.
This work proposes to reduce the dependence on labelled 3D training data by pre-training on large-scale unlabelled outdoor LiDAR point clouds with masked autoencoders (MAE) specifically designed for voxel-based representations of such point clouds.
The rotated bounding box is converted to a 2D Gaussian distribution, which makes it possible to approximate the non-differentiable rotational-IoU-induced loss by the Gaussian Wasserstein distance, which can be learned efficiently via gradient back-propagation.
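As a sketch of the conversion, a rotated box (cx, cy, w, h, θ) maps to a Gaussian with mean (cx, cy) and covariance R·diag(w²/4, h²/4)·Rᵀ, and the squared 2-Wasserstein distance between two Gaussians is ‖m₁ − m₂‖² + Tr(Σ₁ + Σ₂ − 2(Σ₁^(1/2) Σ₂ Σ₁^(1/2))^(1/2)). The NumPy code below is a minimal illustration of that formula, not the paper's loss implementation:

```python
import numpy as np

def box_to_gaussian(cx, cy, w, h, theta):
    """Rotated box -> mean and covariance of its 2D Gaussian."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    cov = R @ np.diag([w ** 2 / 4, h ** 2 / 4]) @ R.T
    return np.array([cx, cy]), cov

def sqrtm_psd(A):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(A)
    return vecs @ np.diag(np.sqrt(np.maximum(vals, 0.0))) @ vecs.T

def gwd2(box1, box2):
    """Squared Gaussian Wasserstein distance between two rotated boxes."""
    m1, S1 = box_to_gaussian(*box1)
    m2, S2 = box_to_gaussian(*box2)
    r1 = sqrtm_psd(S1)
    cross = sqrtm_psd(r1 @ S2 @ r1)
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2 * cross))
```

Unlike rotational IoU, this expression is smooth in all five box parameters, so its gradients are well defined everywhere; note also that it treats a box and its 90°-rotated, axis-swapped twin as identical, as the Gaussian view requires.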
This study explores how the popular YOLOv5 object detector can be modified to improve its performance in detecting smaller objects, with a particular application in autonomous racing.
This work first models bounding boxes as 2D Gaussian distributions and then proposes a new metric dubbed Normalized Wasserstein Distance (NWD), which can easily replace the commonly used IoU metric in the assignment, non-maximum suppression, and loss function of any anchor-based detector.
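For axis-aligned boxes the Gaussians are N((cx, cy), diag(w²/4, h²/4)), so the Wasserstein distance has a closed form and NWD = exp(−W₂/C) for a dataset-dependent constant C. The sketch below is illustrative; the value C = 12.8 is an arbitrary assumption, not the paper's setting:

```python
import math

def wasserstein(b1, b2):
    """2-Wasserstein distance between Gaussians of two (cx, cy, w, h) boxes."""
    (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = b1, b2
    return math.sqrt((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
                     + (w1 - w2) ** 2 / 4 + (h1 - h2) ** 2 / 4)

def nwd(b1, b2, C=12.8):  # C is dataset-dependent; 12.8 is illustrative
    return math.exp(-wasserstein(b1, b2) / C)

a, b = (0, 0, 4, 4), (3, 0, 4, 4)  # tiny boxes offset by most of their width
print(nwd(a, b))                   # ~0.79, while their IoU is only ~0.14
```

Because NWD decays smoothly with center distance instead of dropping to zero once boxes stop overlapping, it gives tiny objects a usable similarity signal during label assignment.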