3260 papers • 126 benchmarks • 313 datasets
Class-agnostic object detection aims to localize objects in images without specifying their categories.
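As a minimal sketch of what "without specifying their categories" means in practice, class-labelled annotations can be collapsed into a single generic class before training or evaluation. The annotation layout below (dicts with `bbox` and `category` keys) and the function name `to_class_agnostic` are assumptions made for illustration, not a format defined by this page.

```python
from typing import Dict, List


def to_class_agnostic(annotations: List[Dict]) -> List[Dict]:
    """Return copies of the annotations with every category set to 'object'.

    The input dicts are assumed (hypothetically) to carry a 'bbox' entry and
    a 'category' entry; the boxes are kept and only the label is collapsed.
    """
    return [{**ann, "category": "object"} for ann in annotations]


if __name__ == "__main__":
    labelled = [
        {"bbox": [0, 0, 10, 10], "category": "cat"},
        {"bbox": [5, 5, 8, 8], "category": "chair"},
    ]
    for ann in to_class_agnostic(labelled):
        print(ann)
```

The original list is left unmodified, so the same annotations can still be used for class-aware evaluation.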
This paper argues that existing methods lack a top-down supervision signal governed by human-understandable semantics, and demonstrates that Multi-modal Vision Transformers (MViT) trained with aligned image-text pairs can effectively bridge this gap.
We introduce MOVE, a novel method to segment objects without any form of supervision. MOVE exploits the fact that foreground objects can be shifted locally relative to their initial position and still result in realistic (undistorted) new images. This property allows us to train a segmentation model on a dataset of images without annotation and to achieve state-of-the-art (SotA) performance on several evaluation datasets for unsupervised salient object detection and segmentation. In unsupervised single object discovery, MOVE gives an average CorLoc improvement of 7.2% over the SotA, and in unsupervised class-agnostic object detection it gives a relative AP improvement of 53% on average. Our approach is built on top of self-supervised features (e.g., from DINO or MAE), an inpainting network (based on the Masked AutoEncoder), and adversarial training.
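For readers unfamiliar with the metrics quoted above, the sketch below computes CorLoc in its usual form: the fraction of images whose single predicted box overlaps at least one ground-truth box with IoU of 0.5 or more. It also shows how a relative improvement differs from an absolute one. The function names, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold default are assumptions for illustration and are not taken from the MOVE code.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def corloc(preds: Sequence[Box], gts: Sequence[List[Box]], thr: float = 0.5) -> float:
    """Fraction of images whose predicted box matches any ground-truth box."""
    if len(preds) != len(gts):
        raise ValueError("need exactly one prediction per image")
    if not preds:
        return 0.0
    correct = sum(
        1 for p, boxes in zip(preds, gts) if any(iou(p, g) >= thr for g in boxes)
    )
    return correct / len(preds)


def relative_improvement(new: float, old: float) -> float:
    """Relative gain of `new` over `old`, e.g. 0.53 means 53% relative."""
    return (new - old) / old
```

A relative improvement depends on the baseline value, so a large relative gain in AP can correspond to a small absolute gain when the baseline AP is low.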
This work proposes Detecting Every Object in Events (DEOE), an approach aimed at achieving high-speed, class-agnostic object detection in event-based vision, and introduces a disentangled objectness head to separate the foreground-background classification and novel object discovery tasks.
The Dispersing Prompt Expansion (DiPEx) approach progressively learns to expand a set of distinct, non-overlapping hyperspherical prompts to enhance recall rates, thereby improving performance in downstream tasks such as out-of-distribution object detection (OD).
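One simple way to reason about whether a set of prompts is "distinct" on a hypersphere is to project each prompt vector onto the unit sphere and look at the smallest angle between any pair. The sketch below does only that; it is an illustrative diagnostic under that assumption, not the DiPEx training procedure, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def min_pairwise_angle(prompts: Sequence[Sequence[float]]) -> float:
    """Smallest angle in radians between any two normalized prompts.

    Returns pi when fewer than two prompts are given, since there is
    no pair to compare.
    """
    units = [normalize(p) for p in prompts]
    smallest = math.pi
    for i in range(len(units)):
        for j in range(i + 1, len(units)):
            cos = sum(a * b for a, b in zip(units[i], units[j]))
            cos = max(-1.0, min(1.0, cos))  # guard against rounding error
            smallest = min(smallest, math.acos(cos))
    return smallest
```

A larger minimum angle means the prompts are more spread out; a value near zero indicates that at least two prompts point in almost the same direction.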
This work uses geometric cues to train an object proposal network that pseudo-labels unannotated novel objects in the training set. The approach significantly improves detection recall for novel object categories and performs well even with only a few training classes.
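The pseudo-labeling step described above can be sketched generically: keep proposals that are confident and that do not overlap a box that is already annotated, then treat the survivors as extra training boxes. The thresholds, the `(x1, y1, x2, y2)` box layout, and the function name `select_pseudo_labels` are assumptions for illustration; the paper's actual selection rule may differ.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_labels(
    proposals: List[Tuple[Box, float]],
    annotated: List[Box],
    score_thr: float = 0.7,  # hypothetical confidence cut-off
    iou_thr: float = 0.5,    # hypothetical overlap cut-off
) -> List[Box]:
    """Keep confident proposals that do not duplicate an annotated box."""
    kept = []
    for box, score in proposals:
        if score < score_thr:
            continue  # too uncertain to trust as a label
        if any(iou(box, a) >= iou_thr for a in annotated):
            continue  # already covered by a human annotation
        kept.append(box)
    return kept
```

The selected boxes would then be added to the training set as class-agnostic "object" labels alongside the original annotations.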