This paper questions whether self-supervised learning provides Vision Transformers (ViT) with new properties that stand out compared to convolutional networks (convnets), and introduces DINO, a form of self-distillation with no labels, highlighting the synergy between DINO and ViTs.
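Below is a minimal sketch of what "self-distillation with no labels" means in this setting: a student and a teacher with the same architecture see different augmented views, the student is trained to match the teacher's softmax output, and the teacher is updated only as an exponential moving average of the student. This is an illustrative assumption-laden toy, not the official DINO implementation (which additionally uses multi-crop augmentation and output centering); the tiny projection head and feature inputs here are placeholders.

```python
# Minimal sketch of label-free self-distillation in the spirit of DINO.
# The small MLP and the "views" (pre-extracted features) are placeholders,
# not the official facebookresearch/dino code.
import copy
import torch
import torch.nn.functional as F

def self_distillation_loss(student_out, teacher_out, tau_s=0.1, tau_t=0.04):
    """Cross-entropy between the teacher's and student's output distributions."""
    t = F.softmax(teacher_out / tau_t, dim=-1).detach()   # stop-gradient on teacher
    log_s = F.log_softmax(student_out / tau_s, dim=-1)
    return -(t * log_s).sum(dim=-1).mean()

# Student and teacher share the same architecture; the teacher receives no
# gradients, only an exponential moving average (EMA) of the student weights.
student = torch.nn.Sequential(torch.nn.Linear(384, 256), torch.nn.GELU(),
                              torch.nn.Linear(256, 4096))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad = False

opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

def training_step(view1, view2, momentum=0.996):
    # view1 / view2: batches of features for two augmentations of the same images
    loss = 0.5 * (self_distillation_loss(student(view1), teacher(view2))
                  + self_distillation_loss(student(view2), teacher(view1)))
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():  # EMA update of the teacher
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(momentum).add_(ps, alpha=1 - momentum)
    return loss.item()
```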
This work proposes a simple approach, LOST, that leverages the activation features of a vision transformer pre-trained in a self-supervised manner and outperforms state-of-the-art object discovery methods by up to 8 CorLoc points on PASCAL VOC 2012.
This work focuses on the unsupervised discovery and matching of object categories among images in a collection, and shows that the original approach can be reformulated and solved as a proper optimization problem.
A novel saliency-based region proposal algorithm is proposed that achieves significantly higher overlap with ground-truth objects than other competitive methods and exploits the inherent hierarchical structure of proposals as an effective regularizer for the approach to object discovery.
This work proposes a novel formulation of unsupervised object discovery (UOD) as a ranking problem, amenable to the arsenal of distributed methods available for eigenvalue problems and link analysis, and demonstrates the first effective fully unsupervised pipeline for UOD.
A graph-based method is presented that uses self-supervised transformer features to discover an object in an image via spectral clustering with generalized eigen-decomposition, showing that the second smallest eigenvector provides a cutting solution, since its absolute value indicates the likelihood that a token belongs to a foreground object.
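A rough sketch of the spectral cut described above follows: build an affinity graph over ViT patch tokens from their feature similarities, solve the generalized eigenvalue problem on the graph Laplacian, and split tokens using the second smallest eigenvector. The feature array, similarity threshold, and the rule for picking the foreground side are illustrative assumptions, not the paper's exact settings.

```python
# Sketch of a spectral foreground/background cut over ViT patch tokens.
# `patch_feats` is a hypothetical (num_tokens, dim) array of self-supervised
# ViT features; the similarity threshold and foreground rule are illustrative.
import numpy as np
from scipy.linalg import eigh

def spectral_foreground_cut(patch_feats, sim_threshold=0.2):
    f = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    W = f @ f.T                                   # cosine similarity between tokens
    W = np.where(W > sim_threshold, 1.0, 1e-5)    # sparsified affinity graph
    D = np.diag(W.sum(axis=1))                    # degree matrix
    L = D - W                                     # unnormalized graph Laplacian
    # Generalized eigenproblem L v = lambda D v; eigenvalues returned ascending.
    eigvals, eigvecs = eigh(L, D)
    fiedler = eigvecs[:, 1]                       # second smallest eigenvector
    # Tokens on one side of the cut are taken as foreground; which side contains
    # the object is decided here by a simple mean threshold for illustration.
    return fiedler > fiedler.mean()

# Usage with assumed shapes: a 14x14 grid of 384-dim patch features.
# mask = spectral_foreground_cut(np.random.randn(196, 384)).reshape(14, 14)
```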
We introduce MOVE, a novel method to segment objects without any form of supervision. MOVE exploits the fact that foreground objects can be shifted locally relative to their initial position and still result in realistic (undistorted) new images. This property allows us to train a segmentation model on a dataset of images without annotation and to achieve state-of-the-art (SotA) performance on several evaluation datasets for unsupervised salient object detection and segmentation. In unsupervised single object discovery, MOVE gives an average CorLoc improvement of 7.2% over the SotA, and in unsupervised class-agnostic object detection it gives a relative AP improvement of 53% on average. Our approach is built on top of self-supervised features (e.g. from DINO or MAE), an inpainting network (based on the Masked AutoEncoder) and adversarial training.
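To make the shift-and-composite idea concrete, here is a minimal sketch of the step the abstract describes: the predicted foreground is cut out, the hole is filled by an inpainting network, and the foreground is pasted back at a shifted position; a realism signal on the composite can then supervise the mask. The `inpaint` and mask inputs are hypothetical stand-ins, not the authors' implementation.

```python
# Minimal sketch of the MOVE-style shift-and-composite step.
# `inpaint` is a hypothetical stand-in for the MAE-based inpainting network,
# and `mask` for the predicted foreground; shapes and values are illustrative.
import numpy as np

def shift_and_composite(image, mask, inpaint, dx, dy):
    """image: (H, W, 3) float array in [0, 1]; mask: (H, W) soft mask in [0, 1]."""
    background = inpaint(image, mask)                              # fill the foreground hole
    shifted_mask = np.roll(mask, shift=(dy, dx), axis=(0, 1))
    shifted_fg = np.roll(image * mask[..., None], shift=(dy, dx), axis=(0, 1))
    # Paste the shifted foreground over the inpainted background.
    return shifted_fg + background * (1.0 - shifted_mask[..., None])

# During training, an adversarial critic would score the composite for realism,
# providing a learning signal for the mask predictor without any labels.
```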