3260 papers • 126 benchmarks • 313 datasets
Models that learn to segment each image (i.e. assign a class to every pixel) without seeing the ground truth labels. (Image credit: SegSort: Segmentation by Discriminative Sorting of Segments)
These leaderboards are used to track progress in Unsupervised Semantic Segmentation.
Use these libraries to find Unsupervised Semantic Segmentation models and implementations.
This work presents DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features, outperforming the previous state of the art by a significant margin on all the standard benchmarks.
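A minimal, illustrative sketch of that alternating scheme is shown below (not the reference implementation): it assumes a feature-extracting backbone, a single linear classification head, and a loader of unlabelled images, all of which are placeholders.

```python
# Minimal sketch of the alternating DeepCluster-style scheme (not the authors' code).
# Assumptions: `backbone` maps images to feature vectors, `classifier` is a single
# nn.Linear head, and `loader` yields batches of unlabelled images.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def deepcluster_epoch(backbone, classifier, loader, k=100, device="cpu"):
    # Step 1: extract features with the current network and cluster them.
    backbone.eval()
    batches, feats = [], []
    with torch.no_grad():
        for x in loader:
            batches.append(x)
            feats.append(backbone(x.to(device)).cpu())
    pseudo = KMeans(n_clusters=k, n_init=10).fit_predict(torch.cat(feats).numpy())

    # Step 2: treat the cluster assignments as pseudo-labels and train on them,
    # re-initialising the classification head because the assignments change each epoch.
    classifier.reset_parameters()
    params = list(backbone.parameters()) + list(classifier.parameters())
    opt = torch.optim.SGD(params, lr=0.05, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    backbone.train()
    start = 0
    for x in batches:
        y = torch.as_tensor(pseudo[start:start + len(x)], dtype=torch.long, device=device)
        start += len(x)
        loss = loss_fn(classifier(backbone(x.to(device))), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```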
A novel clustering objective is presented that learns a neural network classifier from scratch, given only unlabelled data samples, and discovers clusters that accurately match semantic classes, achieving state-of-the-art results in eight unsupervised clustering benchmarks spanning image classification and segmentation.
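A hedged sketch of the kind of objective this describes is given below: it maximises the mutual information between the soft cluster assignments of an image and of an augmented view, which is the standard way such a clustering objective is written; the tensor shapes and function name are assumptions, not the paper's code.

```python
# Hedged sketch of a mutual-information clustering objective of this kind.
# p and p_aug are (batch, k) softmax cluster assignments for an image and an
# augmented view of the same image.
import torch

def mutual_info_clustering_loss(p, p_aug, eps=1e-8):
    joint = (p.unsqueeze(2) * p_aug.unsqueeze(1)).mean(dim=0)  # (k, k) joint over the batch
    joint = ((joint + joint.t()) / 2).clamp(min=eps)           # symmetrise for stability
    marg_i = joint.sum(dim=1, keepdim=True)                    # marginal of the first view
    marg_j = joint.sum(dim=0, keepdim=True)                    # marginal of the second view
    # Negative mutual information I(z; z'), to be minimised.
    return -(joint * (joint.log() - marg_i.log() - marg_j.log())).sum()
```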
The proposed approach dynamically sets different confidence thresholds according to the prediction variance, rectifies learning from noisy pseudo labels, and achieves significant improvements over conventional pseudo-label learning, yielding competitive performance on all three benchmarks.
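The following is an illustrative sketch of variance-aware pseudo-label learning under assumed inputs (two stochastic forward passes over the same batch); it is not the paper's code, and the exact weighting scheme is a simplification of the idea.

```python
# Illustrative sketch: pixels where two stochastic forward passes disagree are
# down-weighted instead of being filtered by a single fixed confidence threshold.
import torch
import torch.nn.functional as F

def variance_rectified_loss(logits_a, logits_b):
    # logits_a, logits_b: (B, C, H, W) predictions from two passes over the same batch.
    p_a = logits_a.softmax(dim=1)
    pseudo = p_a.argmax(dim=1)                                  # hard pseudo-labels
    variance = F.kl_div(logits_b.log_softmax(dim=1), p_a,
                        reduction="none").sum(dim=1)            # per-pixel disagreement
    weight = torch.exp(-variance)                               # trust stable pixels more
    ce = F.cross_entropy(logits_b, pseudo, reduction="none")    # (B, H, W)
    return (weight * ce).mean() + variance.mean()               # regularise the variance
```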
This work proposes a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to help the research progress and presents a simple yet effective method that works surprisingly well for LUSS.
A two-step framework is proposed that adopts a predetermined mid-level prior in a contrastive optimization objective to learn pixel embeddings; the work argues for the importance of a prior that contains information about objects or their parts, and discusses several ways to obtain such a prior in an unsupervised manner.
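Below is a rough sketch, under stated assumptions, of how a mid-level object prior can enter a pixel-level contrastive objective: pixels inside an assumed saliency-style mask are pulled toward that mask's mean embedding and pushed away from object prototypes from other images. All names and shapes are illustrative, not the paper's implementation.

```python
# Rough sketch of a pixel-level contrastive loss driven by an object prior.
import torch
import torch.nn.functional as F

def prior_contrast_loss(pixel_emb, mask, object_prototypes, temperature=0.5):
    # pixel_emb: (C, H, W) embeddings for one image; mask: (H, W) binary object prior;
    # object_prototypes: (M, C) mean object embeddings from other images (negatives).
    emb = F.normalize(pixel_emb.flatten(1), dim=0)               # (C, H*W)
    fg = emb[:, mask.flatten() > 0]                              # pixels inside the prior
    positive = F.normalize(fg.mean(dim=1, keepdim=True), dim=0)  # (C, 1) this object's mean
    negatives = F.normalize(object_prototypes, dim=1)            # (M, C)
    logits = torch.cat([positive.t() @ fg, negatives @ fg], dim=0) / temperature
    target = torch.zeros(fg.shape[1], dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits.t(), target)                   # positive is class 0
```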
PiCIE (Pixel-level feature Clustering using Invariance and Equivariance) is the first method capable of segmenting both things and stuff categories without any hyperparameter tuning or task-specific pre-processing.
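A minimal sketch of a within-view plus cross-view clustering loss, in the spirit of the invariance/equivariance idea named above, is shown below; the centroids are assumed to come from k-means over the pixel features of each augmented view, and all names and shapes are placeholders rather than the reference code.

```python
# Minimal sketch: two spatially aligned augmented views supervise each other's
# cluster assignments. feat_1, feat_2: (N, C) pixel embeddings; centroids_1,
# centroids_2: (K, C) k-means centres computed over the corresponding view.
import torch
import torch.nn.functional as F

def cross_view_cluster_loss(feat_1, feat_2, centroids_1, centroids_2):
    d1 = torch.cdist(feat_1, centroids_1)            # (N, K) distances act as negative logits
    d2 = torch.cdist(feat_2, centroids_2)
    y1, y2 = d1.argmin(dim=1), d2.argmin(dim=1)      # per-pixel cluster assignments
    within = F.cross_entropy(-d1, y1) + F.cross_entropy(-d2, y2)
    cross = F.cross_entropy(-d1, y2) + F.cross_entropy(-d2, y1)  # views supervise each other
    return within + cross
```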
STEGO is a novel framework that distills unsupervised features into high-quality discrete semantic labels, encouraging features to form compact clusters while preserving their relationships across the corpora.
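A simplified sketch of the correspondence-distillation idea is given below: pixel-pixel similarities under a frozen self-supervised backbone supervise the similarities of a low-dimensional segmentation code. The shift hyperparameter and tensor names are assumptions, and the full loss contains further pairing terms not shown here.

```python
# Simplified sketch of correspondence distillation (names and values illustrative).
import torch
import torch.nn.functional as F

def correspondence_distillation_loss(backbone_feats, seg_code, shift=0.1):
    # backbone_feats: (B, C, H, W) frozen self-supervised features (e.g. from DINO).
    # seg_code:       (B, D, H, W) low-dimensional code from the trainable segmentation head.
    f = F.normalize(backbone_feats.flatten(2), dim=1)   # (B, C, N)
    s = F.normalize(seg_code.flatten(2), dim=1)         # (B, D, N)
    f_corr = torch.einsum("bcn,bcm->bnm", f, f)         # pixel-pixel similarity, backbone
    s_corr = torch.einsum("bdn,bdm->bnm", s, s)         # pixel-pixel similarity, distilled code
    # Pairs similar under the backbone (above the shift) should stay similar in the
    # distilled code, while dissimilar pairs are pushed apart.
    return -((f_corr - shift) * s_corr.clamp(min=0)).mean()
```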
This work leverages the retrieval abilities of a language-image pre-trained model, CLIP, to dynamically curate training sets from unlabelled images for arbitrary collections of concept names, and leverages the robust correspondences offered by modern image representations to co-segment entities among the resulting collections.
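An illustrative version of the retrieval step, written against the public OpenAI CLIP package, is sketched below; the concept list, image paths, and retrieval size are placeholders, and the subsequent co-segmentation step is not shown.

```python
# Illustrative CLIP-based curation of per-concept training sets from unlabelled images.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

concepts = ["dog", "bicycle", "traffic light"]          # arbitrary concept names
with torch.no_grad():
    text_feats = model.encode_text(clip.tokenize(concepts).to(device))
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)

def curate_training_sets(image_paths, top_k=100):
    """Return, per concept, the indices of the unlabelled images most similar to it."""
    feats = []
    with torch.no_grad():
        for path in image_paths:
            image = preprocess(Image.open(path)).unsqueeze(0).to(device)
            f = model.encode_image(image)
            feats.append(f / f.norm(dim=-1, keepdim=True))
    feats = torch.cat(feats)                            # (num_images, embed_dim)
    sims = feats @ text_feats.t()                       # image-to-concept similarities
    k = min(top_k, len(image_paths))
    return {c: sims[:, i].topk(k).indices.tolist() for i, c in enumerate(concepts)}
```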
This work proposes SILOP, a framework that introduces an additional module using object perimeters for improved saliency, and conducts an exhaustive analysis to illustrate that it enhances existing state-of-the-art frameworks for image-level-based semantic segmentation.
Adding a benchmark result helps the community track progress.