3260 papers • 126 benchmarks • 313 datasets
Universal segmentation is a challenging computer vision task that aims to segment an image into semantically meaningful regions, regardless of the segmentation task (semantic, instance, or panoptic) or the domain. It requires a model to learn a wide range of visual concepts and to generalize to new tasks and domains.
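As a concrete illustration, a single pre-trained universal model can produce task-specific output from one set of weights. The sketch below runs panoptic inference with Mask2Former through the Hugging Face `transformers` API; the checkpoint name is one publicly released example, not the only option, and `torch`, `transformers`, `pillow`, and `requests` are assumed to be installed.

```python
# Minimal sketch: one pre-trained universal model, queried for panoptic output.
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

ckpt = "facebook/mask2former-swin-tiny-coco-panoptic"  # one public example checkpoint
processor = AutoImageProcessor.from_pretrained(ckpt)
model = Mask2FormerForUniversalSegmentation.from_pretrained(ckpt)

with torch.no_grad():
    outputs = model(**processor(images=image, return_tensors="pt"))

# Post-process the raw mask/class predictions into a panoptic segmentation map
# plus per-segment metadata.
result = processor.post_process_panoptic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
print(result["segmentation"].shape)  # (height, width) map of segment ids
for seg in result["segments_info"]:
    print(seg["id"], model.config.id2label[seg["label_id"]], seg["score"])
```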
This work proposes OneFormer, a universal image segmentation framework with a multi-task train-once design; a single OneFormer model outperforms specialized Mask2Former models across all three segmentation tasks on ADE20K, Cityscapes, and COCO.
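The train-once idea is visible in the public OneFormer checkpoints: the same weights serve all three tasks, selected at inference time by a task token. A minimal sketch via Hugging Face `transformers`, assuming the `shi-labs/oneformer_ade20k_swin_tiny` checkpoint (one released example):

```python
# Minimal sketch of OneFormer's task-conditioned inference: one set of weights,
# three tasks, switched by the task token passed as `task_inputs`.
import requests
from PIL import Image
from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation

image = Image.open(
    requests.get("http://images.cocodataset.org/val2017/000000039769.jpg",
                 stream=True).raw
)

ckpt = "shi-labs/oneformer_ade20k_swin_tiny"
processor = OneFormerProcessor.from_pretrained(ckpt)
model = OneFormerForUniversalSegmentation.from_pretrained(ckpt)

for task in ("semantic", "instance", "panoptic"):
    inputs = processor(images=image, task_inputs=[task], return_tensors="pt")
    outputs = model(**inputs)
    # Each task applies its own post-processing to the shared mask/class outputs.
    if task == "semantic":
        pred = processor.post_process_semantic_segmentation(
            outputs, target_sizes=[image.size[::-1]])[0]
    elif task == "instance":
        pred = processor.post_process_instance_segmentation(
            outputs, target_sizes=[image.size[::-1]])[0]
    else:
        pred = processor.post_process_panoptic_segmentation(
            outputs, target_sizes=[image.size[::-1]])[0]
    print(task, type(pred))
```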
Mask2Former is presented, a new architecture capable of addressing any image segmentation task (panoptic, instance, or semantic), and it sets a new state of the art for panoptic, instance, and semantic segmentation.
This work modifies the traditional binary training targets to include three classes for direct instance segmentation, and finds that segmentation performance can benefit from a group normalization layer and an Atrous Spatial Pyramid Pooling (ASPP) module, thanks to their more reliable statistics estimation and improved semantic understanding.
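To make the normalization and context points concrete, below is a generic ASPP block that uses GroupNorm in place of BatchNorm. This is a standard construction, not this paper's exact code; channel counts, group count, and dilation rates are illustrative.

```python
# Generic ASPP block with GroupNorm: GroupNorm's statistics do not depend on
# batch size (useful for the small batches common in segmentation), while the
# parallel dilated convolutions aggregate multi-scale context.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    def __init__(self, in_ch=256, out_ch=256, dilations=(1, 6, 12, 18), groups=32):
        super().__init__()
        self.branches = nn.ModuleList()
        for d in dilations:
            self.branches.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3 if d > 1 else 1,
                          padding=d if d > 1 else 0, dilation=d, bias=False),
                nn.GroupNorm(groups, out_ch),
                nn.ReLU(inplace=True),
            ))
        # Image-level pooling branch for global context.
        self.pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.GroupNorm(groups, out_ch),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Sequential(
            nn.Conv2d(out_ch * (len(dilations) + 1), out_ch, 1, bias=False),
            nn.GroupNorm(groups, out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        feats.append(F.interpolate(self.pool(x), size=(h, w),
                                   mode="bilinear", align_corners=False))
        return self.project(torch.cat(feats, dim=1))

x = torch.randn(2, 256, 32, 32)
print(ASPP()(x).shape)  # torch.Size([2, 256, 32, 32])
```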
This work proposes a novel decoding mechanism that enables diverse prompting for all types of segmentation tasks, aiming at a universal segmentation interface that behaves like large language models (LLMs).
A prompt-driven Universal Segmentation model (UniSeg) for multi-task medical image segmentation across diverse modalities and domains is proposed; it outperforms other universal models and single-task models on 11 upstream tasks and beats other pre-trained models on two downstream datasets.
These innovations closely link CLUSTSEG to EM clustering and make it a transparent, powerful framework that yields superior results across the segmentation tasks it addresses.
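The EM connection can be illustrated with a generic soft-clustering loop over pixel embeddings: the E-step soft-assigns pixels to cluster centers (queries), and the M-step recomputes the centers as assignment-weighted means. This is a generic illustration of the clustering view of mask prediction, not CLUSTSEG's actual implementation; all names and shapes are hypothetical.

```python
# Generic EM-style soft clustering of pixel embeddings: E-step soft-assigns
# pixels to centers, M-step updates centers as assignment-weighted means.
import torch

def em_cluster(pixels, centers, iters=3, tau=0.1):
    """pixels: (N, D) embeddings; centers: (K, D) initial cluster centers."""
    for _ in range(iters):
        # E-step: responsibilities of each center for each pixel.
        logits = pixels @ centers.t() / tau          # (N, K) scaled similarities
        assign = logits.softmax(dim=1)               # (N, K) soft assignments
        # M-step: centers become assignment-weighted means of the pixels.
        centers = (assign.t() @ pixels) / (assign.sum(dim=0, keepdim=True).t() + 1e-6)
    return assign, centers

pixels = torch.randn(64 * 64, 128)                   # flattened feature map (hypothetical)
centers = torch.randn(8, 128)                        # e.g., 8 object queries
assign, centers = em_cluster(pixels, centers)
masks = assign.argmax(dim=1).reshape(64, 64)         # hard masks from soft assignments
print(masks.shape)
```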
The resulting model, named HIPIE, tackles HIerarchical, oPen-vocabulary, and unIvErsal segmentation tasks within a unified framework and achieves state-of-the-art results at various levels of image comprehension.
An Unsupervised Universal Segmentation model (U2Seg), adept at performing various image segmentation tasks (instance, semantic, and panoptic) using a novel unified framework, is proposed; it sets a new baseline for unsupervised panoptic segmentation, which had not previously been explored.
This paper uses a universal segmentation method as the visual encoder, integrating image information, perception priors, and visual prompts into visual tokens provided to the LLM, and proposes a perception prior embedding to better fuse perception priors with image features.
UniLSeg is presented, a universal segmentation model that can perform segmentation at any semantic level under the guidance of language instructions, surpassing both specialist and other unified segmentation models.