3260 papers • 126 benchmarks • 313 datasets
Zero-shot segmentation aims to segment objects or regions belonging to categories not seen during training, typically by transferring knowledge from semantic word embeddings or pre-trained vision-language models.
These leaderboards are used to track progress in Zero-Shot Segmentation.
Use these libraries to find Zero-Shot Segmentation models and implementations.
No subtasks available.
An open-set object detector, called Grounding DINO, is presented by marrying the Transformer-based detector DINO with grounded pre-training; it can detect arbitrary objects given human inputs such as category names or referring expressions, and performs remarkably well across all three evaluation settings (closed-set, open-set, and referring object detection).
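As a hedged illustration, the sketch below runs Grounding DINO through its Hugging Face transformers integration; the model id, thresholds, and image path are assumptions, and post-processing signatures vary slightly across transformers versions.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, GroundingDinoForObjectDetection

# Assumed model id and image path; any RGB image works.
model_id = "IDEA-Research/grounding-dino-tiny"
processor = AutoProcessor.from_pretrained(model_id)
model = GroundingDinoForObjectDetection.from_pretrained(model_id)

image = Image.open("example.jpg").convert("RGB")
# Grounding DINO expects lower-cased phrases, each terminated by a period.
text = "a cat. a remote control."

inputs = processor(images=image, text=text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Map raw outputs back to boxes and matched phrases above the chosen thresholds.
results = processor.post_process_grounded_object_detection(
    outputs,
    inputs.input_ids,
    box_threshold=0.35,
    text_threshold=0.25,
    target_sizes=[image.size[::-1]],  # (height, width)
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(f"{label}: {score:.2f} at {[round(v) for v in box.tolist()]}")
```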
This work proposes a system that can generate image segmentations based on arbitrary prompts at test time; it builds upon the CLIP model as a backbone, which it extends with a transformer-based decoder that enables dense prediction.
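A minimal sketch of this prompt-driven setup, using the CLIPSeg checkpoint published on the Hugging Face Hub; the image path and prompts are placeholders.

```python
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

image = Image.open("example.jpg").convert("RGB")  # placeholder path
prompts = ["a glass", "something to drink from"]  # arbitrary text prompts

# One copy of the image per prompt; the decoder emits a dense logit map for each.
inputs = processor(text=prompts, images=[image] * len(prompts),
                   padding=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

masks = torch.sigmoid(outputs.logits)  # (num_prompts, 352, 352) probability maps
print(masks.shape)
```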
This paper presents Side Adapter Network (SAN), a new framework for open-vocabulary semantic segmentation with a pre-trained vision-language model; it significantly outperforms counterpart methods with up to 18 times fewer trainable parameters and 19 times faster inference.
OpenSeeD is the first to explore the potential of joint training on segmentation and detection, and the authors hope it can serve as a strong baseline for developing a single model for both tasks in the open world.
This paper proposes CaGNet, a novel context-aware feature generation method for zero-shot segmentation that inserts a contextual module into a segmentation network to capture pixel-wise contextual information, guiding the generation of more diverse, context-aware features from semantic word embeddings.
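To make the idea concrete, here is an illustrative toy sketch (not the authors' code): a generator that synthesizes a pixel-level visual feature from a class word embedding concatenated with a per-pixel context code; all dimensions and layer choices are assumptions.

```python
import torch
import torch.nn as nn

class ContextAwareGenerator(nn.Module):
    """Toy generator: word embedding + context code -> synthesized pixel feature."""
    def __init__(self, word_dim=300, ctx_dim=64, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(word_dim + ctx_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, feat_dim),
        )

    def forward(self, word_emb, context):
        # word_emb: (N, word_dim) semantic embedding of a (possibly unseen) class
        # context:  (N, ctx_dim) per-pixel context code from a contextual module
        return self.net(torch.cat([word_emb, context], dim=-1))

gen = ContextAwareGenerator()
fake_feat = gen(torch.randn(8, 300), torch.randn(8, 64))  # synthesized features
print(fake_feat.shape)  # torch.Size([8, 256])
```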
The proposed CLIP Surgery is a method that enables surgery-like modifications to the inference architecture and features for better explainability and enhancement across multiple open-vocabulary tasks, demonstrating remarkable improvements in open-vocabulary segmentation and multi-label recognition.
An extensive evaluation of the Segment Anything Model's ability to segment medical images, conducted on a collection of 19 medical imaging datasets from various modalities and anatomies, concludes that SAM shows impressive zero-shot segmentation performance on certain medical imaging datasets but moderate to poor performance on others.
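Such evaluations typically prompt SAM with points or boxes around the target structure. A minimal sketch with the official segment-anything package follows; the checkpoint path, dummy image, and box coordinates are placeholders.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Placeholder checkpoint path; weights must be downloaded separately.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

# Stand-in for an RGB-converted scan slice (SAM expects HxWx3 uint8).
image = np.zeros((512, 512, 3), dtype=np.uint8)
predictor.set_image(image)

# A rough bounding box around the anatomy of interest, in XYXY format.
box = np.array([100, 100, 400, 400])
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
print(masks.shape, scores)  # (1, 512, 512) boolean mask and its predicted IoU
```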
A new paradigm toward universal medical image segmentation, termed ‘One-Prompt Segmentation,’ is introduced; it combines the strengths of one-shot and interactive methods and can adeptly handle unseen tasks in a single forward pass.
A learnable High-Quality Output Token is injected into SAM's mask decoder and is responsible for predicting the high-quality mask; this design reuses and preserves SAM's pre-trained weights while introducing only minimal additional parameters and computation.
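A hedged usage sketch based on the authors' sam-hq fork, which keeps the segment-anything interface and adds an hq_token_only flag; the checkpoint name, image, and box are placeholders and may differ across releases.

```python
import numpy as np
# The sam-hq fork re-exports the original segment_anything interface.
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_l"](checkpoint="sam_hq_vit_l.pth")  # HQ-SAM weights
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)  # placeholder RGB image
predictor.set_image(image)

masks, scores, _ = predictor.predict(
    box=np.array([50, 50, 300, 300]),
    multimask_output=False,
    hq_token_only=True,  # take the mask predicted by the High-Quality Output Token
)
print(masks.shape)
```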
This paper proposes an alternative strategy that combines a conventional probabilistic atlas-based segmentation with deep learning, enabling one to train a segmentation model for new MRI scans without the need for any manually segmented images.
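One way to read this combination is as Bayesian fusion of an atlas prior with network likelihoods; the toy sketch below illustrates that reading with random tensors and is not the paper's implementation.

```python
import torch

num_classes, H, W = 4, 128, 128
# p(label) per pixel from a probabilistic atlas (random stand-in here).
atlas_prior = torch.softmax(torch.randn(num_classes, H, W), dim=0)
# Per-pixel class likelihoods from a segmentation network (random stand-in).
net_likelihood = torch.softmax(torch.randn(num_classes, H, W), dim=0)

# Fuse prior and likelihood, then renormalize per pixel.
posterior = atlas_prior * net_likelihood
posterior = posterior / posterior.sum(dim=0, keepdim=True)
segmentation = posterior.argmax(dim=0)  # (H, W) hard label map
print(segmentation.shape)
```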