3260 papers • 126 benchmarks • 313 datasets
Zero-shot segmentation aims to segment object categories that were not seen during training, typically by transferring knowledge from semantic word embeddings or pre-trained vision-language models.
These leaderboards are used to track progress in Zero-Shot Segmentation.
An open-set object detector, Grounding DINO, is presented by marrying the Transformer-based detector DINO with grounded pre-training; it can detect arbitrary objects from human inputs such as category names or referring expressions and performs remarkably well across all three evaluation settings.
This work proposes a system that can generate image segmentations from arbitrary prompts at test time; it builds on the CLIP model as a backbone, extended with a transformer-based decoder that enables dense prediction.
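The prompt-conditioned dense-prediction idea can be illustrated with a toy sketch: score per-patch image features against a prompt embedding by cosine similarity, then threshold into a mask. The features, threshold, and function name below are illustrative only, not the model's actual decoder.

```python
import numpy as np

def prompt_to_mask(patch_feats, prompt_feat, thresh=0.5):
    """Toy prompt-conditioned segmentation: score each image patch
    by cosine similarity to the prompt embedding, then threshold."""
    p = patch_feats / np.linalg.norm(patch_feats, axis=-1, keepdims=True)
    t = prompt_feat / np.linalg.norm(prompt_feat)
    scores = p @ t                      # cosine similarity per patch, shape (H, W)
    return (scores > thresh).astype(np.uint8)

# Synthetic example: a 4x4 grid of 3-d patch embeddings where the
# top-left quadrant happens to align with the "prompt" direction.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 4, 3))
feats[:2, :2] = [1.0, 0.0, 0.0]
mask = prompt_to_mask(feats, np.array([1.0, 0.0, 0.0]))
```

In the real model the prompt embedding comes from a text (or image) encoder and the dense decoder is learned; the point here is only that an arbitrary prompt vector selects which pixels are foreground.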
This paper presents a new framework for open-vocabulary semantic segmentation with a pre-trained vision-language model, named Side Adapter Network (SAN), which significantly outperforms competing methods while using up to 18 times fewer trainable parameters and running up to 19 times faster at inference.
This paper proposes a novel context-aware feature generation method for zero-shot segmentation named CaGNet, which inserts a contextual module into a segmentation network to capture pixel-wise contextual information that guides the generation of more diverse, context-aware features from semantic word embeddings.
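The core mechanism, generating per-pixel features from a shared class word embedding plus each pixel's own contextual code, can be sketched minimally as below. The shapes and the random linear "generator" are made up for illustration; the real method uses a trained generator inside a segmentation network.

```python
import numpy as np

def generate_features(word_emb, context, W):
    """Toy context-aware feature generation: every pixel shares the
    same class word embedding, but its own contextual code makes the
    generated feature vary from pixel to pixel."""
    n = context.shape[0]
    inp = np.hstack([np.tile(word_emb, (n, 1)), context])  # (n, d_w + d_c)
    return inp @ W                                         # (n, d_f)

rng = np.random.default_rng(2)
word = rng.normal(size=4)        # one class word embedding, shared by all pixels
ctx = rng.normal(size=(6, 3))    # per-pixel contextual codes
W = rng.normal(size=(7, 5))      # untrained stand-in for the generator
feats = generate_features(word, ctx, W)
```

Because the contextual code differs per pixel, the generated features differ too, which is what lets a classifier trained on them handle unseen classes with context-appropriate features.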
OpenSeeD is the first to explore the potential of joint training on segmentation and detection; the authors hope it can serve as a strong baseline for developing a single model for both tasks in the open world.
The proposed CLIP Surgery is a method that enables surgery-like modifications to the inference architecture and features, improving explainability and performance across multiple open-vocabulary tasks, and demonstrates remarkable improvements in open-vocabulary segmentation and multi-label recognition.
An extensive evaluation of the Segment Anything Model's ability to segment medical images, on a collection of 19 medical imaging datasets spanning various modalities and anatomies, concludes that SAM shows impressive zero-shot segmentation performance on certain medical imaging datasets but moderate to poor performance on others.
A new paradigm toward universal medical image segmentation, termed 'One-Prompt Segmentation,' combines the strengths of one-shot and interactive methods and can adeptly handle unseen tasks in a single forward pass.
A learnable High-Quality Output Token is injected into SAM's mask decoder and is responsible for predicting the high-quality mask; this design reuses and preserves the pre-trained SAM weights while introducing only minimal additional parameters and computation.
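The token-injection idea, appending one new learnable token to the frozen pre-trained decoder tokens and reading the refined mask off that token, can be sketched as follows. The linear read-out below is a toy stand-in, not SAM's real mask decoder.

```python
import numpy as np

def decode_with_hq_token(frozen_tokens, hq_token, patch_feats):
    """Toy sketch of injecting one extra learnable token: concatenate
    it with the frozen decoder tokens, then predict per-patch mask
    logits as the dot product of the HQ token with image features."""
    tokens = np.concatenate([frozen_tokens, hq_token[None, :]], axis=0)
    # In training, only the appended token (and its small head) would
    # receive gradients; the pre-trained tokens stay frozen.
    hq = tokens[-1]
    return patch_feats @ hq            # per-patch mask logits, shape (H*W,)

rng = np.random.default_rng(1)
frozen = rng.normal(size=(5, 8))       # stand-in for pre-trained output tokens
hq_tok = rng.normal(size=8)            # the single new learnable token
feats = rng.normal(size=(16, 8))       # flattened image features
logits = decode_with_hq_token(frozen, hq_tok, feats)
```

The design choice this illustrates: because the new token only augments the token sequence, the pre-trained weights are untouched, so the original model's behavior is preserved while the added parameters stay minimal.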
This paper proposes an alternative strategy that combines a conventional probabilistic atlas-based segmentation with deep learning, enabling one to train a segmentation model for new MRI scans without the need for any manually segmented images.