3260 papers • 126 benchmarks • 313 datasets
Existing image segmentation tasks mainly focus on segmenting objects with specific characteristics, e.g., salient, camouflaged, or meticulous objects, or objects of specific categories. Most of them share the same input/output formats, and their models rarely use mechanisms designed exclusively for their segmentation targets, which means that almost all of these tasks are dataset-dependent. It is therefore promising to formulate a category-agnostic dichotomous image segmentation (DIS) task for accurately segmenting objects of varying structural complexity, regardless of their characteristics. Compared with semantic segmentation, the proposed DIS task usually focuses on images with a single target or a few targets, from which richer and more accurate details of each target can feasibly be obtained.
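As a rough illustration of the dichotomous (foreground vs. background) output format described above, the following minimal sketch turns a per-pixel foreground probability map into a binary mask and scores it against a ground-truth mask with intersection over union. The function names `binarize` and `iou`, the 0.5 threshold, and the toy 3x3 data are assumptions made for this example; they are not taken from any DIS benchmark or model.

```python
def binarize(prob_map, threshold=0.5):
    """Turn a per-pixel foreground probability map into a 0/1 mask.

    The 0.5 threshold is an assumption for illustration only.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def iou(pred_mask, gt_mask):
    """Intersection over union of two binary masks of the same shape."""
    inter = union = 0
    for pred_row, gt_row in zip(pred_mask, gt_mask):
        for p, g in zip(pred_row, gt_row):
            inter += p & g   # pixel is foreground in both masks
            union += p | g   # pixel is foreground in at least one mask
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0


# Toy data (hypothetical): a model's foreground probabilities and a
# hand-written ground-truth mask.
prob_map = [
    [0.1, 0.8, 0.9],
    [0.2, 0.7, 0.4],
    [0.0, 0.6, 0.1],
]
gt_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 1, 0],
]

pred_mask = binarize(prob_map)
score = iou(pred_mask, gt_mask)
print(pred_mask)
print(score)
```

The mask is category-agnostic by construction: each pixel carries only a foreground/background decision, not a class label.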
It is shown that the proposed network can be trained end-to-end from very few images and that it outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
The proposed DeepLabv3 system significantly improves over the previous DeepLab versions without DenseCRF post-processing and attains performance comparable to other state-of-the-art models on the PASCAL VOC 2012 semantic image segmentation benchmark.
This paper exploits global context information through different-region-based context aggregation, using a pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet), to produce good-quality results on the scene parsing task.
This paper explores how automated search algorithms and network design can work together, harnessing complementary approaches to improve the overall state of the art of MobileNets.
The superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, is shown, suggesting that the HRNet is a stronger backbone for computer vision problems.
A novel Bilateral Segmentation Network (BiSeNet) is proposed that strikes a good balance between speed and segmentation performance on the Cityscapes, CamVid, and COCO-Stuff datasets.
An image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance is proposed to address the challenging task of real-time semantic segmentation, and an in-depth analysis of the framework is provided.
A novel and efficient structure named the Short-Term Dense Concatenate network (STDC network) is proposed that removes structural redundancy: the dimension of the feature maps is gradually reduced, and their aggregation is used for image representation, which forms the basic module of the STDC network.
F3Net segments salient object regions accurately, provides clear local details, and outperforms state-of-the-art approaches on six evaluation metrics.
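The region-based context aggregation mentioned in the PSPNet summary above can be sketched roughly as follows: a feature map is average-pooled into grids of several sizes, each pooled grid is upsampled back to the input resolution, and the results are stacked with the original map. This is a simplified, single-channel, pure-Python sketch under stated assumptions, not the PSPNet implementation. The helper names (`adaptive_avg_pool`, `upsample_nearest`, `pyramid_pool`), the bin sizes `(1, 2)`, and the nearest-neighbour upsampling are choices made for this illustration, and the learned convolutions of a real network are omitted.

```python
import math


def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D feature map into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(grid, h, w):
    """Nearest-neighbour upsampling of a square grid to h x w (a simplification)."""
    n = len(grid)
    return [[grid[r * n // h][c * n // w] for c in range(w)] for r in range(h)]


def pyramid_pool(fmap, bin_sizes=(1, 2)):
    """Stack the input map with one upsampled context map per pyramid level."""
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(fmap, bins)
        channels.append(upsample_nearest(pooled, h, w))
    return channels


# Toy 4x4 single-channel feature map (hypothetical values).
fmap = [
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
    [5.0, 5.0, 7.0, 7.0],
    [5.0, 5.0, 7.0, 7.0],
]

context = pyramid_pool(fmap)
for channel in context:
    print(channel)
```

The 1x1 level gives every pixel the global mean of the map, while finer levels keep progressively more local context; stacking them lets later layers see both.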