3260 papers • 126 benchmarks • 313 datasets
Semantic Segmentation is a computer vision task that involves assigning a semantic label to each pixel in an image. In Real-Time Semantic Segmentation, the goal is to perform this labeling quickly and accurately in real-time, allowing for the segmentation results to be used for tasks such as object recognition, scene understanding, and autonomous navigation. ( Image credit: TorchSeg )
(Image credit: Papersgraph)
These leaderboards are used to track progress in real-time-semantic-segmentation-9
Use these libraries to find real-time-semantic-segmentation-9 models and implementations
No subtasks available.
Quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures, including FCN and DeconvNet.
This paper exploits the capability of global context information by different-region-based context aggregation through the pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet) to produce good quality results on the scene parsing task.
A novel deep neural network architecture named ENet (efficient neural network), created specifically for tasks requiring low latency operation, which is up to 18 times faster, requires 75% less FLOPs, has 79% less parameters, and provides similar or better accuracy to existing models.
The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
This paper introduces fast segmentation convolutional neural network (Fast-SCNN), an above real-time semantic segmentation model on high resolution image data suited to efficient computation on embedded devices with low memory.
It is suggested that memory traffic for accessing intermediate feature maps can be a factor dominating the inference latency, especially in such tasks as real-time object detection and semantic segmentation of high-resolution video.
A novel Bilateral Segmentation Network (BiSeNet) is proposed that makes a right balance between the speed and segmentation performance on Cityscapes, CamVid, and COCO-Stuff datasets.
State-of-the-art results in object detection (from the authors' mask byproduct) and panoptic segmentation show the potential to serve as a new strong baseline for many instance-level recognition tasks besides instance segmentation.
An image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance to address the challenging task of real-time semantic segmentation is proposed and in-depth analysis of the framework is provided.
An efficient high-resolution network, Lite-HRNet, is presented, which demonstrates superior results on human pose estimation over popular lightweight networks and can be easily applied to semantic segmentation task in the same lightweight manner.
Adding a benchmark result helps the community track progress.