3260 papers • 126 benchmarks • 313 datasets
Semantic segmentation is a computer vision task that assigns a semantic label to each pixel in an image. Real-time semantic segmentation aims to produce this labeling fast enough, and accurately enough, for the results to feed downstream tasks such as object recognition, scene understanding, and autonomous navigation. (Image credit: TorchSeg)
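As a rough illustration of "assigning a semantic label to each pixel", the sketch below turns per-class score maps into a label map by taking the highest-scoring class at every pixel. The function name `label_pixels`, the `scores[class][row][column]` layout, and the toy numbers are assumptions made for this example; they do not come from any of the papers listed here.

```python
def label_pixels(scores):
    """Return a label map where each pixel gets the index of its best class.

    `scores` is a nested list indexed as scores[class][row][column]
    (a hypothetical layout chosen for this sketch).
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy 2x2 image with three classes (made-up scores).
scores = [
    [[0.90, 0.20], [0.10, 0.30]],  # class 0
    [[0.05, 0.70], [0.20, 0.30]],  # class 1
    [[0.05, 0.10], [0.70, 0.40]],  # class 2
]
labels = label_pixels(scores)
print(labels)
```

A real-time model has to produce the score maps themselves within a per-frame time budget; the per-pixel argmax shown here is only the final, cheap step.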
Quantitative assessments show that SegNet provides good performance with competitive inference time and the most memory-efficient inference compared with other architectures, including FCN and DeconvNet.
This paper exploits global context information through different-region-based context aggregation, using a pyramid pooling module within the proposed pyramid scene parsing network (PSPNet), to produce good-quality results on the scene parsing task.
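The idea of aggregating context over regions of different sizes can be sketched in plain Python as follows. This is a simplified illustration, not the PSPNet implementation: it handles a single-channel feature map, uses nearest-neighbor upsampling where a real network would typically interpolate, and omits the convolutions applied to each pooled branch. The default bin sizes `(1, 2, 3, 6)` and all function names are assumptions for this example and are not stated in the summary above.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D map down to a bins x bins grid of region means."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        y0, y1 = (i * h) // bins, ceil_div((i + 1) * h, bins)
        row = []
        for j in range(bins):
            x0, x1 = (j * w) // bins, ceil_div((j + 1) * w, bins)
            cells = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Stretch a bins x bins grid back to h x w by nearest-neighbor lookup."""
    bins = len(pooled)
    return [[pooled[(y * bins) // h][(x * bins) // w] for x in range(w)]
            for y in range(h)]


def pyramid_pool(fmap, bin_sizes=(1, 2, 3, 6)):
    """Stack the original map with context maps pooled at several scales."""
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for bins in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(fmap, bins), h, w))
    return channels


fmap = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
channels = pyramid_pool(fmap, bin_sizes=(1, 2))
print(channels[1][0][0], channels[2][0][0])
```

Each extra channel gives every pixel a summary of a progressively smaller surrounding region, so a later classifier sees both local features and coarser scene context.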
ENet (efficient neural network) is a novel deep neural network architecture created specifically for tasks requiring low-latency operation. It is up to 18 times faster, requires 75% fewer FLOPs, has 79% fewer parameters, and provides similar or better accuracy compared with existing models.
The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
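To make the "arbitrary size in, correspondingly-sized output" property concrete, here is a minimal sketch of a single zero-padded convolution in plain Python. Because the layer has no fixed-size component, the same kernel runs on inputs of any height and width and returns a map of matching size. The function `conv2d_same` and the box kernel are illustrative assumptions, not code from the paper.

```python
def conv2d_same(image, kernel):
    """Slide `kernel` over `image` with zero padding so output size == input size.

    As in most deep learning libraries, this is cross-correlation
    (the kernel is not flipped).
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(kh):
                for dx in range(kw):
                    sy, sx = y + dy - ph, x + dx - pw
                    # Positions outside the image count as zeros.
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += image[sy][sx] * kernel[dy][dx]
            out[y][x] = acc
    return out


box = [[1.0] * 3 for _ in range(3)]        # 3x3 kernel of ones
small = [[1.0] * 5 for _ in range(3)]      # 3 rows x 5 columns
large = [[1.0] * 4 for _ in range(6)]      # 6 rows x 4 columns

out_small = conv2d_same(small, box)
out_large = conv2d_same(large, box)
print(len(out_small), len(out_small[0]), len(out_large), len(out_large[0]))
```

A network built only from layers like this one (plus pooling and upsampling) inherits the same size flexibility, which is what lets it emit a dense per-pixel prediction for any input resolution.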
This paper introduces the fast segmentation convolutional neural network (Fast-SCNN), an above-real-time semantic segmentation model for high-resolution image data, suited to efficient computation on embedded devices with low memory.
A novel Bilateral Segmentation Network (BiSeNet) is proposed that strikes a good balance between speed and segmentation performance on the Cityscapes, CamVid, and COCO-Stuff datasets.
State-of-the-art results in object detection (from the authors' mask byproduct) and panoptic segmentation show the potential to serve as a new strong baseline for many instance-level recognition tasks besides instance segmentation.
The paper suggests that memory traffic for accessing intermediate feature maps can dominate inference latency, especially in tasks such as real-time object detection and semantic segmentation of high-resolution video.
An image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance is proposed to address the challenging task of real-time semantic segmentation, and an in-depth analysis of the framework is provided.
An efficient high-resolution network, Lite-HRNet, is presented, which demonstrates superior results on human pose estimation over popular lightweight networks and can be easily applied to semantic segmentation tasks in the same lightweight manner.