3260 papers • 126 benchmarks • 313 datasets
Scene parsing is the task of segmenting and parsing an image into different image regions associated with semantic categories, such as sky, road, person, and bed. (Description source: MIT)
(Image credit: Papersgraph)
These leaderboards are used to track progress in Scene Parsing.
Use these libraries to find Scene Parsing models and implementations.
This paper exploits global context information through region-based context aggregation with a pyramid pooling module in the proposed pyramid scene parsing network (PSPNet), producing good-quality results on the scene parsing task.
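As a rough illustration of the idea, the sketch below (PyTorch) pools features to several grid sizes, reduces each with a 1x1 convolution, upsamples, and concatenates the result with the input; the bin sizes and channel widths here are illustrative assumptions, not PSPNet's exact configuration.

    # Minimal pyramid pooling sketch; bin sizes and widths are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PyramidPooling(nn.Module):
        def __init__(self, in_channels, bins=(1, 2, 3, 6)):
            super().__init__()
            reduction = in_channels // len(bins)  # channels per pyramid branch
            self.stages = nn.ModuleList([
                nn.Sequential(
                    nn.AdaptiveAvgPool2d(bin_size),            # pool to bin_size x bin_size grid
                    nn.Conv2d(in_channels, reduction, 1, bias=False),
                    nn.BatchNorm2d(reduction),
                    nn.ReLU(inplace=True),
                )
                for bin_size in bins
            ])

        def forward(self, x):
            h, w = x.shape[2:]
            # Upsample each pooled branch back to the input resolution and concatenate.
            pooled = [F.interpolate(stage(x), size=(h, w), mode="bilinear",
                                    align_corners=False) for stage in self.stages]
            return torch.cat([x] + pooled, dim=1)  # local features + multi-scale context

    # Example: 2048-channel backbone features at 1/8 resolution.
    ppm = PyramidPooling(2048).eval()          # eval mode so BatchNorm accepts batch size 1
    feats = torch.randn(1, 2048, 60, 60)
    with torch.no_grad():
        out = ppm(feats)                       # -> (1, 2048 + 4*512, 60, 60)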
This work presents ADE20K, a densely annotated dataset spanning diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts, and shows that networks trained on this dataset can segment a wide variety of scenes and objects.
A novel panoptic quality (PQ) metric is proposed that captures performance for all classes (stuff and things) in an interpretable and unified manner, and a rigorous study of both human and machine performance for panoptic segmentation on three existing datasets reveals interesting insights about the task.
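For reference, PQ over one class is the sum of IoUs of matched (true positive) segment pairs divided by |TP| + 0.5|FP| + 0.5|FN|, equivalently segmentation quality times recognition quality. The minimal sketch below computes it from precomputed matches; the input format (a list of matched-pair IoUs plus FP/FN counts) is an assumption for illustration, not the paper's evaluation code.

    def panoptic_quality(matched_ious, num_fp, num_fn):
        """Compute PQ for one class from precomputed segment matches.

        matched_ious: IoU of each matched (predicted, ground-truth) segment pair;
                      a match requires IoU > 0.5, so each counts as a TP.
        num_fp: number of unmatched predicted segments.
        num_fn: number of unmatched ground-truth segments.
        """
        tp = len(matched_ious)
        if tp + num_fp + num_fn == 0:
            return 0.0
        sq = sum(matched_ious) / tp if tp else 0.0        # segmentation quality
        rq = tp / (tp + 0.5 * num_fp + 0.5 * num_fn)      # recognition quality
        return sq * rq                                     # PQ = SQ * RQ

    # Example: two matched segments with IoUs 0.8 and 0.6, one FP, one FN.
    print(panoptic_quality([0.8, 0.6], num_fp=1, num_fn=1))  # 0.7 * (2/3) ≈ 0.467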
This paper addresses the semantic segmentation task with a new context aggregation scheme named object context, which focuses on enhancing the role of object information by using a dense relation matrix to serve as a surrogate for the binary relation matrix.
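Loosely, a dense relation matrix of this kind can be built from softmax-normalized pairwise similarities between pixel features and then used to aggregate context; the single-head dot-product form and projection sizes below are simplifying assumptions, not the paper's exact formulation.

    # Simplified dense pixel-to-pixel relation sketch (PyTorch); single-head
    # dot-product similarity is assumed purely for illustration.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DenseRelationContext(nn.Module):
        def __init__(self, channels, key_channels=64):
            super().__init__()
            self.query = nn.Conv2d(channels, key_channels, 1)
            self.key = nn.Conv2d(channels, key_channels, 1)
            self.value = nn.Conv2d(channels, channels, 1)

        def forward(self, x):
            b, c, h, w = x.shape
            q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, key)
            k = self.key(x).flatten(2)                     # (b, key, hw)
            v = self.value(x).flatten(2).transpose(1, 2)   # (b, hw, c)
            # Dense relation matrix: soft affinity of every pixel to every pixel.
            relation = F.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)  # (b, hw, hw)
            context = (relation @ v).transpose(1, 2).reshape(b, c, h, w)
            return x + context  # fuse the aggregated context into the input features

    # Example usage on a small feature map.
    x = torch.randn(1, 256, 32, 32)
    y = DenseRelationContext(256)(x)   # same shape as x, with global context mixed in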
Novel deep dual-resolution networks (DDRNets) are proposed for real-time semantic segmentation of road scenes, and a new contextual information extractor named the Deep Aggregation Pyramid Pooling Module (DAPPM) is designed to enlarge effective receptive fields and fuse multi-scale context.
A novel deep learning architecture, ResUNet-a, is presented that combines ideas from various state-of-the-art modules used in computer vision for semantic segmentation; it has better convergence properties and behaves well even in the presence of highly imbalanced classes.
This paper proposes a Flow Alignment Module (FAM) that learns semantic flow between feature maps of adjacent levels and broadcasts high-level features to high-resolution features effectively and efficiently, exhibiting superior performance over other real-time methods even with lightweight backbone networks.
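A minimal sketch of the idea, under simplifying assumptions: predict a 2-channel flow field from the concatenated fine and upsampled coarse features, then warp the coarse features with that flow before fusing. The layer widths and the offset normalization are illustrative, not the paper's precise module design.

    # Flow-based feature alignment sketch (PyTorch); widths and offset scaling
    # are assumptions for illustration.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FlowAlign(nn.Module):
        def __init__(self, channels):
            super().__init__()
            # Predict a 2-channel flow field from concatenated fine + coarse features.
            self.flow = nn.Conv2d(channels * 2, 2, kernel_size=3, padding=1)

        def forward(self, fine, coarse):
            b, c, h, w = fine.shape
            coarse_up = F.interpolate(coarse, size=(h, w), mode="bilinear",
                                      align_corners=False)
            flow = self.flow(torch.cat([fine, coarse_up], dim=1))  # (b, 2, h, w)

            # Identity sampling grid in normalized [-1, 1] coordinates.
            ys = torch.linspace(-1, 1, h, device=fine.device)
            xs = torch.linspace(-1, 1, w, device=fine.device)
            gy, gx = torch.meshgrid(ys, xs, indexing="ij")
            grid = torch.stack((gx, gy), dim=-1).unsqueeze(0).expand(b, -1, -1, -1)

            # Offset the grid by the predicted flow (scaled to normalized coordinates)
            # and warp the upsampled coarse features accordingly.
            offset = flow.permute(0, 2, 3, 1) / torch.tensor([w, h], device=fine.device)
            warped = F.grid_sample(coarse_up, grid + offset, mode="bilinear",
                                   align_corners=False)
            return fine + warped  # fuse aligned high-level context into the fine features

    # Example: align 1/16-resolution features to 1/8-resolution features.
    fine = torch.randn(1, 128, 64, 64)
    coarse = torch.randn(1, 128, 32, 32)
    aligned = FlowAlign(128)(fine, coarse)   # -> (1, 128, 64, 64)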
The point-wise spatial attention network (PSANet) is proposed to relax the local neighborhood constraint and achieves top performance on various competitive scene parsing datasets, including ADE20K, PASCAL VOC 2012 and Cityscapes, demonstrating its effectiveness and generality.
This work proposes OneFormer, a universal image segmentation framework that unifies segmentation with a multi-task train-once design and outperforms specialized Mask2Former models across all three segmentation tasks on ADE20K, Cityscapes, and COCO.
Different PyConv-based architectures are presented for four main visual recognition tasks: image classification, video action classification/recognition, object detection, and semantic image segmentation/parsing, showing significant improvements over the baselines on all of these core tasks.