CPGNet: Cascade Point-Grid Fusion Network for Real-Time LiDAR Semantic Segmentation (2022-04-21)

TL;DR

CPGNet ensures both effectiveness and efficiency mainly through two techniques: the novel Point-Grid (PG) fusion block extracts semantic features mainly on the 2D projected grid for efficiency, while the proposed transformation consistency loss narrows the gap between single-pass model inference and test-time augmentation (TTA).

Abstract

LiDAR semantic segmentation, which is essential for advanced autonomous driving, must be accurate, fast, and easy to deploy on mobile platforms. Previous point-based and sparse voxel-based methods are far from real-time because they rely on time-consuming neighbor searching or sparse 3D convolution. Recent 2D projection-based methods, including range-view and multi-view fusion, can run in real time but suffer from lower accuracy due to information loss during the 2D projection. In addition, previous methods usually adopt test-time augmentation (TTA) to improve performance, which further slows down inference. To achieve a better speed-accuracy trade-off, we propose the Cascade Point-Grid Fusion Network (CPGNet), which ensures both effectiveness and efficiency mainly through two techniques: 1) the novel Point-Grid (PG) fusion block extracts semantic features mainly on the 2D projected grid for efficiency, while summarizing both 2D and 3D features on the 3D points for minimal information loss; 2) the proposed transformation consistency loss narrows the gap between single-pass model inference and TTA. Experiments on the SemanticKITTI and nuScenes benchmarks demonstrate that CPGNet, without ensemble models or TTA, is comparable with the state-of-the-art RPVNet while running 4.7 times faster.
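The two techniques named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`point_to_grid`, `grid_to_point`, `fuse`, `consistency_loss`), the bird's-eye-view grid with a fixed cell size, the element-wise max aggregation, the nearest-cell lookup, and the mean-squared form of the loss are all assumptions made for illustration, since the abstract does not specify them. In a real network, 2D convolutions would run on the grid between the scatter and gather steps.

```python
import math


def point_to_grid(points, feats, cell=1.0):
    """Scatter per-point features into a sparse 2D grid (dict keyed by cell).

    Points that fall in the same (ix, iy) cell are merged with an
    element-wise max. The max is an assumed aggregation choice.
    """
    grid = {}
    for (x, y, _z), f in zip(points, feats):
        key = (math.floor(x / cell), math.floor(y / cell))
        if key in grid:
            grid[key] = [max(a, b) for a, b in zip(grid[key], f)]
        else:
            grid[key] = list(f)
    return grid


def grid_to_point(points, grid, cell=1.0):
    """Gather, for every point, the feature of the grid cell it falls in."""
    return [
        grid[(math.floor(x / cell), math.floor(y / cell))]
        for x, y, _z in points
    ]


def fuse(point_feats, grid_feats):
    """Concatenate each point's own feature with its gathered 2D feature."""
    return [list(p) + list(g) for p, g in zip(point_feats, grid_feats)]


def consistency_loss(probs_a, probs_b):
    """Mean (over points) squared difference between two per-point
    class-probability sets, e.g. predictions on an original scan and on
    a transformed copy of it. The squared-error form is an assumption.
    """
    total = sum(
        (a - b) ** 2
        for pa, pb in zip(probs_a, probs_b)
        for a, b in zip(pa, pb)
    )
    return total / len(probs_a)


# Toy data: three points, two feature channels each (hypothetical values).
points = [(0.2, 0.3, 1.0), (0.7, 0.9, 0.5), (2.1, 0.4, 0.0)]
feats = [[1.0, 5.0], [3.0, 2.0], [4.0, 4.0]]

grid = point_to_grid(points, feats)
fused = fuse(feats, grid_to_point(points, grid))

# Predictions for the same two points under two input transformations.
loss = consistency_loss([[0.9, 0.1], [0.5, 0.5]], [[0.7, 0.3], [0.5, 0.5]])
```

The first two points share a grid cell, so after the gather step both carry the same 2D feature while keeping their own per-point feature in the fused vector. That is the sense in which the point branch can limit the information lost by projection. The loss is zero only when the two sets of predictions agree, which is the property a consistency term would encourage during training so that a single forward pass behaves more like TTA.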

Authors

Xiaoyan Li

Gang Zhang

Hongyu Pan

References (33 items)

DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation

RPVNet: A Deep and Efficient Range-Point-Voxel Fusion Network for LiDAR Point Cloud Segmentation

Lite-HDSeg: LiDAR Semantic Segmentation Using Lite Harmonic Dense Convolutions

(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network

AMVNet: Assertion-based Multi-View Fusion Network for LiDAR Semantic Segmentation

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

Multi Projection Fusion for Real-time Semantic Segmentation of 3D LiDAR Point Clouds

Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution

KPRNet: Improving projection-based LiDAR semantic segmentation

SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud Segmentation

SalsaNext: Fast, Uncertainty-Aware Semantic Segmentation of LiDAR Point Clouds

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

RangeNet ++: Fast and Accurate LiDAR Semantic Segmentation

SalsaNet: Fast Road and Vehicle Segmentation in LiDAR Point Clouds for Autonomous Driving

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

KPConv: Flexible and Deformable Convolution for Point Clouds

SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences

nuScenes: A Multimodal Dataset for Autonomous Driving

PointPillars: Fast Encoders for Object Detection From Point Clouds

PointSeg: Real-Time Semantic Segmentation Based on 3D LiDAR Point Cloud

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

PointCNN: Convolution On X-Transformed Points

Rethinking Atrous Convolution for Semantic Image Segmentation

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

The Lovasz-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks

Pyramid Scene Parsing Network

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Aggregated Residual Transformations for Deep Neural Networks

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Deep Residual Learning for Image Recognition

Learning Deconvolution Network for Semantic Segmentation

Fully convolutional networks for semantic segmentation

Efficient Inference with TensorRT
