3260 papers • 126 benchmarks • 313 datasets
3D Object Classification is the task of predicting the class of a 3D object given its point cloud. Unlike segmentation, the prediction is made at the object level: the entire point cloud receives a single category label. The most popular benchmark for this task is the ModelNet dataset, and models are usually evaluated with the Classification Accuracy metric.
(Image credit: Sedaghat et al.)
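For reference, the sketch below shows the usual evaluation setup: a point-cloud classifier produces one label per object, and Classification Accuracy is the fraction of objects labelled correctly. The tiny PointNet-style model and the class count are hypothetical placeholders, not any specific benchmarked method.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 40  # e.g. ModelNet40; placeholder value

class TinyPointClassifier(nn.Module):
    """Minimal PointNet-style classifier: per-point MLP + order-invariant max pooling."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 128), nn.ReLU())
        self.head = nn.Linear(128, num_classes)

    def forward(self, points):                 # points: (B, N, 3)
        feats = self.point_mlp(points)         # (B, N, 128) per-point features
        global_feat = feats.max(dim=1).values  # (B, 128) pooled over all points
        return self.head(global_feat)          # (B, num_classes) one prediction per object

def classification_accuracy(logits, labels):
    """Fraction of point clouds whose predicted class matches the ground truth."""
    return (logits.argmax(dim=-1) == labels).float().mean().item()

# Random data standing in for ModelNet point clouds
points = torch.randn(8, 1024, 3)
labels = torch.randint(0, NUM_CLASSES, (8,))
print(classification_accuracy(TinyPointClassifier()(points), labels))
```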
These leaderboards are used to track progress in 3D Object Classification
Use these libraries to find 3D Object Classification models and implementations
The proposed spherical kernel for efficient graph convolution of 3D point clouds maintains translation-invariance and asymmetry properties, where the former guarantees weight sharing among similar local structures in the data and the latter facilitates fine geometric learning.
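A rough sketch of the binning idea behind such a spherical kernel is given below: each neighbour is assigned to a spherical bin (azimuth × elevation × radius) around the centre point, and every bin has its own weight matrix shared across all locations. The bin layout and sizes here are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SphericalKernelConvSketch(nn.Module):
    """Sketch: split each point's neighbours into spherical bins and learn one
    weight matrix per bin; weights are shared across all local neighbourhoods."""
    def __init__(self, in_dim, out_dim, n_azim=8, n_elev=2, n_rad=2):
        super().__init__()
        self.dims = (n_azim, n_elev, n_rad)
        n_bins = n_azim * n_elev * n_rad + 1              # +1 bin for the centre point itself
        self.weights = nn.Parameter(torch.randn(n_bins, in_dim, out_dim) * 0.01)

    def forward(self, rel_xyz, neigh_feats):
        # rel_xyz:     (B, N, K, 3) neighbour offsets from each centre point
        # neigh_feats: (B, N, K, C) neighbour features
        n_azim, n_elev, n_rad = self.dims
        x, y, z = rel_xyz.unbind(-1)
        r = rel_xyz.norm(dim=-1).clamp(min=1e-8)
        azim = ((torch.atan2(y, x) + torch.pi) / (2 * torch.pi) * n_azim).long().clamp(max=n_azim - 1)
        elev = (((z / r) + 1) / 2 * n_elev).long().clamp(max=n_elev - 1)
        rad = (r / r.amax(dim=-1, keepdim=True) * n_rad).long().clamp(max=n_rad - 1)
        bin_id = azim * n_elev * n_rad + elev * n_rad + rad + 1
        bin_id = torch.where(r < 1e-6, torch.zeros_like(bin_id), bin_id)  # centre point -> bin 0
        w = self.weights[bin_id]                          # (B, N, K, C, out_dim) per-neighbour weights
        out = torch.einsum('bnkc,bnkco->bno', neigh_feats, w)
        return out / rel_xyz.shape[2]                     # average over the K neighbours
```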
GDANet introduces a Geometry-Disentangle Module that dynamically disentangles point clouds into the contour and flat parts of 3D objects, denoted respectively by sharp and gentle variation components, and a Sharp-Gentle Complementary Attention Module that treats the features from the sharp and gentle variation components as two holistic representations.
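A minimal sketch of such a disentangling step is shown below, under the assumption that "sharp" points are those whose features differ most from their local neighbourhood mean (a high-frequency response) and "gentle" points are the rest; the neighbourhood size k and split size m are illustrative choices, not the paper's exact procedure.

```python
import torch

def disentangle_sharp_gentle(xyz, feats, k=16, m=256):
    """Split a batch of point clouds into 'sharp' and 'gentle' subsets.
    xyz: (B, N, 3) coordinates, feats: (B, N, C) features; requires k, m <= N."""
    dist = torch.cdist(xyz, xyz)                          # (B, N, N) pairwise distances
    knn_idx = dist.topk(k, largest=False).indices         # (B, N, k) nearest neighbours
    neigh = torch.gather(
        feats.unsqueeze(1).expand(-1, feats.size(1), -1, -1), 2,
        knn_idx.unsqueeze(-1).expand(-1, -1, -1, feats.size(-1)))
    variation = (feats - neigh.mean(dim=2)).norm(dim=-1)  # (B, N) deviation from local mean
    sharp_idx = variation.topk(m, largest=True).indices   # high variation -> contour
    gentle_idx = variation.topk(m, largest=False).indices # low variation -> flat regions
    gather = lambda idx: feats.gather(1, idx.unsqueeze(-1).expand(-1, -1, feats.size(-1)))
    return gather(sharp_idx), gather(gentle_idx)           # (B, m, C) each
```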
Experimental results reveal PointLLM's superior performance over existing 2D and 3D baselines, with a notable achievement in human-evaluated object captioning tasks where it surpasses human annotators in over 50% of the samples.
This work generalizes the convolution operator from regular grids to arbitrary graphs while avoiding the spectral domain, which allows us to handle graphs of varying size and connectivity.
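The essence of such a spatial (non-spectral) graph convolution can be sketched as below: messages are aggregated over each node's actual neighbours, so graphs of any size or connectivity are handled without a fixed graph Fourier basis. This is a generic mean-aggregation convolution, not the paper's exact filter.

```python
import torch
import torch.nn as nn

class SpatialGraphConvSketch(nn.Module):
    """Generic spatial graph convolution: transform neighbour features and
    average them per node, so graphs of varying size/connectivity are handled."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin_self = nn.Linear(in_dim, out_dim)
        self.lin_neigh = nn.Linear(in_dim, out_dim)

    def forward(self, x, edge_index):
        # x: (num_nodes, in_dim); edge_index: (2, num_edges) with rows (src, dst)
        src, dst = edge_index
        msgs = self.lin_neigh(x[src])                      # one message per edge
        agg = torch.zeros(x.size(0), msgs.size(1), device=x.device)
        agg.index_add_(0, dst, msgs)                       # sum messages at each target node
        deg = torch.zeros(x.size(0), device=x.device).index_add_(
            0, dst, torch.ones(dst.size(0), device=x.device)).clamp(min=1)
        return self.lin_self(x) + agg / deg.unsqueeze(1)   # self term + mean of neighbours
```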
This work proposes SortNet, as part of the Point Transformer, which induces input permutation invariance by selecting points based on a learned score; it extracts local and global features and relates the two representations through a local-global attention mechanism.
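A minimal sketch of score-based point selection follows, assuming an MLP scorer and a fixed k; the real SortNet has more structure, so treat this only as an illustration of how top-k selection by a learned score yields permutation invariance.

```python
import torch
import torch.nn as nn

class SortNetSketch(nn.Module):
    """Sketch: an MLP scores every point and the k highest-scoring points are
    kept; the result does not depend on the input ordering of the points."""
    def __init__(self, feat_dim, k=32):
        super().__init__()
        self.k = k
        self.score_mlp = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, feats):                       # feats: (B, N, C)
        scores = self.score_mlp(feats).squeeze(-1)  # (B, N) one score per point
        top_scores, idx = scores.topk(self.k, dim=1)
        idx_exp = idx.unsqueeze(-1).expand(-1, -1, feats.size(-1))
        selected = feats.gather(1, idx_exp)         # (B, k, C) selected local features
        # Scaling by the (differentiable) scores lets gradients reach the scorer.
        return selected * top_scores.unsqueeze(-1)
```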
3D capsule networks are proposed: an auto-encoder designed to process sparse 3D point clouds while preserving the spatial arrangement of the input data, enabling new applications such as part interpolation and replacement.
This work proposes a method to incrementally build up semantic scene graphs from a 3D environment given a sequence of RGB-D frames by means of a graph neural network and proposes a novel attention mechanism well suited for partial and missing graph data present in such an incremental reconstruction scenario.
This paper proposes PointMixer, a universal point set operator that facilitates information sharing among unstructured 3D points by simply replacing token-mixing MLPs with a softmax function, which can be broadly used in the network as inter-set mixing, intra-set mixing, and pyramid mixing.
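The "replace the token-mixing MLP with a softmax" idea can be sketched as follows for a single point set (intra-set mixing); the layer sizes and the residual connection are assumptions for illustration, not PointMixer's exact formulation.

```python
import torch
import torch.nn as nn

class SoftmaxTokenMixSketch(nn.Module):
    """Sketch: each point in a set gets a learned score, a softmax over the set
    turns the scores into mixing weights, and features are aggregated with them
    instead of being mixed by a fixed-size token-mixing MLP."""
    def __init__(self, feat_dim):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)        # per-point mixing score
        self.value = nn.Linear(feat_dim, feat_dim)

    def forward(self, feats):                      # feats: (B, N, C), one set of N points
        weights = torch.softmax(self.score(feats), dim=1)               # (B, N, 1), sums to 1 over the set
        mixed = (weights * self.value(feats)).sum(dim=1, keepdim=True)  # (B, 1, C) set-level feature
        return feats + mixed                       # broadcast the mixed feature back to every point
```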
Uni3D, a 3D foundation model that explores unified 3D representation at scale, is efficiently scaled up to one billion parameters and sets new records on a broad range of 3D tasks, such as zero-shot classification, few-shot classification, open-world understanding, and part segmentation.
A more realistic and challenging scenario named open-pose 3D zero-shot classification is proposed, focusing on the recognition of 3D objects regardless of their orientation, making validation more compelling and not limited to existing CLIP-based methods.
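Building such an orientation-agnostic evaluation amounts to applying uniformly random SO(3) rotations to the test point clouds; a minimal NumPy sketch (not the benchmark's exact protocol) is shown below.

```python
import numpy as np

def random_so3_rotation(rng):
    """Draw an approximately uniform random 3D rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))       # fix column signs so the factorisation is unique
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]            # ensure det(q) = +1, i.e. a proper rotation
    return q

def rotate_point_cloud(points, rng=None):
    """Apply a random orientation to an (N, 3) point cloud before evaluation."""
    rng = rng or np.random.default_rng()
    return points @ random_so3_rotation(rng).T
```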