3260 papers • 126 benchmarks • 313 datasets
3D object recognition is the task of recognising objects from 3D data. Related tasks, such as 3D Object Detection, have more leaderboards. (Image credit: Look Further to Recognize Better)
(Image credit: Papersgraph)
This work proposes a novel framework, the 3D Generative Adversarial Network (3D-GAN), which generates 3D objects from a probabilistic space by leveraging recent advances in volumetric convolutional networks and generative adversarial nets, and which provides a powerful 3D shape descriptor with wide applications in 3D object recognition.
This paper introduces two distinct network architectures of volumetric CNNs and examines multi-view CNNs, providing a better understanding of the space of methods available for object classification on 3D data.
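To make the volumetric input concrete: a volumetric CNN consumes a voxel grid rather than raw points. The following is a minimal sketch of building such a grid, not the paper's pipeline. The function name `voxelize`, the grid resolution, and the uniform scaling choice are all assumptions made for illustration.

```python
def voxelize(points, resolution=4):
    """Convert (x, y, z) points into a resolution^3 binary occupancy grid.

    Illustrative helper only; name and defaults are hypothetical.
    """
    if not points:
        raise ValueError("points must be non-empty")
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    # Use one scale for all three axes so the shape is not distorted.
    extent = max(maxs[i] - mins[i] for i in range(3)) or 1.0
    grid = [[[0] * resolution for _ in range(resolution)]
            for _ in range(resolution)]
    for p in points:
        idx = []
        for i in range(3):
            cell = int((p[i] - mins[i]) / extent * resolution)
            # Clamp points lying exactly on the upper boundary.
            idx.append(min(cell, resolution - 1))
        grid[idx[0]][idx[1]][idx[2]] = 1
    return grid


points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.5, 0.5, 0.5)]
grid = voxelize(points, resolution=4)
occupied = sum(v for plane in grid for row in plane for v in row)
```

A multi-view CNN would instead render the object from several viewpoints and classify the resulting 2D images, so it needs no voxel grid at all.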
A neural message passing approach to augment an input 3D indoor scene with new objects matching their surroundings by weighting messages through an attention mechanism, which significantly outperforms state-of-the-art approaches in terms of correctly predicting objects missing in a scene.
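As a rough illustration of weighting messages through attention, the sketch below aggregates neighbor features for one node using softmax weights. It is not the paper's model: the dot-product scoring function and the names `aggregate`, `softmax`, and `dot` are assumptions chosen to keep the example small.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def aggregate(node_feat, neighbor_feats):
    """Attention-weighted sum of neighbor messages for a single node."""
    # Assumed scoring rule: dot product between node and neighbor features.
    scores = [dot(node_feat, n) for n in neighbor_feats]
    weights = softmax(scores)
    dim = len(node_feat)
    message = [sum(w * n[d] for w, n in zip(weights, neighbor_feats))
               for d in range(dim)]
    return message, weights


node = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
message, weights = aggregate(node, neighbors)
```

Neighbors whose features align with the node receive larger weights, so they contribute more to the aggregated message.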
This work proposes a Multi-view Vision Transformer (MVT) for 3D object recognition, and develops a global-local structure for the MVT that relies on much less inductive bias than its CNN counterparts.
This work presents an MLP-based architecture termed Round-Roll MLP, which extends the spatial-shift MLP backbone by considering the communication between patches from different views and achieves competitive performance compared with existing state-of-the-art methods.
This work represents 3D spaces as volumetric fields and proposes a novel design that employs field probing filters to extract features from them efficiently, showing that field probing is significantly more efficient than 3D CNNs while providing state-of-the-art performance on classification tasks for 3D object recognition benchmark datasets.
This work presents a 3D-CNN-based Gradient-weighted Class Activation Mapping method (3D-GradCAM) that gives visual explanations of the distinct local geometric features of interest within an object, enabling efficient learning of 3D geometries.
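For orientation, the standard Grad-CAM computation carries over to voxel feature maps in a straightforward way: average each channel's gradient to get a channel weight, take the weighted sum of activations, and apply a ReLU. The sketch below follows that generic recipe on flattened voxel lists. It is an assumption that the paper's 3D-GradCAM matches this form, and `grad_cam_3d` and the toy values are hypothetical.

```python
def grad_cam_3d(activations, gradients):
    """Generic Grad-CAM over voxel feature maps.

    activations, gradients: lists of K channels, each a flat list of
    per-voxel values (the 3D grid is flattened for simplicity).
    """
    # Channel weight = global average of that channel's gradient.
    alphas = [sum(g) / len(g) for g in gradients]
    n_voxels = len(activations[0])
    heat = []
    for v in range(n_voxels):
        s = sum(a * act[v] for a, act in zip(alphas, activations))
        heat.append(max(s, 0.0))  # ReLU keeps only positive evidence
    return heat


# Toy values: two channels over four voxels.
activations = [[1.0, 0.0, 2.0, 0.0], [0.0, 3.0, 0.0, 1.0]]
gradients = [[0.5, 0.5, 0.5, 0.5], [-1.0, -1.0, -1.0, -1.0]]
heatmap = grad_cam_3d(activations, gradients)
```

Voxels dominated by the negatively weighted channel are zeroed out, leaving a map of the regions that support the predicted class.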
We propose the Variational Shape Learner (VSL), a generative model that learns the underlying structure of voxelized 3D shapes in an unsupervised fashion. Through the use of skip-connections, our model can successfully learn and infer a latent, hierarchical representation of objects. Furthermore, realistic 3D objects can be easily generated by sampling the VSL's latent probabilistic manifold. We show that our generative model can be trained end-to-end from 2D images to perform single image 3D model retrieval. Experiments show, both quantitatively and qualitatively, the improved generalization of our proposed model over a range of tasks, performing better than or comparably to various state-of-the-art alternatives.