3260 papers • 126 benchmarks • 313 datasets
These leaderboards are used to track progress in multi-view 3D reconstruction.
Use these libraries to find multi-view 3D reconstruction models and implementations.
Experimental results on the ShapeNet and Pix3D benchmarks indicate that the proposed Pix2Vox outperforms state-of-the-art methods by a large margin, and the proposed method is 24 times faster than 3D-R2N2 in terms of backward inference time.
A new feed-forward neural module, named AttSets, together with a dedicated training algorithm, named FASet, is proposed to attentively aggregate an arbitrarily sized deep feature set for multi-view 3D reconstruction, significantly outperforming existing aggregation approaches.
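The aggregation idea can be sketched in a few lines: each element of every per-view feature vector receives a learned attention score, scores are softmax-normalized across the views, and the set is collapsed by a weighted sum, so any number of views maps to one fixed-size feature. The shapes and the single linear scoring layer below are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def attentive_aggregate(features, w, b):
    """Collapse a variable-sized set of per-view features into one vector.

    features: (n_views, dim) deep features, one row per input view
    w, b:     parameters of a shared linear scoring layer
    """
    scores = features @ w + b                  # element-wise attention scores
    weights = np.exp(scores - scores.max(axis=0))
    weights /= weights.sum(axis=0)             # softmax across the view axis
    return (weights * features).sum(axis=0)    # (dim,) aggregated feature

dim = 8
w, b = rng.standard_normal((dim, dim)), np.zeros(dim)
f3 = attentive_aggregate(rng.standard_normal((3, dim)), w, b)
f7 = attentive_aggregate(rng.standard_normal((7, dim)), w, b)
print(f3.shape, f7.shape)  # both (8,) regardless of how many views were given
```

Because the weighted sum runs over the view axis, the output dimensionality is independent of the set size, which is what lets one network handle single-view and multi-view inputs alike.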
It is demonstrated that the proposed SDFDiff, a novel approach for image-based shape optimization using differentiable rendering of 3D shapes represented by signed distance functions, can be integrated with deep learning models, which opens up options for learning approaches on 3D objects without 3D supervision.
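A signed distance function gives, at every point, the distance to the nearest surface (negative inside). Rendering such a shape amounts to marching each camera ray forward by the current SDF value until the surface is reached, and every step is a smooth function of the shape parameters, which is what makes this style of rendering amenable to differentiation. The sphere SDF and step budget below are a minimal illustrative sketch, not SDFDiff's actual renderer.

```python
import numpy as np

def sdf_sphere(p, center, radius):
    """Signed distance from point p to a sphere: < 0 inside, > 0 outside."""
    return np.linalg.norm(p - center) - radius

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-5):
    """March along a unit ray by the SDF value until the surface is hit."""
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:
            break
        t += d            # safe step: cannot overshoot the nearest surface
    return t              # depth along the ray

center, radius = np.array([0.0, 0.0, 3.0]), 1.0
origin = np.zeros(3)
direction = np.array([0.0, 0.0, 1.0])   # ray pointing straight at the sphere
depth = sphere_trace(origin, direction, lambda p: sdf_sphere(p, center, radius))
print(round(depth, 4))  # 2.0: the ray hits the front face of the sphere
```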
This work proposes a differentiable rendering formulation for implicit shape and texture representations, showing that depth gradients can be derived analytically using the concept of implicit differentiation, and finds that this method can be used for multi-view 3D reconstruction, directly resulting in watertight meshes.
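The implicit-differentiation idea: if the surface depth d along a ray satisfies f(o + d v; theta) = 0 for an implicit shape f with parameters theta, the implicit function theorem gives dd/dtheta = -(df/dtheta) / (df/dd), so no gradient has to flow through the root-finding itself. A minimal numeric check for a sphere, where theta is the radius (my choice of example, not the paper's):

```python
import numpy as np

center = np.array([0.0, 0.0, 3.0])
origin = np.zeros(3)
v = np.array([0.0, 0.0, 1.0])      # unit ray direction
radius = 1.0

def f(d, r):
    """Implicit surface equation along the ray: zero exactly at the hit."""
    return np.linalg.norm(origin + d * v - center) - r

# For this on-axis ray the root is at depth d = 3 - r = 2.
d = 2.0

# Implicit function theorem: dd/dr = -(df/dr) / (df/dd).
p = origin + d * v
normal = (p - center) / np.linalg.norm(p - center)
df_dd = v @ normal                 # analytic df/dd at the root
df_dr = -1.0                       # f depends on r only through the "- r" term
dd_dr = -df_dr / df_dd

# Finite-difference check: re-solve the root with radius r + h.
h = 1e-6
dd_dr_fd = ((3.0 - (radius + h)) - (3.0 - radius)) / h
print(dd_dr, round(dd_dr_fd, 6))   # both -1.0: a larger sphere is hit sooner
```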
A probabilistic graphical model is utilized to embed planar models into PatchMatch multi-view stereo and contribute a novel multi-view aggregated matching cost that takes both photometric consistency and planar compatibility into consideration, making it suited to depth estimation in both planar and non-planar regions.
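The flavor of such an aggregated cost can be sketched as a convex combination of a photometric term and a plane-compatibility term. The specific costs, the linear plane model, and the fixed blending weight below are illustrative assumptions; the paper sets this weight through its probabilistic graphical model rather than by hand.

```python
import numpy as np

def photometric_cost(patch_ref, patch_src):
    """1 - NCC, in [0, 2]: low when the two patches match photometrically."""
    a = patch_ref - patch_ref.mean()
    b = patch_src - patch_src.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-8
    return 1.0 - (a * b).sum() / denom

def planar_cost(depth, plane, pixel):
    """Gap between a depth hypothesis and the depth a fitted plane predicts."""
    a, b, c = plane                       # depth(x, y) ~ a*x + b*y + c
    return abs(depth - (a * pixel[0] + b * pixel[1] + c))

def aggregated_cost(patch_ref, patch_src, depth, plane, pixel, w_plane):
    """Blend photometric consistency with planar compatibility."""
    return (1 - w_plane) * photometric_cost(patch_ref, patch_src) \
           + w_plane * planar_cost(depth, plane, pixel)

rng = np.random.default_rng(1)
ref = rng.standard_normal((5, 5))
plane = (0.0, 0.0, 2.0)                   # a fronto-parallel plane at depth 2
cost_on_plane  = aggregated_cost(ref, ref, 2.0, plane, (3, 4), 0.5)
cost_off_plane = aggregated_cost(ref, ref, 2.5, plane, (3, 4), 0.5)
print(cost_on_plane < cost_off_plane)  # True: the planar term penalizes outliers
```

With identical patches the photometric term vanishes, so the two hypotheses differ only in how well they agree with the plane, which is exactly the regularizing effect the planar term is meant to supply in textureless regions.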
This paper proposes a simple and practical solution, based on a co-located camera-light scanner device, to the challenge of establishing cross-view correspondences where photometric constancy is violated, and develops an optimization algorithm that robustly recovers globally optimal shape and reflectance even from a random initialization.
This work presents an effective self-supervised training scheme and novel loss design for dense descriptor learning and demonstrates that the proposed dense descriptor can generalize to unseen patients and scopes, thereby largely improving the performance of Structure from Motion (SfM) in terms of model density and completeness.
This paper endows coordinate-based representations with a probabilistic shape prior that enables faster convergence and better generalization when using few input images, and achieves high-fidelity head reconstructions with a high level of detail that consistently outperform both state-of-the-art 3D Morphable Model methods in the few-shot scenario and nonparametric methods when large sets of views are available.
LegoFormer is proposed, a transformer model for voxel-based 3D reconstruction that uses the attention layers to share information among views during all computational stages, and to parametrize the output with a series of low-rank decomposition factors.
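Parametrizing the output with low-rank decomposition factors means the predicted occupancy grid is not emitted voxel by voxel but as a small set of vector triples whose outer products are summed, as in a rank-R CP decomposition. A minimal sketch of the reconstruction step (the factor count and grid size are illustrative, not LegoFormer's actual settings):

```python
import numpy as np

def voxels_from_factors(fx, fy, fz):
    """Sum of rank-1 outer products: (R,D), (R,H), (R,W) -> (D,H,W) grid."""
    return np.einsum('rd,rh,rw->dhw', fx, fy, fz)

rng = np.random.default_rng(0)
R, D = 4, 32                        # 4 decomposition factors, a 32^3 grid
fx, fy, fz = (rng.random((R, D)) for _ in range(3))
grid = voxels_from_factors(fx, fy, fz)
print(grid.shape)                   # (32, 32, 32) from only 3 * 4 * 32 numbers
```

The appeal of this parametrization is the output size: the network predicts 3·R·D numbers instead of D³, and each factor triple tends to describe one coherent "building block" of the object.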
3D-RETR is proposed, which performs end-to-end 3D REconstruction with TRansformers from either a single view or multiple views.