3260 papers • 126 benchmarks • 313 datasets
These leaderboards are used to track progress in Image-to-3D.
Use these libraries to find Image-to-3D models and implementations.
This paper presents the Collaborative Neural Rendering (CoNR) method, which creates new images for specified poses from a few reference images (AKA Character Sheets), and collects a character sheet dataset containing over 700,000 hand-drawn and synthesized images of diverse poses to facilitate research in this area.
This work shows how transformations can be learned with no training examples by learning them on another domain and then transferring them to the target domain, and demonstrates this on an image retrieval task where the search query is an image plus an additional transformation specification.
Experiments show that SyncDreamer generates images with high consistency across different views, making it well-suited for various 3D generation tasks such as novel view synthesis, text-to-3D, and image-to-3D.
Multi-view ControlNet (MVControl) is introduced, a novel neural network architecture designed to enhance existing pre-trained multi-view diffusion models by integrating additional input conditions, such as edge, depth, normal, and scribble maps, to address controllable text-to-3D generation.
InstantMesh is presented, a feed-forward framework for instant 3D mesh generation from a single image, featuring state-of-the-art generation quality and significant training scalability, and is able to create diverse 3D assets within 10 seconds.
A Deep Conditional Variational Autoencoder based model is proposed that synthesizes diverse, anatomically plausible 3D-pose samples conditioned on the estimated 2D pose, and it is shown that the CVAE-based 3D-pose sample set is consistent with the 2D-to-3D lifting and helps tackle the inherent ambiguity in 2D-to-3D lifting.
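The core idea of conditioning a generative model on a 2D pose to obtain multiple plausible 3D poses can be sketched as follows. This is a minimal, hypothetical illustration with randomly initialised decoder weights standing in for a trained network; the dimensions, names, and architecture are assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N_JOINTS = 16   # joints in the skeleton (illustrative)
Z_DIM = 8       # latent dimension (illustrative)
HIDDEN = 64

# Random weights stand in for a trained CVAE decoder.
W1 = rng.normal(scale=0.1, size=(N_JOINTS * 2 + Z_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, N_JOINTS * 3))

def decode(pose_2d, z):
    """Map a 2D pose plus one latent code to one 3D-pose hypothesis."""
    h = np.tanh(np.concatenate([pose_2d.ravel(), z]) @ W1)
    return (h @ W2).reshape(N_JOINTS, 3)

def sample_3d_poses(pose_2d, n_samples=10):
    """Draw several latents z ~ N(0, I); each decodes to a distinct
    3D pose consistent with the same 2D observation, which is how a
    CVAE expresses the ambiguity of 2D-to-3D lifting."""
    return np.stack([decode(pose_2d, rng.standard_normal(Z_DIM))
                     for _ in range(n_samples)])

pose_2d = rng.standard_normal((N_JOINTS, 2))  # an estimated 2D pose
samples = sample_3d_poses(pose_2d, n_samples=5)
print(samples.shape)  # (5, 16, 3): five 3D hypotheses for one 2D pose
```

The point of the sketch is the interface: one 2D input, many sampled latents, many 3D outputs.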
A set of aggressive optimization strategies is developed to reinforce the imperceptible generation of adversarial examples that mislead victim models, and a local aggressive adversarial attack (L3A) is proposed to solve the above issues.
This work presents a discrete descriptor that can represent the object surface densely by incorporating a hierarchical binary grouping, and proposes a coarse-to-fine training strategy that enables fine-grained correspondence prediction for 6DoF pose estimation.
This work proposes a novel framework, dubbed NeuralLift-360, that utilizes a depth-aware neural radiance representation (NeRF) and learns to craft the scene guided by denoising diffusion models and can be guided with rough depth estimation in the wild.