(Image credit: Zhang et al., via Papersgraph)
This work introduces Dynamical Variational Autoencoders (DVAEs), a general class of models encompassing a large family of temporal VAE extensions that model not only the latent space but also the temporal dependencies within a sequence of data vectors and/or the corresponding latent vectors.
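The core DVAE idea can be illustrated with a minimal generative rollout: unlike a standard VAE, each latent z_t depends on the previous latent z_{t-1}, so temporal structure lives in the latent sequence itself. This is a hypothetical stdlib-only sketch, not any paper's actual model; the transition and emission functions are stand-ins.

```python
import math
import random

def generate_sequence(T, latent_dim=2, seed=0):
    """Roll out a toy DVAE-style generative model for T steps."""
    rng = random.Random(seed)
    z_prev = [0.0] * latent_dim
    xs = []
    for _ in range(T):
        # Latent transition p(z_t | z_{t-1}): damped linear dynamics + noise
        # (stand-in for a learned transition network).
        z = [0.9 * zp + 0.1 * rng.gauss(0.0, 1.0) for zp in z_prev]
        # Emission p(x_t | z_t): a fixed nonlinear decoder (stand-in).
        x = [math.tanh(zi) for zi in z]
        xs.append(x)
        z_prev = z
    return xs

seq = generate_sequence(50)
```

The training objective (a sequential ELBO) is omitted; the point is only that the latent chain, not just the observations, carries the dynamics.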
The approach is designed to learn from videos with only 2D pose annotations in a semi-supervised manner, and achieves state-of-the-art performance on the 3D prediction task without any fine-tuning.
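The semi-supervised recipe can be sketched as a combined loss: frames with 3D labels contribute a direct 3D loss, while frames with only 2D annotations contribute a reprojection loss comparing the projected 3D prediction to the 2D pose. Everything here (the orthographic projection, the loss form, the function names) is a hypothetical stand-in, not the paper's implementation.

```python
def project(joint3d):
    # Stand-in orthographic projection: drop the depth coordinate.
    return joint3d[:2]

def l2(a, b):
    # Squared L2 distance between two coordinate lists.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def semi_supervised_loss(pred3d, label3d=None, label2d=None):
    """Combine a supervised 3D term with a 2D reprojection term."""
    loss = 0.0
    if label3d is not None:            # fully labeled frame
        loss += l2(pred3d, label3d)
    if label2d is not None:            # 2D-only frame: supervise via projection
        loss += l2(project(pred3d), label2d)
    return loss
```

In practice the two terms would be weighted and summed over joints and frames; the sketch only shows how 2D-only data still produces a training signal.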
This work presents perhaps the first approach for predicting a future 3D mesh model sequence of a person from past video input; inspired by the success of autoregressive models in language modeling, it learns an intermediate latent space in which to predict the future.
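The autoregressive-in-latent-space idea can be sketched as: encode past observations, roll a one-step predictor forward in latent space, and decode each predicted latent back to an output frame (mesh parameters in the paper). The encoder, predictor, and decoder below are trivial stand-ins, not the paper's networks.

```python
def encode(frame):
    # Stand-in encoder: observation -> latent vector.
    return [v * 0.5 for v in frame]

def predict_next(latent):
    # Stand-in one-step autoregressive predictor in latent space.
    return [0.95 * v for v in latent]

def decode(latent):
    # Stand-in decoder: latent vector -> output frame.
    return [v * 2.0 for v in latent]

def forecast(past_frames, horizon):
    """Autoregressively predict `horizon` future frames via the latent space."""
    z = encode(past_frames[-1])        # seed from the most recent observation
    future = []
    for _ in range(horizon):
        z = predict_next(z)            # autoregress in latent space
        future.append(decode(z))       # map each latent back to frame space
    return future

preds = forecast([[1.0, 2.0]], horizon=3)
```

The key design choice mirrored here is that iteration happens on latents rather than raw frames, which keeps long-horizon rollouts cheap and avoids compounding pixel-level errors.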
This paper proposes InterDiff, a framework comprising two key steps: interaction diffusion, where a diffusion model is leveraged to encode the distribution of future human-object interactions (HOIs); and interaction correction, where a physics-informed predictor is introduced to correct the denoised HOIs at each diffusion step.
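The two-step structure can be sketched as a reverse-diffusion loop in which each denoising step is followed by a physics-informed correction. Both functions below are hypothetical stand-ins: the "denoiser" just shrinks the state toward a target, and the "physics" correction merely clamps coordinates above a floor, in place of a learned physics-informed predictor.

```python
def denoise_step(hoi, t):
    # Stand-in denoiser: nudge the noisy HOI state toward a clean one.
    return [0.8 * v for v in hoi]

def physics_correct(hoi, floor=0.0):
    # Stand-in physics correction: forbid coordinates below the floor.
    return [max(v, floor) for v in hoi]

def sample_interaction(noisy_hoi, steps=10):
    """Alternate denoising and physics correction over the reverse process."""
    hoi = noisy_hoi
    for t in reversed(range(steps)):
        hoi = denoise_step(hoi, t)     # interaction diffusion step
        hoi = physics_correct(hoi)     # interaction correction step
    return hoi

result = sample_interaction([1.5, -2.0, 0.3])
```

The point of interleaving the correction inside the sampling loop, rather than applying it once at the end, is that each subsequent denoising step starts from a physically plausible state.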
This work enables autoregressive modeling of implicit avatars and introduces the notion of articulated observer points, which relate implicit states to the explicit surface of a parametric human body model. It demonstrates that encoding implicit surfaces as a set of height fields defined on articulated observer points generalizes significantly better than a latent representation.
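A height-field encoding of this flavor can be sketched as follows: each observer point sits on an explicit surface with an outward normal, and the implicit surface is summarized by the offset along that normal at which a signed distance function (SDF) crosses zero. The marching search and the toy sphere SDF are illustrative assumptions, not the paper's method.

```python
def height_field(observer_points, normals, sdf, step=0.01, max_h=0.2):
    """For each observer point, find the offset along its normal where
    the SDF changes sign (a crude stand-in for a learned height field)."""
    heights = []
    for p, n in zip(observer_points, normals):
        h = 0.0
        # March outward along the normal until leaving the implicit surface.
        while sdf([pi + h * ni for pi, ni in zip(p, n)]) < 0 and h < max_h:
            h += step
        heights.append(h)
    return heights

# Toy SDF: unit sphere centered at the origin (negative inside).
sdf = lambda q: sum(v * v for v in q) ** 0.5 - 1.0

# One observer point just inside the sphere, looking outward along +x.
hs = height_field([[0.9, 0.0, 0.0]], [[1.0, 0.0, 0.0]], sdf)
```

The resulting per-point heights form a set of scalar fields anchored to the explicit body surface, which is the representational idea the abstract contrasts with a global latent code.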