Unconditional video generation is the task of synthesizing realistic video clips from random noise, without any conditioning input such as text, class labels, or reference frames.
These leaderboards are used to track progress in unconditional video generation.
Use these libraries to find unconditional video generation models and implementations.
A two-stage MOtion, Scene and Object decomposition framework (MOSO) that achieves new state-of-the-art performance on five challenging benchmarks for video prediction and unconditional video generation, and can be easily extended to video frame interpolation tasks.
This work studies the use of Neural Differential Equations to model the temporal dynamics of video generation and investigates how changes in the temporal model affect the quality of the generated videos.
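As a rough illustration of the idea, the sketch below (an assumption, not the paper's code) uses the torchdiffeq library to integrate a learned ODE over a per-frame latent code; `LatentDynamics`, the timestamps, and the commented-out `decoder` are hypothetical stand-ins.

```python
# Minimal sketch: modeling a video's latent trajectory with a Neural ODE.
import torch
import torch.nn as nn
from torchdiffeq import odeint  # pip install torchdiffeq


class LatentDynamics(nn.Module):
    """Parameterizes dz/dt for a per-frame latent code z (autonomous ODE)."""

    def __init__(self, latent_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, t, z):
        # t is unused: the dynamics here do not depend explicitly on time.
        return self.net(z)


dynamics = LatentDynamics()
z0 = torch.randn(4, 128)                     # batch of initial latents
t = torch.linspace(0.0, 1.0, steps=16)       # 16 frame timestamps
z_traj = odeint(dynamics, z0, t)             # (16, 4, 128) latent trajectory
# frames = decoder(z_traj.reshape(-1, 128))  # hypothetical per-frame decoder
```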
This work proposes a large-scale, high-quality, and diverse video dataset with rich facial attribute annotations, named the High-Quality Celebrity Video Dataset (CelebV-HQ), and conducts a comprehensive analysis in terms of age, ethnicity, brightness stability, motion smoothness, head pose diversity, and data quality.
This work designs a novel framework for building the motion space, aiming to achieve content consistency and fast convergence for video generation, and proposes an image-pair generator named MotionStyleGAN that produces image pairs sharing the same content but exhibiting different motions.
A local-global context guidance strategy that captures the multi-perceptual embedding of the past fragment to improve the consistency of future predictions, and a two-stage training strategy that alleviates the effect of noisy frames for more stable prediction.
The interpretable nature of DDLP allows us to perform "what-if" generation, predicting the consequence of changing properties of objects in the initial frames, and DLP's compact structure enables efficient diffusion-based unconditional video generation.
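To illustrate what such a "what-if" edit looks like, here is a hypothetical toy sketch: object latents extracted from the context frames are edited, and a stand-in dynamics module rolls the edited state forward. `ParticleDynamics` and all shapes are illustrative assumptions; DDLP's actual transformer dynamics and decoder are not shown.

```python
import torch
import torch.nn as nn

K, D = 8, 16  # particles per frame, features per particle (toy values)


class ParticleDynamics(nn.Module):
    """Toy stand-in for a learned particle dynamics model (one step ahead)."""

    def __init__(self):
        super().__init__()
        self.step = nn.Linear(K * D, K * D)

    def forward(self, particles, horizon):
        states, z = [], particles[:, -1].reshape(len(particles), -1)
        for _ in range(horizon):
            z = self.step(z)
            states.append(z.reshape(len(particles), 1, K, D))
        return torch.cat(states, dim=1)


particles = torch.randn(1, 4, K, D)  # (batch, time, K, D) from the encoder
edited = particles.clone()
# "What-if" edit, assuming the first two features encode (x, y) position:
edited[:, :, 0, :2] += torch.tensor([0.2, 0.0])  # nudge object 0 to the right

future = ParticleDynamics()(edited, horizon=12)
trajectory = torch.cat([edited, future], dim=1)  # decode with the model's decoder
```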
A novel motion generator design that uses a learning-based GAN inversion network, benefits from sparse training, and naturally constrains the generation space of the motion generator with the inversion network guided by the initial frame, eliminating the need for heavy discriminators.
This work proposes multi-scale deep feature warping (MSDFW), which warps the intermediate features of a pre-trained StyleGAN at different resolutions to automatically generate cinemagraphs from a single still landscape image.
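The core warping operation can be sketched as follows; this is an illustrative assumption, not the paper's implementation. A shared motion field is resized to each feature resolution and used to backward-warp hypothetical StyleGAN activations with `torch.nn.functional.grid_sample`.

```python
import torch
import torch.nn.functional as F


def warp(feat, flow):
    """Backward-warp a feature map by a per-pixel flow field.

    feat: (B, C, H, W); flow: (B, 2, H, W) in pixel units.
    """
    B, _, H, W = feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=feat.dtype),
        torch.arange(W, dtype=feat.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow  # (B, 2, H, W)
    # Normalize to [-1, 1]; grid_sample expects a (B, H, W, 2) grid in (x, y).
    grid_x = 2.0 * grid[:, 0] / max(W - 1, 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / max(H - 1, 1) - 1.0
    norm_grid = torch.stack((grid_x, grid_y), dim=-1)
    return F.grid_sample(feat, norm_grid, align_corners=True)


# Warp features from several generator layers with one shared motion field,
# rescaled to each feature resolution (all tensors here are placeholders).
flow = torch.randn(1, 2, 256, 256)            # e.g. a looping water-motion field
features = [torch.randn(1, 512, 32, 32),      # hypothetical StyleGAN activations
            torch.randn(1, 256, 64, 64),
            torch.randn(1, 128, 128, 128)]
warped = [
    warp(f, F.interpolate(flow, size=f.shape[-2:], mode="bilinear",
                          align_corners=True) * (f.shape[-1] / flow.shape[-1]))
    for f in features
]
```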