3260 papers • 126 benchmarks • 313 datasets
Image to Video Generation is the task of generating a sequence of video frames conditioned on a single still image (or a small set of still images). The goal is to produce a video that is coherent in appearance, motion, and style, and temporally consistent, so that the generated frames read as a plausible, ordered sequence rather than independent pictures. The task is typically tackled with deep generative models, such as Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs), trained on large video datasets. These models learn to generate plausible frames conditioned on the input image and, optionally, on auxiliary signals such as audio or text.
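As a minimal illustration of the conditioning idea described above, the PyTorch sketch below encodes a still image into a feature vector, combines it with a noise latent, and rolls that conditioning through a recurrent core so consecutive frames share state (one simple way to encourage temporal consistency). It is a toy example, not a model from any particular paper; the class name, layer sizes, 64x64 resolution, and 16-frame length are all assumptions chosen for brevity.

```python
import torch
import torch.nn as nn


class ImageToVideoGenerator(nn.Module):
    """Toy GAN-style generator: encodes a conditioning image, mixes it with a
    noise latent, and decodes a short sequence of frames from a recurrent state."""

    def __init__(self, num_frames=16, latent_dim=64):
        super().__init__()
        self.num_frames = num_frames
        self.latent_dim = latent_dim
        # Encoder: compress the 64x64 conditioning image to a feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),                                # (B, 64)
        )
        # Recurrent core: evolves a per-frame state over time so that
        # consecutive frames stay temporally consistent.
        self.rnn = nn.GRU(64 + latent_dim, 128, batch_first=True)
        # Decoder: maps each per-frame state back to an RGB frame.
        self.decoder = nn.Sequential(
            nn.Linear(128, 64 * 8 * 8),
            nn.ReLU(inplace=True),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 8 -> 16
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),   # 32 -> 64
            nn.Tanh(),
        )

    def forward(self, image, noise=None):
        b = image.size(0)
        feat = self.encoder(image)                               # (B, 64)
        if noise is None:
            noise = torch.randn(b, self.latent_dim, device=image.device)
        cond = torch.cat([feat, noise], dim=1)                   # (B, 64 + latent_dim)
        # Repeat the conditioning vector for every time step.
        seq = cond.unsqueeze(1).repeat(1, self.num_frames, 1)    # (B, T, 128)
        states, _ = self.rnn(seq)                                # (B, T, 128)
        frames = self.decoder(states.reshape(b * self.num_frames, -1))
        return frames.view(b, self.num_frames, 3, 64, 64)


if __name__ == "__main__":
    gen = ImageToVideoGenerator()
    still = torch.randn(2, 3, 64, 64)   # a batch of conditioning images
    video = gen(still)
    print(video.shape)                  # torch.Size([2, 16, 3, 64, 64])
```

In practice such a generator would be trained adversarially against a video discriminator (or, for a VAE variant, with a reconstruction plus KL objective) on a large video dataset, as the description above outlines.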