Image animation is the task of animating a source image according to the motion of a driving video.
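Concretely, most of the methods below predict a dense motion field from the driving frames and use it to warp the source image or its features. The sketch below illustrates only that warping step with a synthetic zero-motion flow; `warp_source` and the toy tensors are hypothetical stand-ins, not code from any of the papers listed here.

```python
# Minimal sketch of the warping step shared by most warp-based image-animation
# methods: a dense motion field estimated from the driving video is used to
# backward-warp the source image. All tensors here are synthetic stand-ins.
import torch
import torch.nn.functional as F

def warp_source(source: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp `source` (B, C, H, W) with a dense flow field
    `flow` (B, H, W, 2) given as offsets in normalized [-1, 1] coordinates."""
    b, _, h, w = source.shape
    # Identity sampling grid in normalized coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    identity = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    # Offset the grid by the motion predicted from the driving frame.
    return F.grid_sample(source, identity + flow, align_corners=True)

source = torch.rand(1, 3, 256, 256)   # source image
flow = torch.zeros(1, 256, 256, 2)    # zero motion -> identity warp
animated_frame = warp_source(source, flow)
```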
The proposed AnimateDiff is a practical framework for animating personalized text-to-image (T2I) models without requiring model-specific tuning; MotionLoRA is a lightweight fine-tuning technique for AnimateDiff that enables a pre-trained motion module to adapt to new motion patterns, such as different shot types, at low training and data-collection cost.
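AnimateDiff is also available through the Hugging Face diffusers library; the snippet below is a minimal inference sketch based on that integration, where the specific model IDs, scheduler settings, and prompt are illustrative choices that may differ across library versions.

```python
# Minimal AnimateDiff text-to-video inference sketch via diffusers.
# Model IDs and scheduler settings are illustrative; adjust for your setup.
import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",   # any SD 1.5 personalized T2I model
    motion_adapter=adapter,
)
pipe.scheduler = DDIMScheduler.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE", subfolder="scheduler",
    beta_schedule="linear", clip_sample=False, timestep_spacing="linspace",
)
pipe.enable_model_cpu_offload()

output = pipe(
    prompt="a corgi running on the beach, golden hour",
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
    generator=torch.Generator("cpu").manual_seed(42),
)
export_to_gif(output.frames[0], "animation.gif")
```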
This framework decouples appearance and motion information using a self-supervised formulation and uses a representation consisting of a set of learned keypoints together with their local affine transformations to support complex motions.
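As a rough illustration of that representation, the first-order idea approximates the source-from-driving mapping near each keypoint with an affine transform built from the keypoint locations and their local Jacobians. The toy sketch below shows only the per-keypoint transform (the blending of keypoint contributions into a dense motion field by predicted masks is omitted), and all names are hypothetical.

```python
# Illustrative sketch (not the authors' code) of the first-order motion idea:
# each learned keypoint carries a local affine approximation of the
# source <- driving mapping around its location.
import numpy as np

def local_motion(z, kp_src, kp_drv, jac_src, jac_drv):
    """Map a driving-frame coordinate `z` (2,) to source-frame coordinates
    using the local affine transform attached to a single keypoint."""
    # Combined Jacobian J = J_src @ inv(J_drv), as in the first-order formulation.
    J = jac_src @ np.linalg.inv(jac_drv)
    return kp_src + J @ (z - kp_drv)

# Toy example: one keypoint with a 30-degree local rotation in the driving frame.
theta = np.deg2rad(30)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
z = np.array([0.2, 0.1])  # query point near the keypoint
print(local_motion(z, kp_src=np.zeros(2), kp_drv=np.zeros(2),
                   jac_src=np.eye(2), jac_drv=rot))
```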
This work presents a novel approach for image animation of a source image by a driving video, both depicting the same type of object, and is shown to greatly outperform state-of-the-art methods on multiple benchmarks.
This work introduces MagicAnimate, a diffusion-based framework that aims to enhance temporal consistency, faithfully preserve the reference image, and improve animation fidelity, and it outperforms the strongest baseline on the challenging TikTok dancing dataset.
This paper proposes a practical framework, named Follow-Your-Click, that achieves image animation from a simple user click and a short motion prompt, offering simpler yet more precise user control and better generation performance than previous methods.
This paper introduces a novel deep learning framework for image animation that generates a video in which the target object is animated according to the driving sequence, using a deep architecture that decouples appearance and motion information.
We present the Creative Flow+ Dataset, the first diverse multi-style artistic video dataset richly labeled with per-pixel optical flow, occlusions, correspondences, segmentation labels, normals, and depth. Our dataset includes 3000 animated sequences rendered using styles randomly selected from 40 textured line styles and 38 shading styles, spanning the range between flat cartoon fill and wildly sketchy shading. Our dataset includes 124K+ train set frames and 10K test set frames rendered at 1500x1500 resolution, far surpassing the largest available optical flow datasets in size. While modern techniques for tasks such as optical flow estimation achieve impressive performance on realistic images and video, today there is no way to gauge their performance on non-photorealistic images. Creative Flow+ poses a new challenge to generalize real-world Computer Vision to messy stylized content. We show that learning-based optical flow methods fail to generalize to this data and struggle to compete with classical approaches, and invite new research in this area. Our dataset and a new optical flow benchmark will be publicly available at: www.cs.toronto.edu/creativeflow/. We further release the complete dataset creation pipeline, allowing the community to generate and stylize their own data on demand.
This paper proposes PriorityCut, a novel augmentation approach that uses the top-k percent occluded pixels of the foreground to regularize warp-based image animation and significantly reduces the warping artifacts in state-of-the-art warp-based image animation models on diverse datasets.
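The pixel selection can be pictured as in the sketch below, which keeps the k fraction of foreground pixels with the lowest occlusion values; the exact scoring and mask construction in PriorityCut may differ, so treat the function and its arguments as assumptions for illustration.

```python
# Hypothetical sketch of selecting the top-k percent most-occluded foreground
# pixels from an occlusion map, the kind of mask a PriorityCut-style
# augmentation builds on. Thresholding logic is an assumption for illustration.
import torch

def priority_mask(occlusion: torch.Tensor, foreground: torch.Tensor, k: float = 0.1) -> torch.Tensor:
    """occlusion: (H, W) in [0, 1], lower = more occluded after warping.
    foreground: (H, W) boolean foreground mask.
    Returns a boolean mask over the k fraction of most-occluded foreground pixels."""
    scores = occlusion.masked_fill(~foreground, float("inf"))
    n = max(1, int(k * foreground.sum().item()))
    threshold = scores.flatten().kthvalue(n).values
    return (scores <= threshold) & foreground

occ = torch.rand(64, 64)
fg = torch.ones(64, 64, dtype=torch.bool)
mask = priority_mask(occ, fg, k=0.1)
print(mask.float().mean())  # roughly 0.1 of the pixels are selected
```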
A differentiable global-flow local-attention framework is proposed to reassemble the inputs at the feature level, and this framework is shown to spatially transform the inputs in an efficient manner.
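The local-attention part can be pictured as sampling a small neighborhood of source features around each flow-displaced location and blending it with learned weights; the sketch below is a rough illustration under that reading, not the paper's implementation, and all names and shapes are assumptions.

```python
# Rough sketch of a "global flow + local attention" feature transform:
# gather a k x k neighborhood around each flow-sampled feature and blend it
# with softmax attention weights. Illustrative only.
import torch
import torch.nn.functional as F

def local_attention_sample(features, grid, attn_logits, k=3):
    """features: (B, C, H, W) source features.
    grid: (B, H, W, 2) normalized sampling grid produced by the global flow.
    attn_logits: (B, k*k, H, W) per-location logits over the k x k neighborhood."""
    b, c, h, w = features.shape
    # Sample the flow-displaced features, then collect their k x k neighborhoods.
    warped = F.grid_sample(features, grid, align_corners=True)       # (B, C, H, W)
    neigh = F.unfold(warped, kernel_size=k, padding=k // 2)          # (B, C*k*k, H*W)
    neigh = neigh.view(b, c, k * k, h, w)
    attn = torch.softmax(attn_logits, dim=1).unsqueeze(1)            # (B, 1, k*k, H, W)
    return (neigh * attn).sum(dim=2)                                 # (B, C, H, W)

feat = torch.rand(1, 8, 32, 32)
# Identity grid stands in for the predicted global flow.
grid = F.affine_grid(torch.eye(2, 3).unsqueeze(0), size=(1, 8, 32, 32), align_corners=True)
logits = torch.rand(1, 9, 32, 32)
out = local_attention_sample(feat, grid, logits)
```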
A new human face animation dataset, called DeepFake MNIST+, is proposed, generated by a SOTA image animation generator; it includes 10,000 facial animation videos of ten different actions that can spoof recent liveness detectors.