Video style transfer applies the visual style of a reference image or video to the frames of an input video while preserving the video's content and maintaining temporal coherence between frames.
A style-aware content loss is proposed and trained jointly with a deep encoder-decoder network for real-time, high-resolution stylization of images and videos; results show that this approach better captures the subtle ways in which a style affects content.
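As a point of reference, here is a minimal sketch of the idea behind a style-aware content loss, assuming a jointly trained `encoder` (any PyTorch module): content similarity is measured in the encoder's own learned latent space rather than in fixed VGG features, so the notion of "content" adapts to the style being learned.

```python
import torch

def style_aware_content_loss(encoder, frame, stylized):
    # Features come from the jointly trained encoder, not a fixed
    # pretrained network, so the latent space itself adapts to the
    # style and decides which content details must be preserved.
    f_in = encoder(frame)
    f_out = encoder(stylized)
    return torch.mean((f_in - f_out) ** 2)
```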
A novel real-time video style transfer model, ReCoNet, is proposed that generates temporally coherent stylized videos while maintaining a favorable perceptual style, exhibiting outstanding performance both qualitatively and quantitatively.
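Temporal coherence of this kind is typically enforced with a flow-warped consistency loss. The sketch below shows the common output-level form, assuming a precomputed optical `flow` (in pixels) and an `occlusion_mask`; ReCoNet additionally applies a feature-level term, omitted here.

```python
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (N,C,H,W) by optical `flow` (N,2,H,W)."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=frame.device),
        torch.arange(w, device=frame.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    gx = 2.0 * grid[:, 0] / (w - 1) - 1.0
    gy = 2.0 * grid[:, 1] / (h - 1) - 1.0
    return F.grid_sample(frame, torch.stack((gx, gy), dim=-1),
                         align_corners=True)

def temporal_loss(stylized_t, stylized_prev, flow, occlusion_mask):
    # Penalize changes between the current stylized frame and the
    # flow-warped previous stylized frame, outside occluded regions.
    warped = warp(stylized_prev, flow)
    return torch.mean(occlusion_mask * (stylized_t - warped) ** 2)
```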
A novel attention and normalization module, Adaptive Attention Normalization (AdaAttN), is proposed to adaptively perform attentive normalization on a per-point basis, achieving state-of-the-art arbitrary image and video style transfer.
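A simplified sketch of per-point attentive normalization, following the published description of AdaAttN but omitting its learned 1x1 projections: each content position receives its own mean and standard deviation, computed as attention-weighted statistics over all style positions.

```python
import torch
import torch.nn.functional as F

def ada_attn(fc, fs, eps=1e-5):
    """fc: content features (N, C, Hc, Wc); fs: style features (N, C, Hs, Ws)."""
    n, c, h, w = fc.shape
    q = F.instance_norm(fc).flatten(2)                   # (N, C, Hc*Wc)
    k = F.instance_norm(fs).flatten(2)                   # (N, C, Hs*Ws)
    v = fs.flatten(2).transpose(1, 2)                    # (N, Hs*Ws, C)
    attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)  # (N, Hc*Wc, Hs*Ws)
    mean = attn @ v                                      # per-point mean
    var = attn @ (v * v) - mean ** 2                     # per-point variance
    std = torch.sqrt(var.clamp_min(eps))
    out = std * q.transpose(1, 2) + mean                 # scale and shift
    return out.transpose(1, 2).reshape(n, c, h, w)
```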
A method is proposed that decomposes and "unwraps" an input video into a set of layered 2D atlases, each providing a unified representation of the appearance of an object (or the background) over the video; it requires no prior 3D knowledge of scene geometry or camera poses.
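The central idea fits in a few lines: a coordinate network maps every video location (x, y, t) to a point in a shared 2D atlas, so an edit painted once on the atlas propagates consistently to all frames. The module below is an illustrative, hypothetical sketch, not the paper's full architecture (which also predicts per-layer opacities and reconstructs colors with a second network).

```python
import torch.nn as nn

class AtlasMapping(nn.Module):
    """Maps video coordinates (x, y, t) to atlas coordinates (u, v)."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2), nn.Tanh(),  # (u, v) in [-1, 1]
        )

    def forward(self, xyt):
        # Pixels of the same object across frames should land on the
        # same atlas point, giving a unified appearance representation.
        return self.net(xyt)
```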
We present the Creative Flow+ Dataset, the first diverse multi-style artistic video dataset richly labeled with per-pixel optical flow, occlusions, correspondences, segmentation labels, normals, and depth. The dataset includes 3000 animated sequences rendered using styles randomly selected from 40 textured line styles and 38 shading styles, spanning the range between flat cartoon fill and wildly sketchy shading. In total, it includes 124K+ training frames and 10K test frames rendered at 1500x1500 resolution, far surpassing the largest available optical flow datasets in size. While modern techniques for tasks such as optical flow estimation achieve impressive performance on realistic images and video, today there is no way to gauge their performance on non-photorealistic images. Creative Flow+ poses a new challenge: generalizing real-world Computer Vision to messy stylized content. We show that learning-based optical flow methods fail to generalize to this data and struggle to compete with classical approaches, and we invite new research in this area. Our dataset and a new optical flow benchmark will be publicly available at: www.cs.toronto.edu/creativeflow/. We further release the complete dataset creation pipeline, allowing the community to generate and stylize their own data on demand.
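Since Creative Flow+ ships per-pixel ground-truth flow, methods can be scored with the standard average endpoint error metric, sketched below; the benchmark's exact evaluation protocol may differ.

```python
import numpy as np

def average_endpoint_error(flow_pred, flow_gt, valid=None):
    """Mean Euclidean distance between predicted and ground-truth
    flow vectors; arrays have shape (H, W, 2), and `valid` is an
    optional boolean mask excluding occluded or unlabeled pixels."""
    epe = np.linalg.norm(flow_pred - flow_gt, axis=-1)
    if valid is not None:
        epe = epe[valid]
    return float(epe.mean())
```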
In recent years, neural style transfer has attracted increasing attention, especially for images. Temporally consistent style transfer for videos, however, remains a challenging problem. Existing methods either rely on large amounts of video data with optical flow or use single-frame regularizers; they fail to handle strong motions or complex variations and therefore have limited performance on real videos. In this article, we address the problem by jointly considering the intrinsic properties of stylization and temporal consistency. We first identify the cause of the conflict between style transfer and temporal consistency, and propose to reconcile this contradiction by relaxing the objective function so as to make the stylization loss term more robust to motion. Through relaxation, style transfer becomes more robust to inter-frame variation without degrading the subjective effect. Then, we provide a novel formulation and understanding of temporal consistency. Based on this formulation, we analyze the drawbacks of existing training strategies and derive a new regularization. We show by experiments that the proposed regularization better balances spatial and temporal performance. Based on relaxation and regularization, we design a zero-shot video style transfer framework. Moreover, for better feature migration, we introduce a new module to dynamically adjust inter-channel distributions. Quantitative and qualitative results demonstrate the superiority of our method over state-of-the-art style transfer methods. Our project is publicly available at: https://daooshee.github.io/ReReVST/.
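For context on the "single-frame regularizers" the abstract contrasts against: the generic idea is to make the stylization network stable under small input perturbations, which indirectly improves frame-to-frame stability. The sketch below illustrates that baseline idea only; it is not the paper's proposed regularization.

```python
import torch

def single_frame_regularizer(model, frame, noise_std=0.02):
    # A network whose output barely changes under small input noise
    # also tends to change little between neighboring video frames.
    noise = noise_std * torch.randn_like(frame)
    return torch.mean((model(frame) - model(frame + noise)) ** 2)
```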
This paper makes the mild and reasonable assumption that global inconsistency is dominated by local inconsistencies, and devises a generic Contrastive Coherence Preserving Loss (CCPL), applied to local patches, that preserves the coherence of the content source during style transfer without degrading stylization.
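A minimal sketch of the CCPL idea, assuming feature maps of the content and stylized images from a shared encoder: difference vectors between neighboring feature locations should match across the two images, and mismatched pairs act as negatives in an InfoNCE objective. The published loss samples 8-neighborhoods and applies learned projections; this sketch uses horizontal neighbors only.

```python
import torch
import torch.nn.functional as F

def ccpl(feat_content, feat_stylized, n_pairs=64, tau=0.07):
    """Contrastive coherence preserving loss over local differences."""
    n, c, h, w = feat_content.shape
    ys = torch.randint(0, h, (n_pairs,))
    xs = torch.randint(0, w - 1, (n_pairs,))
    # Difference vectors between anchors and their right-hand neighbors.
    d_c = feat_content[:, :, ys, xs] - feat_content[:, :, ys, xs + 1]
    d_s = feat_stylized[:, :, ys, xs] - feat_stylized[:, :, ys, xs + 1]
    d_c = F.normalize(d_c.transpose(1, 2).reshape(-1, c), dim=-1)
    d_s = F.normalize(d_s.transpose(1, 2).reshape(-1, c), dim=-1)
    logits = d_s @ d_c.t() / tau
    labels = torch.arange(logits.size(0), device=logits.device)
    # Each stylized difference should match its own content difference.
    return F.cross_entropy(logits, labels)
```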
FateZero, a zero-shot text-based editing method for real-world videos that requires no per-prompt training or user-specific masks, is proposed; it is the first to demonstrate zero-shot text-driven video style and local attribute editing from a trained text-to-image model.
A new framework, CAP-VSTNet, consisting of a new reversible residual network and an unbiased linear transform module, is proposed for versatile style transfer; it can preserve content affinity without introducing the redundant information of traditional reversible networks, and hence facilitates better stylization.
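CAP-VSTNet's specific reversible network and its unbiased linear transform are not reproduced here; below is the standard additive-coupling block that reversible architectures build on, which shows why such networks are lossless: the input can be reconstructed exactly from the output.

```python
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    """Additive-coupling reversible block: activations can be
    recovered exactly from the output, so no content information
    is destroyed as features pass through the network."""
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.f = nn.Sequential(nn.Conv2d(half, half, 3, padding=1),
                               nn.ReLU(),
                               nn.Conv2d(half, half, 3, padding=1))
        self.g = nn.Sequential(nn.Conv2d(half, half, 3, padding=1),
                               nn.ReLU(),
                               nn.Conv2d(half, half, 3, padding=1))

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return torch.cat([y1, y2], dim=1)

    def inverse(self, y):
        # Exact reconstruction: run the coupling in reverse order.
        y1, y2 = y.chunk(2, dim=1)
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return torch.cat([x1, x2], dim=1)
```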