Image manipulation covers methods that edit the content, style, or attributes of an image, most prominently with generative models such as GANs.
These leaderboards are used to track progress in Image Manipulation.
Use these libraries to find Image Manipulation models and implementations.
SinGAN, an unconditional generative model that can be learned from a single natural image, is introduced; it is trained to capture the internal distribution of patches within the image and can then generate high-quality, diverse samples that carry the same visual content as the image.
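As a rough illustration of the coarse-to-fine sampling this implies, the sketch below assumes a list of trained per-scale generators (the `generators` argument and all names are hypothetical stand-ins, not the paper's API):

```python
import torch
import torch.nn.functional as F

# Minimal sketch of SinGAN-style coarse-to-fine sampling. `generators` is an
# assumed list of per-scale generator networks, ordered coarsest first.
def sample(generators, base_shape, scale_factor=4 / 3):
    h, w = base_shape
    x = None
    for G in generators:
        noise = torch.randn(1, 3, h, w)
        if x is None:
            x = G(noise)  # coarsest scale is generated from pure noise
        else:
            # Upsample the previous output and refine it residually,
            # injecting fresh noise at each scale.
            x_up = F.interpolate(x, size=(h, w), mode="bilinear",
                                 align_corners=False)
            x = x_up + G(x_up + noise)
        h, w = int(h * scale_factor), int(w * scale_factor)
    return x
```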
This work examines the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner and proposes a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
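A minimal sketch of what such a closed-form factorization can look like, assuming the semantic directions are taken as the top eigenvectors of WᵀW for a pre-trained latent projection weight W (a random stand-in matrix is used below):

```python
import numpy as np

# W stands in for the pre-trained weight of the layer that projects the
# latent code into the generator; it is random here purely for illustration.
latent_dim = 512
W = np.random.randn(latent_dim, latent_dim)

# Closed-form factorization: eigenvectors of W^T W, sorted by eigenvalue,
# serve as candidate semantic directions in latent space.
eigvals, eigvecs = np.linalg.eigh(W.T @ W)
directions = eigvecs[:, np.argsort(-eigvals)]

z = np.random.randn(latent_dim)           # a latent code
alpha = 3.0                               # edit strength (arbitrary)
z_edited = z + alpha * directions[:, 0]   # move along the strongest direction
```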
This paper carefully studies the latent space of StyleGAN, the state-of-the-art unconditional generator, and suggests two principles for designing encoders in a manner that allows one to control the proximity of the inversions to regions that StyleGAN was originally trained on.
This work proposes a novel framework termed MaskGAN, enabling diverse and interactive face manipulation, and finds that semantic masks serve as a suitable intermediate representation for flexible face manipulation with fidelity preservation.
The proposed SRFlow is a normalizing-flow-based super-resolution method that learns the conditional distribution of the output given the low-resolution input; it directly accounts for the ill-posed nature of the problem and learns to predict diverse photo-realistic high-resolution images.
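Because the model is an invertible flow, drawing diverse super-resolutions reduces to sampling latents and inverting the network. A sketch under an assumed flow interface (the `flow` object and its method names are hypothetical, not from the SRFlow codebase):

```python
import torch

# Hypothetical interface: `flow.latent_shape(lr)` gives the latent shape for
# a low-resolution input, and `flow.inverse(z, cond=lr)` maps a latent back
# to a high-resolution image.
def sample_super_resolutions(flow, lr_image, n_samples=4, temperature=0.8):
    samples = []
    for _ in range(n_samples):
        # Temperature-scaled Gaussian latent: lower temperature trades
        # diversity for smoother, more conservative outputs.
        z = temperature * torch.randn(flow.latent_shape(lr_image))
        samples.append(flow.inverse(z, cond=lr_image))
    return samples
```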
The existing Neural Style Transfer method is extended to introduce control over spatial location, colour information, and spatial scale, enabling the combination of style information from multiple sources to generate new, perceptually appealing styles from existing ones.
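Spatial control of this kind is often implemented by restricting the style statistics to a region mask. A minimal sketch of a masked Gram matrix (the feature and mask shapes below are assumptions):

```python
import torch

# Gram-matrix style statistics restricted to a spatial region.
# features: (C, H, W) activations from some network layer;
# mask: (H, W) soft region mask with values in [0, 1].
def masked_gram(features, mask):
    f = features * mask            # suppress features outside the region
    f = f.flatten(1)               # (C, H*W)
    # Normalize by the mask area so statistics are comparable across regions.
    return (f @ f.t()) / mask.sum().clamp(min=1.0)
```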
The proposed MaskGIT is a novel image synthesis paradigm using a bidirectional transformer decoder that significantly outperforms the state-of-the-art transformer model on the ImageNet dataset, and accelerates autoregressive decoding by up to 48x.
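The key mechanism is iterative parallel decoding: start fully masked, predict all tokens at once, keep the most confident predictions, and re-mask the rest on a schedule. A sketch under an assumed transformer interface (names below are illustrative):

```python
import math
import torch

# Hypothetical interface: `transformer(tokens)` returns logits of shape
# (batch, seq_len, codebook_size); `mask_id` marks unknown token positions.
def maskgit_decode(transformer, seq_len, mask_id, steps=8):
    tokens = torch.full((seq_len,), mask_id, dtype=torch.long)
    for t in range(steps):
        logits = transformer(tokens.unsqueeze(0))[0]
        conf, pred = logits.softmax(-1).max(-1)   # per-position confidence
        conf[tokens != mask_id] = float("inf")    # already-fixed tokens stay
        tokens = torch.where(tokens == mask_id, pred, tokens)
        # Cosine schedule: fraction of positions left masked after this step.
        n_masked = int(seq_len * math.cos(math.pi / 2 * (t + 1) / steps))
        if n_masked > 0:
            lowest = conf.topk(n_masked, largest=False).indices
            tokens[lowest] = mask_id              # re-mask the least confident
    return tokens
```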
This work explores leveraging the power of recently introduced Contrastive Language-Image Pre-training (CLIP) models to develop a text-based interface for StyleGAN image manipulation that does not require manual effort.
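One common instantiation is latent-code optimization under a CLIP loss. The sketch below assumes pre-trained `G` (a StyleGAN generator) and `clip_model` objects; the method names and loss weights are stand-ins rather than the paper's API:

```python
import torch
import torch.nn.functional as F

# Optimize a StyleGAN latent so the generated image matches a text prompt
# under CLIP. `G` and `clip_model` are assumed pre-trained models.
def text_guided_edit(G, clip_model, w_init, text_features,
                     steps=200, lr=0.05, l2_weight=0.01):
    w = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        image = G(w)
        image_features = clip_model.encode_image(image)
        # The CLIP loss pulls the image embedding toward the text embedding;
        # the L2 term keeps the edit close to the starting latent.
        clip_loss = 1 - F.cosine_similarity(image_features, text_features).mean()
        loss = clip_loss + l2_weight * (w - w_init).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```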
This work proposes DragGAN, a powerful yet much less explored way of controlling GANs that lets users "drag" any points of the image to precisely reach target points in an interactive manner; it consists of feature-based motion supervision that drives each handle point toward its target position and a new point-tracking approach that leverages the discriminative generator features to keep localizing the handle points.
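The point-tracking half can be sketched as a nearest-neighbour search over generator features around the previous handle location (all shapes and names below are assumptions, not the released code):

```python
import torch

# Relocate a handle point after one optimization step by searching the
# generator's feature map for the pixel most similar to the handle's
# original feature vector.
def track_point(feat, f0, p, radius=5):
    # feat: (C, H, W) current generator features; f0: (C,) feature vector
    # sampled at the handle point before editing; p: (row, col) position.
    _, H, W = feat.shape
    r0, c0 = int(p[0]), int(p[1])
    rows = slice(max(0, r0 - radius), min(H, r0 + radius + 1))
    cols = slice(max(0, c0 - radius), min(W, c0 + radius + 1))
    patch = feat[:, rows, cols]
    dist = (patch - f0[:, None, None]).abs().sum(dim=0)  # L1 distance map
    idx = int(torch.argmin(dist))
    dr, dc = divmod(idx, dist.shape[1])
    return rows.start + dr, cols.start + dc  # new handle position
```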
This work proposes a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs, and finds that the latent code of well-trained generative models actually learns a disentangled representation after linear transformations.
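Given that disentanglement, an edit reduces to moving the latent code along the normal of an attribute hyperplane. A minimal sketch follows; in practice the normal comes from a linear classifier fit on labelled samples, and the random vector below is only a stand-in:

```python
import numpy as np

# InterFaceGAN-style edit: move a latent code along the unit normal of a
# hyperplane separating a binary attribute (e.g. smiling vs. not) in latent
# space. The normal would normally come from a linear classifier (e.g. SVM).
def edit_latent(z, normal, alpha):
    normal = normal / np.linalg.norm(normal)
    return z + alpha * normal   # positive alpha strengthens the attribute

latent_dim = 512
z = np.random.randn(latent_dim)
normal = np.random.randn(latent_dim)   # stand-in for a learned boundary
z_edited = edit_latent(z, normal, alpha=2.0)
```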