3260 papers • 126 benchmarks • 313 datasets
Conditional image generation is the task of generating new images from a dataset conditional on their class. ( Image credit: PixelCNN++ )
(Image credit: Papersgraph)
These leaderboards are used to track progress in image-generation
Use these libraries to find image-generation models and implementations
This work introduces a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrates that they are a strong candidate for unsupervised learning.
This work redesigns the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images, and thereby redefines the state of the art in unconditional image modeling.
This work proposes an alternative to clipping weights: penalize the norm of gradient of the critic with respect to its input, which performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
A variant of GANs employing label conditioning that results in 128 x 128 resolution image samples exhibiting global coherence is constructed and it is demonstrated that high resolution samples provide class information not present in low resolution samples.
It is demonstrated, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images, and is expected to open up new application domains for GANs.
The proposed SAGAN achieves the state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing Frechet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset.
A new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented, which significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
It is found that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input.
It is shown that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models, and classifier guidance combines well with upsampling diffusion models, further improving FID to 3.94 on ImageNet 256$\times$256 and 3.85 on imageNet 512$\ times$512.
This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic, and presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.
Adding a benchmark result helps the community track progress.