3260 papers • 126 benchmarks • 313 datasets
Face generation is the task of generating (or interpolating) new faces from an existing dataset. The state-of-the-art results for this task are located in the Image Generation parent. ( Image credit: Progressive Growing of GANs for Improved Quality, Stability, and Variation )
(Image credit: Papersgraph)
These leaderboards are used to track progress in face-generation-1
No benchmarks available.
Use these libraries to find face-generation-1 models and implementations
A new training methodology for generative adversarial networks is described, starting from a low resolution, and adding new layers that model increasingly fine details as training progresses, allowing for images of unprecedented quality.
One of the main challenges in feature learning using Deep Convolutional Neural Networks (DCNNs) for large-scale face recognition is the design of appropriate loss functions that can enhance the discriminative power. Centre loss penalises the distance between deep features and their corresponding class centres in the Euclidean space to achieve intra-class compactness. SphereFace assumes that the linear transformation matrix in the last fully connected layer can be used as a representation of the class centres in the angular space and therefore penalises the angles between deep features and their corresponding weights in a multiplicative way. Recently, a popular line of research is to incorporate margins in well-established loss functions in order to maximise face class separability. In this paper, we propose an Additive Angular Margin Loss (ArcFace) to obtain highly discriminative features for face recognition. The proposed ArcFace has a clear geometric interpretation due to its exact correspondence to geodesic distance on a hypersphere. We present arguably the most extensive experimental evaluation against all recent state-of-the-art face recognition methods on ten face recognition benchmarks which includes a new large-scale image database with trillions of pairs and a large-scale video dataset. We show that ArcFace consistently outperforms the state of the art and can be easily implemented with negligible computational overhead. To facilitate future research, the code has been made available.
This paper presents a simple method for “do as I do” motion transfer: given a source video of a person dancing, it is shown that it can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves.
This work presents a generic image-to-image translation framework, pixel2style2pixel (pSp), based on a novel encoder network that directly generates a series of style vectors which are fed into a pretrained StyleGAN generator, forming the extended latent space.
Faces Learned with an Articulated Model and Expressions is low-dimensional but more expressive than the FaceWarehouse model and the Basel Face Model and is compared to these models by fitting them to static 3D scans and 4D sequences using the same optimization method.
A novel GAN conditioning scheme based on Action Units (AU) annotations, which describes in a continuous manifold the anatomical facial movements defining a human expression, and proposes a fully unsupervised strategy to train the model, that only requires images annotated with their activated AUs.
This work proposes a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs, and finds that the latent code of well-trained generative models actually learns a disentangled representation after linear transformations.
A multi-branch translation model architecture and a two-stage training process is proposed for generating fine-grained cartoon faces for various groups, such as women, men, kids and the elderly.
This work proposes a novel attributes encoder for extracting multi-level target face attributes, and a new generator with carefully designed Adaptive Attentional Denormalization layers to adaptively integrate the identity and the attributes for face synthesis.
This paper presents Arc2Face, an identity-conditioned face foundation model, which, given the ArcFace embedding of a person, can generate diverse photo-realistic images with an unparalleled degree of face similarity than existing models. Despite previous attempts to decode face recognition features into detailed images, we find that common high-resolution datasets (e.g. FFHQ) lack sufficient identities to reconstruct any subject. To that end, we meticulously upsample a significant portion of the WebFace42M database, the largest public dataset for face recognition (FR). Arc2Face builds upon a pretrained Stable Diffusion model, yet adapts it to the task of ID-to-face generation, conditioned solely on ID vectors. Deviating from recent works that combine ID with text embeddings for zero-shot personalization of text-to-image models, we emphasize on the compactness of FR features, which can fully capture the essence of the human face, as opposed to hand-crafted prompts. Crucially, text-augmented models struggle to decouple identity and text, usually necessitating some description of the given face to achieve satisfactory similarity. Arc2Face, however, only needs the discriminative features of ArcFace to guide the generation, offering a robust prior for a plethora of tasks where ID consistency is of paramount importance. As an example, we train a FR model on synthetic images from our model and achieve superior performance to existing synthetic datasets.
Adding a benchmark result helps the community track progress.