This page tracks progress in text-to-face generation: synthesizing face images from natural-language descriptions.
This paper generates captions for images in the CelebA dataset with an algorithm that automatically converts a list of facial attributes into a set of captions, and models the highly multimodal problem of text-to-face generation as learning the conditional distribution of faces in the same latent space.
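As a rough illustration of the attribute-to-caption idea, the sketch below maps CelebA-style binary attributes to a templated caption. The template and the choice of attributes are hypothetical, not the paper's actual conversion rules:

```python
# Hypothetical sketch: turning CelebA-style binary attributes into a caption.
# "Male", "Smiling", "Eyeglasses", "Black_Hair" are real CelebA attribute names;
# the template itself is illustrative only.

def attributes_to_caption(attributes):
    """Convert a dict of binary CelebA attributes into one natural-language caption."""
    parts = []
    subject = "a man" if attributes.get("Male") else "a woman"
    if attributes.get("Smiling"):
        parts.append("smiling")
    if attributes.get("Eyeglasses"):
        parts.append("wearing eyeglasses")
    if attributes.get("Black_Hair"):
        parts.append("with black hair")
    description = " ".join(parts) if parts else "with a neutral look"
    return f"A photo of {subject}, {description}."

print(attributes_to_caption({"Male": 0, "Smiling": 1, "Black_Hair": 1}))
# -> "A photo of a woman, smiling with black hair."
```

In practice such templates can be randomized and combined to produce a set of distinct captions per image rather than a single one.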
ParaLip, a parallel decoding model for fast and high-fidelity text-to-lip generation, is proposed: it predicts the duration of the encoded linguistic features, models the target lip frames conditioned on those features and their durations in a non-autoregressive manner, and incorporates a structural similarity index (SSIM) loss and adversarial learning.
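A minimal PyTorch-style sketch of the non-autoregressive pattern described here: predict a duration for each encoded linguistic token, expand the features to frame rate, and decode all lip frames in one parallel pass. Module names, sizes, and the length-regulation rule are assumptions for illustration, not ParaLip's actual architecture:

```python
import torch
import torch.nn as nn

class NonAutoregressiveLipDecoder(nn.Module):
    """Illustrative duration-conditioned parallel decoder (sizes and modules
    are assumptions, not ParaLip's actual architecture)."""

    def __init__(self, feat_dim=256, lip_dim=80):
        super().__init__()
        self.duration_predictor = nn.Linear(feat_dim, 1)   # log-frames per token
        self.decoder = nn.Sequential(                      # decodes all frames at once
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, lip_dim),
        )

    def forward(self, enc):                                # enc: (num_tokens, feat_dim)
        log_durations = self.duration_predictor(enc).squeeze(-1)
        frames_per_token = log_durations.exp().round().clamp(min=1).long()
        # Length-regulate: repeat each token's features for its predicted duration.
        expanded = enc.repeat_interleave(frames_per_token, dim=0)
        return self.decoder(expanded)                      # (total_frames, lip_dim)

model = NonAutoregressiveLipDecoder()
lip_frames = model(torch.randn(12, 256))  # all frames produced in one parallel pass
```

The key property is that no frame depends on previously generated frames, which is what makes decoding parallel and fast compared with autoregressive alternatives.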
This Perspective highlights emerging positive use cases of AI-generated characters, specifically in supporting learning and well-being, and demonstrates an easy-to-use AI character generation pipeline to enable such outcomes.
Generating photos satisfying multiple constraints finds broad utility in the content creation industry. A key hurdle to accomplishing this task is the need for paired data consisting of all modalities (i.e., constraints) and their corresponding output. Moreover, existing methods need retraining using paired data across all modalities to introduce a new condition. This paper proposes a solution to this problem based on denoising diffusion probabilistic models (DDPMs). Our motivation for choosing diffusion models over other generative models comes from their flexible internal structure. Since each sampling step in the DDPM follows a Gaussian distribution, we show that there exists a closed-form solution for generating an image given various constraints. Our method can unite multiple diffusion models trained on multiple sub-tasks and conquer the combined task through our proposed sampling strategy. We also introduce a novel reliability parameter that allows using different off-the-shelf diffusion models, trained across various datasets, at sampling time alone to guide generation toward the desired outcome satisfying multiple constraints. We perform experiments on various standard multimodal tasks to demonstrate the effectiveness of our approach. More details can be found at: https://nithin-gk.github.io/projectpages/Multidiff
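The sampling strategy lends itself to a short sketch: at each reverse-diffusion step, noise predictions from several pre-trained models (one per constraint) are fused using per-model reliability weights before the standard DDPM update. The simple weighted average below and all names are assumptions for illustration; the paper derives the exact closed-form combination:

```python
import torch

def fused_ddpm_step(x_t, t, models, conditions, reliabilities, alphas, alphas_bar):
    """One illustrative DDPM denoising step fusing several models' noise
    predictions with reliability weights (a sketch, not the paper's exact rule).
    alphas / alphas_bar are 1-D tensors holding the noise schedule."""
    # Each model predicts the noise for its own constraint/modality.
    eps_preds = [m(x_t, t, c) for m, c in zip(models, conditions)]
    w = torch.tensor(reliabilities, dtype=x_t.dtype)
    w = w / w.sum()                                 # normalize reliability weights
    eps = sum(wi * e for wi, e in zip(w, eps_preds))
    # Standard DDPM posterior mean, using the fused noise estimate.
    a_t, ab_t = alphas[t], alphas_bar[t]
    mean = (x_t - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mean + torch.sqrt(1 - a_t) * noise
```

Because the fusion happens purely at sampling time, a new constraint can be added by plugging in another off-the-shelf model with its own reliability weight, with no retraining.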