3260 papers • 126 benchmarks • 313 datasets
Face sketch synthesis is the task of generating a sketch from an input face photo. (Image credit: High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks)
These leaderboards are used to track progress in Face Sketch Synthesis.
Use these libraries to find Face Sketch Synthesis models and implementations.
No subtasks available.
In this work, we propose TediGAN, a novel framework for multi-modal image generation and manipulation with textual descriptions. The proposed method consists of three components: a StyleGAN inversion module, visual-linguistic similarity learning, and instance-level optimization. The inversion module maps real images to the latent space of a well-trained StyleGAN. The visual-linguistic similarity module learns text-image matching by mapping the image and text into a common embedding space. The instance-level optimization is for identity preservation in manipulation. Our model can produce diverse and high-quality images at an unprecedented resolution of 1024². Using a control mechanism based on style mixing, our TediGAN inherently supports image synthesis with multi-modal inputs, such as sketches or semantic labels, with or without instance guidance. To facilitate text-guided multi-modal synthesis, we propose Multi-Modal CelebA-HQ, a large-scale dataset consisting of real face images and corresponding semantic segmentation maps, sketches, and textual descriptions. Extensive experiments on the introduced dataset demonstrate the superior performance of our proposed method. Code and data are available at https://github.com/weihaox/TediGAN.
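The three components can be pictured with a short, hedged sketch: invert a photo to a latent code, then optimize that code so the generated image matches a caption in a joint embedding space while staying close to the original latent. The `generator`, `image_encoder`, and `text_encoder` below are tiny placeholder networks standing in for the pretrained StyleGAN and encoders, not the released TediGAN models.

```python
# Minimal sketch of TediGAN-style text-guided latent optimization.
# All networks here are hypothetical stand-ins, not the released models.
import torch
import torch.nn.functional as F

latent_dim, embed_dim = 512, 256

# Placeholder networks (real TediGAN uses a pretrained StyleGAN generator,
# an inversion encoder, and a text encoder trained for visual-linguistic similarity).
generator = torch.nn.Linear(latent_dim, 3 * 64 * 64)
image_encoder = torch.nn.Linear(3 * 64 * 64, embed_dim)
text_encoder = torch.nn.Linear(300, embed_dim)


def edit_with_text(w_init, text_feat, steps=100, lr=0.01, id_weight=1.0):
    """Instance-level optimization: refine the inverted latent so the generated
    image matches the text in the joint embedding space, while staying close
    to the original latent for identity preservation."""
    w = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    t = F.normalize(text_encoder(text_feat), dim=-1)
    for _ in range(steps):
        img = generator(w)
        v = F.normalize(image_encoder(img), dim=-1)
        sim_loss = 1.0 - (v * t).sum(dim=-1).mean()   # visual-linguistic similarity
        id_loss = F.mse_loss(w, w_init)               # stay near the inverted latent
        loss = sim_loss + id_weight * id_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()


# Usage: invert a real photo to w_init (random here as a stand-in), then edit.
w_init = torch.randn(1, latent_dim)
text_feat = torch.randn(1, 300)  # stand-in for an encoded caption
w_edited = edit_with_text(w_init, text_feat)
```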
A novel synthesis framework, Photo-Sketch Synthesis using Multi-Adversarial Networks (PS2-MAN), that iteratively generates images from low to high resolution in an adversarial way, leveraging paired information within the CycleGAN framework.
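A minimal sketch of the multi-scale adversarial idea, assuming toy convolutional stages and per-scale discriminators (the real PS2-MAN additionally uses the CycleGAN framework with paired photo-sketch supervision):

```python
# Hypothetical toy modules illustrating multi-scale adversarial supervision:
# the generator emits sketches at several resolutions and each output is
# scored by its own discriminator.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleGenerator(nn.Module):
    """Emits sketch outputs at 64, 128, and 256 px; every intermediate output
    receives its own adversarial loss, refining detail stage by stage."""
    def __init__(self):
        super().__init__()
        self.stages = nn.ModuleList(nn.Conv2d(3, 3, 3, padding=1) for _ in range(3))

    def forward(self, photo):
        outs, x = [], photo
        for size, stage in zip((64, 128, 256), self.stages):
            x = stage(F.interpolate(x, size=size))
            outs.append(x)
        return outs


# One toy discriminator per scale (a single conv followed by global pooling).
discs = nn.ModuleList(nn.Conv2d(3, 1, 3, padding=1) for _ in range(3))

gen = MultiScaleGenerator()
photo = torch.randn(1, 3, 256, 256)
adv_loss = sum(
    F.binary_cross_entropy_with_logits(d(s).mean(dim=(2, 3)), torch.ones(1, 1))
    for d, s in zip(discs, gen(photo))
)
```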
A semi-supervised deep learning architecture is proposed which extends face sketch synthesis to handle face photos in the wild by exploiting additional face photos in training, performing patch matching in feature space between the input photo and photos in a small reference set of photo-sketch pairs.
The results suggest that “spatial structure” and “co-occurrence texture” are two generally applicable perceptual features in face sketch synthesis; the work also introduces the first large-scale human-perception-based sketch database for evaluating how well a metric agrees with human perception.
A novel framework based on deep neural networks for face sketch synthesis from a photo that outperforms other state-of-the-art methods and also generalizes well to different test images.
A novel method, termed sRender, for learning face sketch synthesis models from unpaired data; it can generate multi-style sketches and significantly outperforms existing unpaired image-to-image translation methods.
Generating photos satisfying multiple constraints finds broad utility in the content creation industry. A key hurdle to accomplishing this task is the need for paired data consisting of all modalities (i.e., constraints) and their corresponding output. Moreover, existing methods need retraining using paired data across all modalities to introduce a new condition. This paper proposes a solution to this problem based on denoising diffusion probabilistic models (DDPMs). Our motivation for choosing diffusion models over other generative models comes from the flexible internal structure of diffusion models. Since each sampling step in the DDPM follows a Gaussian distribution, we show that there exists a closed-form solution for generating an image given various constraints. Our method can unite multiple diffusion models trained on multiple sub-tasks and conquer the combined task through our proposed sampling strategy. We also introduce a novel reliability parameter that allows using different off-the-shelf diffusion models trained across various datasets during sampling time alone to guide it to the desired outcome satisfying multiple constraints. We perform experiments on various standard multimodal tasks to demonstrate the effectiveness of our approach. More details can be found at: https://nithin-gk.github.io/projectpages/Multidiff
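The sampling idea can be illustrated with a hedged sketch: since each reverse step is Gaussian, noise predictions from several independently trained, conditioned DDPMs can be blended with reliability weights inside one standard DDPM update. The denoisers and noise schedule below are stand-ins, and the weighted sum is a simplification of the paper's closed-form combination:

```python
# Hedged sketch of fusing several pre-trained DDPMs at sampling time.
# `model_a`, `model_b`, and the linear beta schedule are illustrative stand-ins.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)


def fused_ddpm_step(x_t, t, models, conds, weights):
    """One reverse DDPM step where each model predicts noise under its own
    condition and the predictions are combined with reliability weights."""
    eps = sum(w * m(x_t, t, c) for m, c, w in zip(models, conds, weights))
    mean = (x_t - (betas[t] / torch.sqrt(1.0 - alpha_bars[t])) * eps) / torch.sqrt(alphas[t])
    if t == 0:
        return mean
    return mean + torch.sqrt(betas[t]) * torch.randn_like(x_t)


# Usage with toy stand-in denoisers (real models would be trained U-Nets,
# e.g. one conditioned on a sketch and one on a text embedding).
model_a = lambda x, t, c: torch.zeros_like(x)
model_b = lambda x, t, c: torch.zeros_like(x)
x = torch.randn(1, 3, 32, 32)
for t in reversed(range(T)):
    x = fused_ddpm_step(x, t, [model_a, model_b], [None, None], [0.6, 0.4])
```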
It is shown that StyleSketch outperforms existing state-of-the-art sketch extraction methods and few-shot image adaptation methods for the task of extracting high-resolution abstract face sketches; its use is further extended to other domains, and the possibility of semantic editing is explored.