Given a video of a person speaking in a source language, generate a video of the same person speaking in a target language.
(Image credit: Papersgraph)
The proposed method performs pixel alignment rather than eye alignment by mapping the geometry of each face onto a reference face while keeping its own texture, and shows a clear improvement over eye-aligned recognition.
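The geometry-mapping step above can be sketched as a least-squares transform that moves a face's landmarks onto a reference face's landmarks while the texture is carried along unchanged. This is a minimal illustration with toy landmark coordinates, not the paper's actual alignment procedure; all function names here are hypothetical.

```python
import numpy as np

def fit_affine(src_pts, ref_pts):
    """Least-squares 2-D affine transform mapping source landmark
    coordinates onto reference-face landmark coordinates."""
    n = src_pts.shape[0]
    # Design matrix [x, y, 1]; solve for the 3x2 affine parameters.
    A = np.hstack([src_pts, np.ones((n, 1))])
    M, *_ = np.linalg.lstsq(A, ref_pts, rcond=None)
    return M.T  # 2x3 affine matrix

def warp_points(pts, M):
    """Apply the affine transform to landmark points."""
    A = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return A @ M.T

# Toy landmarks: the source face is a scaled, shifted copy of the reference.
ref = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
src = ref * 2.0 + np.array([5.0, 3.0])

M = fit_affine(src, ref)
aligned = warp_points(src, M)
print(np.allclose(aligned, ref))  # True: geometry now matches the reference
```

In a real pipeline the same transform would be used to warp the face image itself, so every face shares the reference geometry ("pixel alignment") while retaining its own texture.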
This work builds a working speech-to-speech translation system by bringing together multiple existing speech and language modules and adds a novel visual module, LipGAN, which generates realistic talking faces from the translated audio.
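The cascaded design described above can be sketched as a composition of modules: speech recognition, machine translation, speech synthesis, and finally a LipGAN-style lip-sync stage driven by the translated audio. The stubs below are stand-ins so the sketch runs; none of them are the paper's actual APIs.

```python
# Hypothetical stand-in modules; a real system would plug in trained
# ASR, MT, TTS, and lip-sync models at each stage.

def recognize(audio):
    # ASR stand-in: decode source-language speech to text.
    return "hello"

def translate(text, target_lang):
    # MT stand-in: map text into the target language.
    return f"[{target_lang}] {text}"

def synthesize(text):
    # TTS stand-in: render translated text as speech.
    return f"audio({text})"

def lip_sync(face_video, audio):
    # LipGAN-style stand-in: generate a talking face from the new audio.
    return f"video({audio})"

def face_to_face_translate(video, audio, target_lang):
    """Cascade the four modules into one face-to-face translation call."""
    text = recognize(audio)
    translated = translate(text, target_lang)
    new_audio = synthesize(translated)
    return lip_sync(video, new_audio)

out = face_to_face_translate("speaker.mp4", "speaker.wav", "hi")
print(out)  # video(audio([hi] hello))
```

The point of the sketch is the interface: each stage consumes the previous stage's output, so any individual module can be swapped for a stronger one without changing the rest of the pipeline.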
This work introduces a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to the target domain, i.e., if the content of John Oliver's speech were transferred to Stephen Colbert, the generated content/speech should be in Stephen Colbert's style.
This work incorporates a triple consistency loss into the training of a new landmark-guided face-to-face synthesis network where, unlike in previous works, the generated images can simultaneously undergo large changes in both expression and pose.
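A triple consistency loss of this kind penalizes disagreement between generating the target pose directly and generating it via an intermediate pose: G(x, l_target) should match G(G(x, l_mid), l_target). The sketch below uses a toy linear "generator" purely so the computation runs; the real generator is a trained network, and the specific formulation here is an assumption for illustration.

```python
import numpy as np

def generator(image, landmarks):
    # Toy stand-in for a trained landmark-conditioned generator:
    # a real model would synthesize the face at the given landmarks.
    return 0.5 * image + 0.1 * landmarks

def triple_consistency_loss(image, lm_mid, lm_tgt):
    """L1 gap between generating the target pose directly and
    generating it through an intermediate landmark configuration."""
    direct = generator(image, lm_tgt)
    via_mid = generator(generator(image, lm_mid), lm_tgt)
    return float(np.mean(np.abs(direct - via_mid)))

x = np.ones((4, 4))                      # toy "image"
lm_mid = np.zeros((4, 4))                # intermediate pose landmarks
lm_tgt = np.full((4, 4), 0.5)            # target pose landmarks
loss = triple_consistency_loss(x, lm_mid, lm_tgt)
print(loss)  # 0.25 for this toy generator
```

Minimizing this term pushes the generator to treat pose changes as composable, which is what lets a single pass handle large joint changes in expression and pose.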