Given a video of an arbitrary person and an arbitrary driving speech clip, the task is to generate a lip-synced video in which the person's lip movements match the given speech. The approach must not be constrained by the speaker's identity, voice, or language.
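The input/output contract is simple even though the modeling is hard. Below is a minimal, runnable sketch of that contract on dummy tensors; the function name, all shapes, and the pass-through body are illustrative assumptions, not any particular model's specification.

```python
# Sketch of the task's I/O contract on dummy data (shapes are assumptions):
# a sequence of face frames plus a mel-spectrogram of the driving speech in,
# a lip-synced frame sequence of the same shape out.
import torch

def lip_sync(frames: torch.Tensor, mel: torch.Tensor) -> torch.Tensor:
    """Hypothetical entry point: a real model conditions the output frames
    on both identity (frames) and speech content (mel)."""
    assert frames.ndim == 4   # (T, 3, H, W): T RGB video frames
    assert mel.ndim == 2      # (T_mel, 80): mel-spectrogram of the speech
    return frames             # identity pass-through stands in for a model

frames = torch.rand(75, 3, 96, 96)  # ~3 s of video at 25 fps, 96x96 face crops
mel = torch.rand(240, 80)           # ~3 s of audio as an 80-bin mel-spectrogram
print(lip_sync(frames, mel).shape)  # torch.Size([75, 3, 96, 96])
```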
This work builds a working speech-to-speech translation system by bringing together multiple existing modules from speech and language processing, and incorporates a novel visual module, LipGAN, which generates realistic talking faces from the translated audio.
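A hedged sketch of how such a pipeline composes is below; every stage is a placeholder stub (the function names `asr`, `translate`, `tts`, and `lipgan` are assumptions, not the paper's APIs), kept trivial so the script runs end to end.

```python
# Hedged sketch of a face-to-face translation pipeline: speech recognition,
# translation, and synthesis feed a LipGAN-style visual module that
# re-renders the face. All stages are stubs on dummy data.

def asr(speech_wav: str) -> str:
    return "hello world"                   # stub: automatic speech recognition

def translate(text: str, tgt_lang: str) -> str:
    return f"[{tgt_lang}] {text}"          # stub: machine translation

def tts(text: str) -> bytes:
    return text.encode()                   # stub: text-to-speech synthesis

def lipgan(face_video: str, audio: bytes) -> str:
    return face_video.replace(".mp4", ".dubbed.mp4")  # stub: talking-face generation

def face_to_face_translation(face_video: str, speech_wav: str, tgt_lang: str) -> str:
    text = asr(speech_wav)                   # 1. transcribe the source speech
    translated = translate(text, tgt_lang)   # 2. translate the transcript
    audio = tts(translated)                  # 3. synthesise target-language speech
    return lipgan(face_video, audio)         # 4. lip-sync the face to the new audio

print(face_to_face_translation("talk.mp4", "talk.wav", "hi"))  # talk.dubbed.mp4
```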
An encoder–decoder convolutional neural network is developed that uses a joint embedding of the face and audio to generate synthesised talking-face video frames; methods are also proposed to re-dub videos by visually blending the generated face into the source video frame using a multi-stream CNN model.
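To make the joint-embedding idea concrete, here is a small PyTorch sketch, not the paper's architecture: a face encoder and an audio encoder produce embeddings that are concatenated and decoded into a frame patch. All layer sizes and the output resolution are illustrative assumptions.

```python
# Toy encoder-decoder with a joint face+audio embedding (sizes are arbitrary).
import torch
import torch.nn as nn

class JointEmbedGenerator(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        self.face_enc = nn.Sequential(            # encodes a 96x96 face crop
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )
        self.audio_enc = nn.Sequential(           # encodes a mel-spectrogram chunk
            nn.Flatten(), nn.Linear(16 * 80, dim), nn.ReLU(),
        )
        self.decoder = nn.Sequential(             # decodes the joint embedding
            nn.Linear(2 * dim, 64 * 6 * 6), nn.ReLU(),
            nn.Unflatten(1, (64, 6, 6)),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Sigmoid(),
        )

    def forward(self, face, mel):
        # Concatenate the two embeddings into one joint code, then decode.
        z = torch.cat([self.face_enc(face), self.audio_enc(mel)], dim=1)
        return self.decoder(z)                    # (B, 3, 24, 24) frame patch

gen = JointEmbedGenerator()
frame = gen(torch.rand(2, 3, 96, 96), torch.rand(2, 16, 80))
print(frame.shape)  # torch.Size([2, 3, 24, 24])
```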
The proposed framework, MARLIN, is a facial-video masked autoencoder that learns highly robust, generic facial embeddings from abundantly available, non-annotated, web-crawled facial videos; these embeddings transfer across a variety of facial analysis tasks, such as Facial Attribute Recognition (FAR), Facial Expression Recognition, DeepFake Detection, and Lip Synchronization.
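The pretraining objective can be sketched generically. The snippet below is a minimal masked-autoencoder step on random "patch" tensors: most patches are masked, only the visible ones are encoded, and the reconstruction loss is taken on the masked positions. The masking ratio, layer sizes, and the omission of positional embeddings are simplifying assumptions, not MARLIN's actual design.

```python
# Generic masked-autoencoder step: encode visible patches, reconstruct the rest.
import torch
import torch.nn as nn
import torch.nn.functional as F

B, N, patch_dim, dim, mask_ratio = 2, 196, 768, 256, 0.9

embed = nn.Linear(patch_dim, dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), 2)
decoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), 1)
to_pixels = nn.Linear(dim, patch_dim)
mask_token = nn.Parameter(torch.zeros(1, 1, dim))

patches = torch.rand(B, N, patch_dim)        # flattened facial-video patches
perm = torch.randperm(N)                     # random patch order
num_keep = int(N * (1 - mask_ratio))
keep, masked = perm[:num_keep], perm[num_keep:]

latent = encoder(embed(patches[:, keep]))    # encode visible patches only
full = mask_token.expand(B, N, dim).clone()  # start from all mask tokens
full[:, keep] = latent                       # scatter visible latents back
recon = to_pixels(decoder(full))             # reconstruct every patch
loss = F.mse_loss(recon[:, masked], patches[:, masked])  # loss on masked ones
print(loss.item())
```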
This work investigates the problem of lip-syncing a talking face video of an arbitrary identity to match a target speech segment; it identifies key reasons why existing models fail to produce accurate lip-sync on such unconstrained videos and resolves them by learning from a powerful lip-sync discriminator.
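The core training signal can be sketched generically: a frozen sync "expert" scores how well generated mouth frames match the driving audio, and the generator is penalised via a BCE loss on that score. The toy networks below are stand-ins under assumed shapes, not the paper's actual architectures.

```python
# Illustrative sync loss from a frozen lip-sync "expert" (toy networks).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SyncExpert(nn.Module):
    """Toy SyncNet-style scorer: cosine similarity of video/audio embeddings."""
    def __init__(self, dim=128):
        super().__init__()
        self.video = nn.Linear(5 * 3 * 48 * 96, dim)  # 5 mouth-region frames
        self.audio = nn.Linear(16 * 80, dim)          # matching mel chunk
    def forward(self, frames, mel):
        v = F.normalize(self.video(frames.flatten(1)), dim=1)
        a = F.normalize(self.audio(mel.flatten(1)), dim=1)
        # Cosine similarity, clamped into (0, 1) so BCE treats it as p(in-sync).
        return (v * a).sum(dim=1).clamp(1e-6, 1 - 1e-6)

expert = SyncExpert().eval()
for p in expert.parameters():
    p.requires_grad_(False)                 # the expert stays frozen

gen_frames = torch.rand(4, 5, 3, 48, 96, requires_grad=True)  # generator output
mel = torch.rand(4, 16, 80)                                   # driving audio

p_sync = expert(gen_frames, mel)
sync_loss = F.binary_cross_entropy(p_sync, torch.ones_like(p_sync))
sync_loss.backward()                        # gradient flows back to the generator
print(float(sync_loss))
```

Freezing the expert is what makes it a useful teacher: its judgment of sync quality cannot be gamed by the generator during training.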