Text-Independent Speaker Recognition

3260 papers • 126 benchmarks • 313 datasets

Benchmarks

These leaderboards are used to track progress in text-independent speaker recognition.

No benchmarks available.

Libraries

Use these libraries to find text-independent speaker recognition models and implementations.

Datasets

No datasets available.

Subtasks

No subtasks available.

Most implemented papers

Unified Hypersphere Embedding for Speaker Recognition

Dengxin Dai, Mahdi Hajibabaei • Sat Jul 21 2018

Experimental results suggest that simple repetition and random time-reversion of utterances can reduce prediction errors by up to 18%, and that the proposed logistic margin loss function yields unified embeddings with state-of-the-art identification accuracy and competitive verification accuracy.

Frame-Level Speaker Embeddings for Text-Independent Speaker Recognition and Analysis of End-to-End Model

Suwon Shon, James R. Glass, Hao Tang • Tue Sep 11 2018

A convolutional neural network (CNN)-based speaker recognition model is proposed for extracting robust speaker embeddings. Analysis of the model finds that the networks are better at discriminating broad phonetic classes than individual phonemes.

Three-Dimensional Lip Motion Network for Text-Independent Speaker Recognition

Jianrong Wang, Tong Wu, Shanyu Wang, Mei Yu, Qiang Fang, Ju Zhang, Li Liu • Mon Oct 12 2020

A novel end-to-end 3D lip motion network (3LMNet) is presented that uses sentence-level 3D lip motion (S3DLM) to recognize speakers in both text-independent and text-dependent contexts.

Masked Proxy Loss for Text-Independent Speaker Verification

B. Raj, Jiachen Lian, Aiswarya Vinod Kumar, Hira Dhamyal, Rita Singh • Sun Nov 08 2020

A Masked Proxy (MP) loss, which directly incorporates both proxy-based and pair-based relationships, is proposed to leverage the hardness of speaker pairs, and a state-of-the-art Equal Error Rate (EER) is reported.

SpeechNAS: Towards Better Trade-Off Between Latency and Accuracy for Large-Scale Speaker Verification

Ji Liu, Feng Deng, Xiaorui Wang, Wentao Zhu, Tianlong Kong, Shun Lu, Jixiang Li, Dawei Zhang, Sen Yang • Fri Sep 17 2021

The best derived neural network achieves an equal error rate (EER) of 1.02% on the standard test set of VoxCeleb1, surpassing previous TDNN-based state-of-the-art approaches by a large margin.
