In this paper, we present LJ-TTS, a large-scale single-speaker dataset of real and synthetic speech designed to support research in text-to-speech (TTS) synthesis and analysis. The dataset builds upon high-quality recordings of a single English speaker, alongside outputs generated by 11 state-of-the-art TTS models, including both autoregressive and non-autoregressive architectures. By maintaining a controlled single-speaker setting, LJ-TTS enables precise comparison of speech characteristics across different generative models, isolating the effects of synthesis methods from speaker variability. Unlike multi-speaker datasets lacking alignment between real and synthetic samples, LJ-TTS provides exact utterance-level correspondence, allowing fine-grained analyses that are otherwise impractical. The dataset supports systematic evaluation of synthetic speech across multiple dimensions, including deepfake detection, source tracing, and phoneme-level analyses. LJ-TTS provides a standardized resource for benchmarking generative models, assessing the limits of current TTS systems, and developing robust detection and evaluation methods. The dataset is publicly available to the research community to foster reproducible and controlled studies in speech synthesis and synthetic speech detection.
Viola Negroni, Davide Salvi, T. Wani, Madleen Uecker, Irene Amerini