PromptSpeech

Introduced in PromptTTS: Controllable Text-to-Speech with Text Descriptions (2022)

About this Dataset

PromptSpeech is a dataset that consists of speech and the corresponding prompts. We synthesize speech with 5 different style factors (gender, pitch, speaking speed, volume, and emotion) from a commercial TTS API. The emotion factor has 5 categories and the gender factor has 2 categories.

Source: PromptTTS: Controllable Text-to-Speech with Text Descriptions
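
As a rough illustration of the description above, one style annotation could be modeled as a small record. This is only a sketch: the class name `StyleLabel`, the field names, and the use of integer category indices are assumptions made for illustration, not the dataset's actual schema. Only the category counts for gender (2) and emotion (5) come from the description; the value sets for pitch, speaking speed, and volume are not stated there, so they are left as free-form strings.

```python
from dataclasses import dataclass

# Category counts stated in the dataset description.
NUM_GENDER_CATEGORIES = 2
NUM_EMOTION_CATEGORIES = 5


@dataclass(frozen=True)
class StyleLabel:
    """Hypothetical record for the 5 style factors of one utterance."""

    gender: int   # category index in [0, NUM_GENDER_CATEGORIES)
    emotion: int  # category index in [0, NUM_EMOTION_CATEGORIES)
    pitch: str    # value set not given in the description (assumption: free text)
    speed: str    # speaking speed; same caveat as pitch
    volume: str   # same caveat as pitch

    def __post_init__(self):
        # Reject indices outside the documented category counts.
        if not 0 <= self.gender < NUM_GENDER_CATEGORIES:
            raise ValueError(f"gender index out of range: {self.gender}")
        if not 0 <= self.emotion < NUM_EMOTION_CATEGORIES:
            raise ValueError(f"emotion index out of range: {self.emotion}")


def gender_emotion_combinations():
    """Enumerate every (gender, emotion) index pair."""
    return [
        (g, e)
        for g in range(NUM_GENDER_CATEGORIES)
        for e in range(NUM_EMOTION_CATEGORIES)
    ]
```

With 2 gender categories and 5 emotion categories, `gender_emotion_combinations()` yields 10 pairs before the remaining three factors are taken into account.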

Image Source: https://arxiv.org/pdf/2211.12171v1.pdf

Dataset Variants

PromptSpeech

Papers (1)

PromptTTS: Controllable Text-to-Speech with Text Descriptions

PromptTTS is a text-to-speech (TTS) system that takes a prompt containing both style and content descriptions as input and synthesizes the corresponding speech. Experiments show that PromptTTS can generate speech with precise style control and high speech quality.

Tasks

Speech Synthesis

Similar Datasets

TIMIT

WHAMR!

VoxForge

Statistics

Papers: 1

Tasks: 2

Introduced: 2022

Modalities: Speech