This work motivates the development of a novel emotional speech database (ESD) that addresses a growing research need. The database is now available to the research community.
Authors
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li