3260 papers • 126 benchmarks • 313 datasets
Speech Recognition is the task of converting spoken language into text. It involves recognizing the words spoken in an audio recording and transcribing them into a written format. The goal is to accurately transcribe the speech in real-time or from recorded audio, taking into account factors such as accents, speaking speed, and background noise. ( Image credit: SpecAugment )
(Image credit: Papersgraph)
These leaderboards are used to track progress in speech-recognition-11
Use these libraries to find speech-recognition-11 models and implementations
This work presents SpecAugment, a simple data augmentation method for speech recognition that is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients) and achieves state-of-the-art performance on the LibriSpeech 960h and Swichboard 300h tasks, outperforming all prior work.
This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
Deep Speech, a state-of-the-art speech recognition system developed using end-to-end deep learning, outperforms previously published results on the widely studied Switchboard Hub5'00, achieving 16.0% error on the full test set.
This work proposes the convolution-augmented transformer for speech recognition, named Conformer, which significantly outperforms the previous Transformer and CNN based models achieving state-of-the-art accuracies.
The machine learning architecture of the Snips Voice Platform is presented, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices that is fast and accurate while enforcing privacy by design, as no personal user data is ever collected.
It is shown for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler.
Adding a benchmark result helps the community track progress.