Music transcription is the task of converting an acoustic musical signal into some form of music notation. (Image credit: ISMIR 2015 Tutorial - Automatic Music Transcription)
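To make the task concrete, here is a minimal, hedged sketch of monophonic transcription: estimate a frame-level pitch track from audio and map it to note names. The file path `example.wav` is a placeholder, and real AMT systems additionally handle polyphony, onsets/offsets, and symbolic output formats.

```python
# Minimal monophonic sketch: pitch track -> note names per frame.
import librosa
import numpy as np

y, sr = librosa.load("example.wav", sr=None)  # hypothetical input file
f0, voiced_flag, _ = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)
notes = [
    librosa.hz_to_note(f) if v and not np.isnan(f) else "rest"
    for f, v in zip(f0, voiced_flag)
]
print(notes[:20])  # frame-level note labels, e.g. ['A4', 'A4', 'rest', ...]
```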
These leaderboards are used to track progress in music transcription.
Use these libraries to find music transcription models and implementations.
This work relies on complex convolutions, presents algorithms for complex batch normalization and complex weight initialization strategies for complex-valued neural nets, uses them in experiments with end-to-end training schemes, and demonstrates that such complex-valued models are competitive with their real-valued counterparts.
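A minimal sketch of the core building block, a complex-valued convolution composed from two real-valued convolutions via (W_r + iW_i)(x_r + ix_i) = (W_r x_r − W_i x_i) + i(W_r x_i + W_i x_r). Layer and parameter names are illustrative assumptions, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

class ComplexConv1d(nn.Module):
    """Complex convolution realized as two real convolutions on (real, imag) parts."""
    def __init__(self, in_ch, out_ch, kernel_size, **kw):
        super().__init__()
        self.real = nn.Conv1d(in_ch, out_ch, kernel_size, **kw)
        self.imag = nn.Conv1d(in_ch, out_ch, kernel_size, **kw)

    def forward(self, x_real, x_imag):
        y_real = self.real(x_real) - self.imag(x_imag)
        y_imag = self.real(x_imag) + self.imag(x_real)
        return y_real, y_imag

# Example: apply to the real/imaginary parts of an STFT with 513 frequency bins.
conv = ComplexConv1d(513, 128, kernel_size=3, padding=1)
xr, xi = torch.randn(2, 513, 100), torch.randn(2, 513, 100)
yr, yi = conv(xr, xi)
```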
By using notes as an intermediate representation, a suite of models capable of transcribing, composing, and synthesizing audio waveforms with coherent musical structure on timescales spanning six orders of magnitude is trained, a process the authors call Wave2Midi2Wave.
A high-resolution AMT system trained by regressing precise onset and offset times of piano notes and pedal events is proposed, and it is shown that the system is robust to misaligned onset and offset labels compared to previous systems.
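A minimal sketch of the regression idea: instead of a binary onset/no-onset label per frame, frames near an onset receive a continuous target encoding their distance to the exact onset time. The triangular target shape and the hyperparameters here are illustrative assumptions.

```python
import numpy as np

def onset_regression_targets(onset_times, n_frames, hop_seconds, width=0.05):
    """Per-frame target in [0, 1]: 1.0 exactly at an onset, decaying linearly
    to 0 over +/- `width` seconds."""
    frame_times = np.arange(n_frames) * hop_seconds
    targets = np.zeros(n_frames)
    for t in onset_times:
        dist = np.abs(frame_times - t)
        targets = np.maximum(targets, np.clip(1.0 - dist / width, 0.0, 1.0))
    return targets

targets = onset_regression_targets([0.50, 1.23], n_frames=200, hop_seconds=0.01)
```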
A multi-label classification task to predict notes in musical recordings is defined, along with an evaluation protocol, and several machine learning architectures for this task are benchmarked.
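A minimal sketch of framing note prediction as multi-label classification: each audio frame is mapped to independent sigmoid outputs, one per pitch, trained with binary cross-entropy. The tiny MLP and the 128-pitch output range are illustrative assumptions, not the benchmarked architectures.

```python
import torch
import torch.nn as nn

n_bins, n_pitches = 229, 128  # e.g. mel bins in, MIDI pitches out
model = nn.Sequential(nn.Linear(n_bins, 512), nn.ReLU(), nn.Linear(512, n_pitches))
criterion = nn.BCEWithLogitsLoss()

frames = torch.randn(32, n_bins)                        # batch of spectrogram frames
labels = torch.randint(0, 2, (32, n_pitches)).float()   # active pitches per frame
loss = criterion(model(frames), labels)
loss.backward()
```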
This work builds and trains LSTM networks using approximately 23,000 music transcriptions expressed with a high-level vocabulary (ABC notation), and uses them to generate new transcriptions, creating music transcription models useful in particular contexts of music composition.
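A minimal sketch of a character-level LSTM over ABC-notation text, trained to predict the next character so that sampling from it yields new transcriptions. The toy corpus, vocabulary construction, and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

corpus = "X:1\nT:Example\nK:D\n|:DFA d2|..."   # placeholder ABC text
vocab = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(vocab)}

class CharLSTM(nn.Module):
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 64)
        self.lstm = nn.LSTM(64, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

ids = torch.tensor([[stoi[c] for c in corpus]])
model = CharLSTM(len(vocab))
logits, _ = model(ids[:, :-1])  # predict the next character at every position
loss = nn.CrossEntropyLoss()(logits.reshape(-1, len(vocab)), ids[:, 1:].reshape(-1))
```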
The experiments show that adding the reconstruction loss generally improves note-level transcription accuracy compared to the same model without the reconstruction part, and boosts frame-level precision above that of state-of-the-art models.
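A minimal sketch of combining a transcription loss with a reconstruction term: the predicted piano roll is decoded back toward the input spectrogram, and both terms are summed. The stand-in linear modules and the loss weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn

transcriber = nn.Linear(229, 88)     # spectrogram frame -> piano-roll frame (stand-in)
reconstructor = nn.Linear(88, 229)   # piano-roll frame -> spectrogram frame (stand-in)

spec = torch.randn(32, 229)
roll = torch.randint(0, 2, (32, 88)).float()

pred_roll_logits = transcriber(spec)
transcription_loss = nn.BCEWithLogitsLoss()(pred_roll_logits, roll)
reconstruction_loss = nn.MSELoss()(reconstructor(torch.sigmoid(pred_roll_logits)), spec)
loss = transcription_loss + 0.1 * reconstruction_loss  # weighting is an assumption
```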
This paper presents a simple and lightweight variant of the Shuffle-Exchange network, which is based on a residual network employing GELU and Layer Normalization and achieves state-of-the-art performance on the MusicNet dataset for music transcription while being efficient in the number of parameters.
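A minimal sketch of a residual block built from the named ingredients, GELU and Layer Normalization. The exact block layout is an illustrative assumption, not the paper's Shuffle-Exchange architecture.

```python
import torch
import torch.nn as nn

class ResidualGELUBlock(nn.Module):
    """Pre-norm residual block: x + FFN(LayerNorm(x)) with a GELU nonlinearity."""
    def __init__(self, dim, hidden=None):
        super().__init__()
        hidden = hidden or 2 * dim
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return x + self.ff(self.norm(x))

x = torch.randn(8, 100, 256)          # (batch, time, channels)
y = ResidualGELUBlock(256)(x)
```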
It is shown that equivalent performance can be achieved using a generic encoder-decoder Transformer with standard decoding methods, and it is demonstrated that the model can learn to translate spectrogram inputs directly to MIDI-like output events for several transcription tasks.
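A minimal sketch of the "MIDI-like output events" target format: notes are flattened into a token sequence of time shifts, note-ons, and note-offs that a generic encoder-decoder Transformer can emit. The vocabulary layout and the 10 ms time resolution are illustrative assumptions.

```python
def notes_to_events(notes, time_step=0.01):
    """notes: list of (onset_sec, offset_sec, pitch). Returns MIDI-like event tokens."""
    boundaries = []
    for onset, offset, pitch in notes:
        boundaries.append((onset, f"note_on_{pitch}"))
        boundaries.append((offset, f"note_off_{pitch}"))
    events, current = [], 0.0
    for t, name in sorted(boundaries):
        shift = round((t - current) / time_step)
        if shift > 0:
            events.append(f"time_shift_{shift}")
            current += shift * time_step
        events.append(name)
    return events

print(notes_to_events([(0.0, 0.5, 60), (0.25, 0.75, 64)]))
# ['note_on_60', 'time_shift_25', 'note_on_64', 'time_shift_25', 'note_off_60', ...]
```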
This paper introduces a simple method for scoring intervals using scaled inner product operations that resemble how attention scoring is done in transformers, and demonstrates that an encoder-only structured non-hierarchical transformer backbone is capable of transcribing piano notes and pedals with high accuracy and time precision.
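A minimal sketch of scoring intervals with scaled inner products: each frame gets an "onset" and an "offset" embedding, and the score for the interval from frame i to frame j is their dot product divided by sqrt(d), mirroring attention scoring. The projections and dimensions are illustrative assumptions.

```python
import math
import torch
import torch.nn as nn

d, n_frames = 64, 100
frame_features = torch.randn(n_frames, 256)
onset_proj = nn.Linear(256, d)
offset_proj = nn.Linear(256, d)

onset = onset_proj(frame_features)         # (n_frames, d)
offset = offset_proj(frame_features)       # (n_frames, d)
scores = onset @ offset.T / math.sqrt(d)   # scores[i, j] rates the interval i..j
```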
This work demonstrates that a general-purpose Transformer model can perform multi-task AMT, jointly transcribing arbitrary combinations of musical instruments across several transcription datasets, and shows this unified training framework achieves high-quality transcription results across a range of datasets.