3260 papers • 126 benchmarks • 313 datasets
The results improve on baseline systems, confirm the influence of the producer effect on classification performance, and demonstrate the trade-offs between audio length and training-set size.
This work performs an error analysis of three recent singing voice detection systems and designs novel methods to test them on multiple sets of internally curated and generated data, exposing pitfalls that current datasets do not clearly reveal.
A deep learning architecture called a Convolutional Sequence-to-Sequence model is presented, both to move towards an end-to-end trainable OMR pipeline and to apply a learning process that trains on full sentences of sheet music instead of individually labeled symbols.
GiantMIDI-Piano is the largest classical piano MIDI dataset to date; the nationalities, numbers, and durations of composers' works, along with the chroma, interval, trichord, and tetrachord frequencies of six composers from different eras, are analyzed to demonstrate its use for musical analysis.
It is shown that ImageNet-pretrained standard deep CNN models can serve as strong baseline networks for audio classification, and gradient visualizations qualitatively illustrate what the CNNs learn from the spectrograms.
This work generates a low-dimensional representation from a short unit segment of audio, couples this fingerprint with a fast maximum inner-product search, and presents a contrastive learning framework derived from the segment-level search objective.
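The retrieval side of such a fingerprinting system can be sketched in a few lines: if the segment embeddings are L2-normalized, maximum inner product coincides with cosine similarity, so lookup is a single matrix-vector product. This is a minimal NumPy sketch under that assumption, not the paper's implementation; the function names and dimensions are illustrative.

```python
import numpy as np

def build_index(fingerprints):
    """L2-normalize stored segment fingerprints so that the inner
    product with a normalized query equals cosine similarity."""
    norms = np.linalg.norm(fingerprints, axis=1, keepdims=True)
    return fingerprints / np.clip(norms, 1e-9, None)

def search(index, query, top_k=3):
    """Return the top_k most similar stored segments for one query."""
    q = query / max(np.linalg.norm(query), 1e-9)
    scores = index @ q                    # one inner product per segment
    top = np.argsort(-scores)[:top_k]     # highest scores first
    return top, scores[top]

# Illustrative usage: 100 random 16-dimensional fingerprints,
# queried with an exact copy of segment 42.
rng = np.random.default_rng(0)
db = rng.normal(size=(100, 16))
index = build_index(db)
top, scores = search(index, db[42])
```

A brute-force matrix product is exact but linear in database size; at the scales the paper targets, the same normalized inner-product search would typically be delegated to an approximate nearest-neighbor library.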
The basic principles and prominent works in deep learning for MIR are laid out, and the network structures that have been successful in MIR problems are outlined to facilitate the selection of building blocks for the problems at hand.
A neural network approach is presented, inspired by a technique that has revolutionized computer vision, pixel-wise image classification; it is combined with cross-entropy loss and pretraining of the CNN as an autoencoder on singing voice spectrograms.
A novel Convolutional Neural Network is proposed for cover song identification that outperforms state-of-the-art methods on several public datasets with low time complexity.
This contribution investigates how input and target representations interact with the amount of available training data in a music information retrieval setting, comparing standard mel-spectrogram inputs with a newly proposed representation called Mel scattering.
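The mel-spectrogram input that serves as the baseline representation in several of the works above is straightforward to compute: window the signal, take short-time Fourier transform magnitudes, and pool frequency bins through triangular filters spaced evenly on the mel scale. The following is a minimal NumPy sketch of that pipeline; the parameter defaults are illustrative, and practical systems would typically use a dedicated library instead.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters with centers evenly spaced on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):          # rising slope
            fb[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling slope
            fb[i, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(signal, sr, n_fft=1024, hop=256, n_mels=64):
    """Power spectrogram pooled into mel bands, shape (frames, n_mels)."""
    window = np.hanning(n_fft)
    frames = [signal[s:s + n_fft] * window
              for s in range(0, len(signal) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return power @ mel_filterbank(sr, n_fft, n_mels).T

# Illustrative usage: one second of a 440 Hz sine at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
S = mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr)
```

In practice the result is usually log-compressed (e.g. `np.log(S + 1e-6)`) before being fed to a CNN, since magnitudes span several orders of magnitude.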