Music auto-tagging is the task of automatically assigning descriptive tags (e.g., genre, mood, instrumentation) to a music audio clip, typically framed as multi-label classification. Common benchmarks include the MagnaTagATune dataset and the Million Song Dataset.
These leaderboards are used to track progress in Music Auto-Tagging.
Use these libraries to find Music Auto-Tagging models and implementations.
No subtasks available.
A consistent evaluation of different music tagging models is conducted on three datasets, and reference results using common evaluation metrics are provided. All models are additionally evaluated on perturbed inputs to investigate their generalization with respect to time stretching, pitch shifting, dynamic range compression, and added white noise.
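As an illustration of these four perturbations, below is a minimal sketch using librosa and NumPy; the parameter values and the simple static compressor are assumptions for illustration and do not reproduce the paper's exact settings.

```python
# Sketch of the four input perturbations, assuming librosa and NumPy.
import numpy as np
import librosa

def time_stretch(y, rate=1.1):
    # Stretch duration without changing pitch; rate > 1 speeds the signal up.
    return librosa.effects.time_stretch(y, rate=rate)

def pitch_shift(y, sr, n_steps=2):
    # Shift pitch by n_steps semitones without changing duration.
    return librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)

def dynamic_range_compression(y, threshold=0.1, ratio=4.0):
    # Illustrative static compressor: attenuate samples above the threshold.
    # (Real compressors use attack/release envelopes; this is only a stand-in.)
    mag = np.abs(y)
    over = mag > threshold
    out = np.copy(y)
    out[over] = np.sign(y[over]) * (threshold + (mag[over] - threshold) / ratio)
    return out

def add_white_noise(y, snr_db=20.0):
    # Add Gaussian noise scaled to a target signal-to-noise ratio in dB.
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), size=y.shape)
    return y + noise
```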
The experiments show how deep architectures with sample-level filters improve accuracy in music auto-tagging, achieving results comparable to previous state-of-the-art performance on the MagnaTagATune dataset and the Million Song Dataset.
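The sketch below shows what a sample-level 1-D CNN on raw waveforms can look like in PyTorch; the depth, channel widths, tag count, and 59049-sample input length are illustrative assumptions, not the cited architecture's exact configuration.

```python
import torch
import torch.nn as nn

class SampleLevelBlock(nn.Module):
    """One sample-level block: small 1-D conv (kernel size 3) + BN + ReLU + max-pool.
    Stacking such blocks lets the network learn filters directly on raw audio samples."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm1d(out_ch),
            nn.ReLU(),
            nn.MaxPool1d(3),
        )

    def forward(self, x):
        return self.net(x)

# Toy tagger built from stacked sample-level blocks (all sizes are assumptions).
model = nn.Sequential(
    nn.Conv1d(1, 128, kernel_size=3, stride=3),       # strided front-end conv
    *[SampleLevelBlock(128, 128) for _ in range(9)],   # 3^9 = 19683x downsampling
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(128, 50),                                # e.g. 50 tags
)
waveform = torch.randn(4, 1, 59049)                    # (batch, channel, samples)
logits = model(waveform)                               # -> (4, 50)
```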
This paper improves the 1-D CNN architecture for music auto-tagging by adopting building blocks from state-of-the-art image classification models, namely ResNets and SENets, and by adding multi-level feature aggregation; different combinations of these modules are compared when building the CNN architectures.
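As one example of such a building block, the following is a minimal PyTorch sketch of a squeeze-and-excitation block adapted to 1-D feature maps; the channel count and reduction ratio are assumptions.

```python
import torch
import torch.nn as nn

class SEBlock1d(nn.Module):
    """Squeeze-and-excitation for 1-D feature maps: global average pooling ("squeeze")
    followed by a small bottleneck MLP that rescales each channel ("excitation")."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):              # x: (batch, channels, time)
        w = x.mean(dim=-1)             # squeeze over the time axis -> (batch, channels)
        w = self.fc(w).unsqueeze(-1)   # per-channel weights in [0, 1]
        return x * w                   # reweight channels

out = SEBlock1d(128)(torch.randn(2, 128, 64))   # -> (2, 128, 64)
```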
A state-of-the-art music auto-tagging model is extended to EDM subgenre classification by adding two mid-level tempo-related feature representations, the Fourier tempogram and the autocorrelation tempogram; two fusion strategies, early fusion and late fusion, are explored to aggregate the two types of tempograms.
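The librosa sketch below shows how the two tempogram representations can be computed and combined; the click-track input signal and the fusion details are illustrative assumptions rather than the paper's pipeline.

```python
import numpy as np
import librosa

sr = 22050
# 120-BPM click track as a stand-in input signal.
y = librosa.clicks(times=np.arange(0, 10, 0.5), sr=sr)
onset_env = librosa.onset.onset_strength(y=y, sr=sr)

# Two mid-level tempo-related representations of the same onset envelope.
ac_tempogram = librosa.feature.tempogram(onset_envelope=onset_env, sr=sr)
f_tempogram = np.abs(librosa.feature.fourier_tempogram(onset_envelope=onset_env, sr=sr))

# Early fusion: concatenate the two tempograms into one input before the classifier.
n = min(ac_tempogram.shape[1], f_tempogram.shape[1])
early_fused = np.concatenate([ac_tempogram[:, :n], f_tempogram[:, :n]], axis=0)

# Late fusion: run a separate model on each tempogram and combine the predictions,
# e.g. probs = 0.5 * (model_ac(ac_tempogram) + model_fourier(f_tempogram)).
```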
Experimental results indicate that pre-training the U-Nets with a music source separation objective improves performance on two music classification tasks, music auto-tagging and music genre classification, compared to both training the whole network from scratch and using the tail network as a standalone model.
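Below is a hedged PyTorch sketch of reusing a separation-pretrained encoder for tagging; the encoder interface, feature dimension, and tag count are assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class TaggerFromPretrainedEncoder(nn.Module):
    """Attach a small classification head to a (pretrained) encoder. The encoder would
    normally be the downsampling path of a U-Net trained for source separation."""
    def __init__(self, encoder, feat_dim=512, n_tags=50):
        super().__init__()
        self.encoder = encoder                 # transferred weights, optionally frozen
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(feat_dim, n_tags),
        )

    def forward(self, spec):                   # spec: (batch, 1, freq, time)
        feats = self.encoder(spec)             # (batch, feat_dim, f', t')
        return self.head(feats)

# Toy usage with a stand-in encoder (the real one would come from the pretrained U-Net).
toy_encoder = nn.Sequential(nn.Conv2d(1, 512, kernel_size=3, padding=1), nn.ReLU())
tagger = TaggerFromPretrainedEncoder(toy_encoder)
logits = tagger(torch.randn(2, 1, 128, 256))   # -> (2, 50)

# To train only the head, freeze the transferred encoder:
# for p in tagger.encoder.parameters():
#     p.requires_grad = False
```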
This paper presents a two-stage learning model to effectively predict multiple labels from music audio, and achieves high performance on MagnaTagATune, a widely used dataset in music auto-tagging.
The experiments show that using the combination of multi-level and multi-scale features is highly effective in music auto-tagging, and the proposed method outperforms the previous state-of-the-art methods on the MagnaTagATune dataset and the Million Song Dataset.
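A minimal PyTorch sketch of multi-level feature aggregation in this spirit follows: features pooled from several intermediate layers are concatenated before the classifier. The block definitions and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class MultiLevelAggregator(nn.Module):
    """Pool features from several intermediate levels and concatenate them, so the
    classifier sees both low-level and high-level descriptions of the input."""
    def __init__(self, blocks, level_dims, n_tags=50):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)
        self.classifier = nn.Linear(sum(level_dims), n_tags)

    def forward(self, x):
        pooled = []
        for block in self.blocks:
            x = block(x)
            pooled.append(x.mean(dim=-1))      # global average pool each level
        return self.classifier(torch.cat(pooled, dim=1))

# Toy usage (all sizes are illustrative assumptions).
blocks = [
    nn.Sequential(nn.Conv1d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool1d(4)),
    nn.Sequential(nn.Conv1d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool1d(4)),
    nn.Sequential(nn.Conv1d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool1d(4)),
]
model = MultiLevelAggregator(blocks, level_dims=[32, 64, 128], n_tags=50)
out = model(torch.randn(2, 1, 16000))          # -> (2, 50)
```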
This work proposes a deep content-user embedding model, a simple and intuitive architecture that combines user-item interactions with music audio content, and evaluates the model on music recommendation and music auto-tagging tasks.
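The following PyTorch sketch illustrates one way a content-user embedding model could combine a user embedding table with an audio encoder; all layer choices are assumptions and the audio branch is only a stand-in for a real CNN encoder.

```python
import torch
import torch.nn as nn

class ContentUserEmbedding(nn.Module):
    """Joint model sketch: a user embedding table and an audio encoder map users and
    tracks into one space (for recommendation), while the same track embedding also
    feeds a tagging head (for auto-tagging)."""
    def __init__(self, n_users, emb_dim=128, n_tags=50):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, emb_dim)
        self.audio_encoder = nn.Sequential(          # stand-in content encoder
            nn.Conv1d(1, 64, 9, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, emb_dim),
        )
        self.tag_head = nn.Linear(emb_dim, n_tags)

    def forward(self, user_ids, waveform):
        u = self.user_emb(user_ids)                  # (batch, emb_dim)
        t = self.audio_encoder(waveform)             # (batch, emb_dim)
        affinity = (u * t).sum(dim=1)                # user-track interaction score
        tag_logits = self.tag_head(t)                # auto-tagging prediction
        return affinity, tag_logits

model = ContentUserEmbedding(n_users=1000)
affinity, tags = model(torch.randint(0, 1000, (4,)), torch.randn(4, 1, 16000))
```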
Various music visualization schemes have been developed for different purposes, such as visualizing the emotion of music by selecting photos, implementing an active listening interface by visualizing the structure or progress of music, or creating media art performances.
SimCLR is introduced to the music domain, together with a large chain of audio data augmentations, to form CLMR, a simple framework for self-supervised, contrastive learning of musical representations that works on raw time-domain music data and requires no labels to learn useful representations.
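Below is a minimal sketch of the NT-Xent contrastive loss that frameworks of this kind typically optimize; it assumes an encoder producing fixed-size embeddings from two differently augmented views of each raw waveform.

```python
import torch
import torch.nn.functional as F

def nt_xent(z_i, z_j, temperature=0.5):
    """NT-Xent loss over a batch of positive pairs (z_i[k], z_j[k]): each augmented
    view must identify its partner among all other views in the batch."""
    z = F.normalize(torch.cat([z_i, z_j], dim=0), dim=1)      # (2N, d) unit vectors
    sim = z @ z.t() / temperature                              # cosine similarities
    n = z_i.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))                      # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Usage sketch: encode two augmented crops of each waveform with the same encoder
# and minimize nt_xent(encoder(view_1), encoder(view_2)) -- no tags are needed.
```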