It is found that CRNNs show strong performance relative to their number of parameters and training time, indicating the effectiveness of the hybrid structure, in which convolutional layers handle music feature extraction and recurrent layers handle feature summarisation.
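As a rough illustration of that hybrid structure, here is a minimal PyTorch sketch; the layer sizes, the mel-spectrogram input, and all names are illustrative assumptions, not the paper's exact configuration. Convolutional blocks extract local features from the spectrogram, and a GRU summarises them over time before a tag classifier.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Minimal CRNN sketch: conv layers extract local features,
    a GRU summarises them over time. All sizes are illustrative."""
    def __init__(self, n_mels=96, n_tags=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),                  # halve freq and time
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.gru = nn.GRU(input_size=64 * (n_mels // 4),
                          hidden_size=128, batch_first=True)
        self.fc = nn.Linear(128, n_tags)

    def forward(self, x):                     # x: (batch, 1, n_mels, frames)
        f = self.conv(x)                      # (batch, 64, n_mels//4, frames//4)
        f = f.permute(0, 3, 1, 2).flatten(2)  # (batch, time, features)
        _, h = self.gru(f)                    # final state summarises the sequence
        return self.fc(h.squeeze(0))          # (batch, n_tags) tag logits
```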
The experiments show how deep architectures with sample-level filters improve accuracy in music auto-tagging, providing results comparable to previous state-of-the-art performance on the MagnaTagATune dataset and the Million Song Dataset.
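A minimal sketch of the sample-level idea, assuming a stack of 1-D convolutions with very small filters (size 3) applied directly to the raw waveform so the receptive field grows from individual samples to long segments; the depth, channel counts, and the 59,049-sample input length are illustrative assumptions.

```python
import torch
import torch.nn as nn

def sample_level_block(in_ch, out_ch):
    # Tiny 3-sample filter plus pooling; stacking many such blocks lets
    # the receptive field grow from raw samples to long waveform segments.
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm1d(out_ch), nn.ReLU(),
        nn.MaxPool1d(3),
    )

class SampleCNN(nn.Module):
    def __init__(self, n_tags=50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=3, stride=3),   # strided input layer
            *[sample_level_block(64, 64) for _ in range(8)],
        )
        self.fc = nn.Linear(64, n_tags)

    def forward(self, wav):               # wav: (batch, 1, 59049) raw samples
        f = self.net(wav)                 # (batch, 64, 3)
        return self.fc(f.mean(dim=2))     # average over time -> tag logits
```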
This paper proposes a pre-trained convnet feature, a feature vector formed by concatenating the activations of feature maps from multiple layers of a trained convolutional network, and shows how it can serve as a general-purpose music representation.
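A minimal sketch of that concatenation, assuming the trained network's conv blocks are applied in sequence and each block's feature maps are global-average-pooled before concatenation; the pooling choice and layer set are assumptions for illustration.

```python
import torch

def convnet_feature(conv_blocks, x):
    """Concatenate pooled activations from several layers of a trained
    convnet into one general-purpose feature vector (illustrative sketch)."""
    feats = []
    for block in conv_blocks:              # conv blocks of an already-trained CNN
        x = block(x)
        feats.append(x.mean(dim=(2, 3)))   # global average pool each layer's maps
    return torch.cat(feats, dim=1)         # (batch, sum of channel counts)
```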
This work reviews recent text-based music retrieval systems with a proposed benchmark covering two main aspects, input text representation and training objectives, and presents a universal text-to-music retrieval system that achieves comparable retrieval performance for both tag- and sentence-level inputs.
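A hedged sketch of the retrieval step such a system implies: text and audio are embedded into a shared space (the encoders themselves are assumed and omitted here), and tracks are ranked by cosine similarity to the query embedding.

```python
import torch
import torch.nn.functional as F

def retrieve(text_emb, music_embs, top_k=5):
    """Rank tracks by cosine similarity to a text query embedding.
    text_emb: (dim,); music_embs: (n_tracks, dim). Encoders are assumed."""
    sims = F.cosine_similarity(text_emb.unsqueeze(0), music_embs, dim=1)
    return torch.topk(sims, k=top_k).indices   # best-matching track indices
```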
This work introduces an encoder that captures word-level representations of speech for cross-task transfer learning, and shows that the representation the encoder learns during pre-training transfers across distinct speech processing tasks and datasets.
Compared with state-of-the-art models that require fine-tuning, zero-shot CLaMP demonstrated comparable or superior performance on score-oriented datasets.
Experimental results indicate that pre-training U-Nets with a music source separation objective can improve performance on two music classification tasks, music auto-tagging and music genre classification, compared to both training the whole network from scratch and using the tail network as a standalone model.
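One simplified reading of that transfer setup, sketched below with assumed module shapes (skip connections and the exact tail-network split are omitted for brevity): an encoder-decoder stands in for the U-Net trained on separation, and its encoder is then reused as the front end of a classifier.

```python
import torch.nn as nn

class SeparationUNet(nn.Module):
    """Stand-in for a U-Net trained on source separation (no skips here)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, mix_spec):            # predict a source spectrogram/mask
        return self.decoder(self.encoder(mix_spec))

class TransferredClassifier(nn.Module):
    """Reuse the separation-pretrained encoder for classification."""
    def __init__(self, pretrained_encoder, n_classes=10):
        super().__init__()
        self.encoder = pretrained_encoder   # weights from separation training
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(32, n_classes))

    def forward(self, spec):
        return self.head(self.encoder(spec))
```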
It is shown that the deep layers of a 5-layer CNN learn features that capture textures, i.e., patterns of continuous distributions, rather than shapes of lines.
This work investigates zero-shot learning in the music domain and organizes two different setups of side information: human-labeled attribute information based on the Free Music Archive and OpenMIC-2018 datasets, and general word semantic information from Million Song Dataset and Last.fm tag annotations.
This work proposes a music classification approach that aggregates multi-level and multi-scale features extracted by sample-level deep convolutional neural networks pre-trained on raw waveforms.
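The aggregation step might look like the following sketch, assuming several pre-trained sample-level extractors (or one extractor tapped at several layers and input scales) whose frame-level outputs are summarised with mean and standard deviation statistics and concatenated as input to a shallow classifier; this is a simplified, assumed reading rather than the paper's exact pipeline.

```python
import torch

def aggregate_features(extractors, wav):
    """Illustrative multi-level/multi-scale aggregation: summarise each
    extractor's frame-level output with mean and std, then concatenate."""
    stats = []
    for extract in extractors:          # each returns (batch, channels, time)
        f = extract(wav)
        stats.append(f.mean(dim=2))
        stats.append(f.std(dim=2))
    return torch.cat(stats, dim=1)      # feature vector for a shallow classifier
```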