Experiments demonstrate that the proposed MU-LLaMA model, trained on the designed MusicQA dataset, achieves outstanding performance in both music question answering and music caption generation across various metrics, outperforming current state-of-the-art (SOTA) models in both fields and offering a promising advance for T2M-Gen research.
This study introduces a method to systematically learn multimodal alignment between audio and lyrics through contrastive learning, paving the way for models to achieve deeper cross-modal coherence and thereby produce high-quality captions.
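The contrastive objective described above is commonly instantiated as a symmetric InfoNCE-style loss that pulls paired audio and lyric embeddings together while pushing mismatched pairs apart. A minimal sketch is shown below; the function names, the toy 2-D embeddings, and the temperature value are illustrative assumptions, not the paper's actual implementation.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(audio_embs, lyric_embs, temperature=0.1):
    """InfoNCE loss over a batch of paired audio/lyric embeddings.

    For each audio embedding i, the matching lyric embedding i is the
    positive; all other lyrics in the batch act as negatives.
    """
    n = len(audio_embs)
    loss = 0.0
    for i in range(n):
        sims = [cosine(audio_embs[i], lyric_embs[j]) / temperature
                for j in range(n)]
        log_denom = math.log(sum(math.exp(s) for s in sims))
        loss += -(sims[i] - log_denom)  # cross-entropy on the positive
    return loss / n

# Toy example: two orthogonal "audio" embeddings and their lyric pairs.
audio = [[1.0, 0.0], [0.0, 1.0]]
lyrics_matched = [[1.0, 0.0], [0.0, 1.0]]   # correctly aligned
lyrics_shuffled = [[0.0, 1.0], [1.0, 0.0]]  # misaligned pairs
```

Under correct alignment the loss is near zero, while shuffling the pairs drives it up, which is exactly the signal a training loop would minimize.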
A systematic evaluation of the large-scale music captioning dataset, using both quantitative metrics from natural language processing and human evaluation, shows that the proposed approach outperforms the supervised baseline model.
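The quantitative NLP metrics mentioned above (e.g. BLEU-style scores) boil down to n-gram overlap between a generated caption and a reference. A minimal sketch of clipped n-gram precision, the core ingredient of BLEU, is given below; the function name and the example captions are illustrative assumptions.

```python
from collections import Counter

def ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a candidate caption against one reference.

    Counts candidate n-grams, clips each count by its count in the
    reference, and divides by the total number of candidate n-grams.
    """
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(count, ref_ngrams[g]) for g, count in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0

# Hypothetical generated and reference captions for a music clip.
generated = "a calm piano melody"
reference = "a calm piano piece"
```

Here three of the four candidate unigrams ("a", "calm", "piano") appear in the reference, giving a unigram precision of 0.75; full BLEU additionally combines several n-gram orders and a brevity penalty.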
This work introduces the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs designed for the evaluation of music-and-language models, and benchmarks popular models on three key music-and-language tasks.
MusiLingo is a novel system for music caption generation and music-related query responses that bridges the gap between music audio and textual contexts; it also creates the MusicInstruct dataset, built from captions in the MusicCaps dataset and tailored for open-ended music inquiries.