Zero-shot audio captioning aims to automatically generate descriptive textual captions for audio content without any task-specific training. Audio captioning is commonly concerned with ambient sounds or sounds produced by a human performing an action.
These leaderboards are used to track progress in Zero-Shot Audio Captioning.
Use these libraries to find Zero-Shot Audio Captioning models and implementations.
This work proposes ZerAuCap, a framework for summarising general audio signals in a text caption without requiring task-specific training. It achieves state-of-the-art zero-shot audio captioning results on the AudioCaps and Clotho datasets.
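Zero-shot approaches of this kind typically rely on a pretrained audio-text embedding model (such as CLAP) to score candidate captions against the audio, rather than training a captioner end to end. A minimal sketch of that ranking step, using toy NumPy vectors in place of real audio and text embeddings (the embeddings and captions below are illustrative assumptions, not outputs of any specific model):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_captions(audio_emb, caption_embs, captions):
    """Rank candidate captions by audio-text similarity (best first)."""
    scores = [cosine(audio_emb, e) for e in caption_embs]
    order = np.argsort(scores)[::-1]
    return [(captions[i], scores[i]) for i in order]

# Toy 3-d embeddings standing in for real audio/text encoder outputs.
audio = np.array([1.0, 0.0, 0.2])  # pretend this encodes a dog-bark clip
captions = ["a dog barks", "rain falls", "a car engine idles"]
embs = [
    np.array([0.9, 0.1, 0.1]),  # close to the audio embedding
    np.array([0.0, 1.0, 0.0]),
    np.array([0.1, 0.2, 0.9]),
]

best, score = rank_captions(audio, embs, captions)[0]
print(best)  # → a dog barks
```

In a real system the candidate captions would come from a language model and the embeddings from the audio and text towers of a pretrained contrastive model; the ranking logic stays the same.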