Streaming Target Sound Extraction

3260 papers • 126 benchmarks • 313 datasets

This task is a variant of Target Sound Extraction with the added constraint of causal, streaming inference. To keep algorithmic latency below 20 ms, a streaming audio model processes one input chunk per time step, and each chunk is shorter than 20 ms. The causal constraint means the model has access only to the current and past chunks, never to future ones.
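As a rough illustration of the chunking constraint described above, the sketch below converts the 20 ms budget into a chunk length in samples and drives a toy processor one chunk at a time. The names `max_chunk_samples`, `stream_causal`, and `remove_past_mean`, as well as the sample rates used, are hypothetical and chosen only for this example. The toy processor is a stand-in for a real extraction model, and the sketch assumes no lookahead, so the chunk duration is the only contribution to latency it accounts for.

```python
from typing import Callable, List, Sequence


def max_chunk_samples(sample_rate_hz: int, budget_ms: float = 20.0) -> int:
    """Largest chunk length in samples whose duration is strictly below budget_ms."""
    limit = sample_rate_hz * budget_ms / 1000.0  # samples in exactly budget_ms
    n = int(limit)
    # If the budget falls exactly on a sample boundary, drop one sample so the
    # chunk stays strictly shorter than the budget.
    return n - 1 if n == limit else n


def remove_past_mean(chunk: List[float], past: List[float]) -> List[float]:
    """Toy stand-in for a model: subtract the mean of all previously seen samples."""
    mean = sum(past) / len(past) if past else 0.0
    return [s - mean for s in chunk]


def stream_causal(
    signal: Sequence[float],
    chunk_len: int,
    process: Callable[[List[float], List[float]], List[float]],
) -> List[float]:
    """Feed the signal chunk by chunk; each call sees the current chunk and past samples only."""
    past: List[float] = []
    out: List[float] = []
    for start in range(0, len(signal), chunk_len):
        chunk = list(signal[start:start + chunk_len])
        out.extend(process(chunk, past))
        past.extend(chunk)  # the chunk becomes history only after it is processed
    return out
```

Because each call receives only the current chunk and the history before it, changing later samples of the input cannot alter outputs that were already emitted, which is the property the causal constraint asks for.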


Benchmarks

These leaderboards are used to track progress in Streaming Target Sound Extraction.

Dataset: FSDSoundScapes

Libraries

Use these libraries to find Streaming Target Sound Extraction models and implementations.

Datasets

FSDSoundScapes

Subtasks

No subtasks available.

Most implemented papers

Real-Time Target Sound Extraction

Takuya Yoshioka, Bandhav Veluri, Justin Chan, M. Itani, Tuochao Chen, Shyamnath Gollakota • Thu Nov 03 2022

This paper presents Waveformer, the first neural network model to achieve real-time, streaming target sound extraction. Waveformer is an encoder-decoder architecture that uses a stack of dilated causal convolution layers as the encoder and a transformer decoder layer as the decoder.
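To make the idea of a dilated causal convolution stack concrete, here is a minimal pure-Python sketch. It is not the Waveformer implementation: the function names, the kernel size of 2, the shared weights across layers, and the dilation schedule 1, 2, 4 are all assumptions picked for illustration. It shows only that each output sample depends on the current and earlier inputs, and how the receptive field grows with dilation.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], weights: Sequence[float], dilation: int
) -> List[float]:
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation], treating x as zero before t = 0."""
    y: List[float] = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation  # only the current and earlier samples are read
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def causal_stack(
    x: Sequence[float], weights: Sequence[float], dilations: Sequence[int]
) -> List[float]:
    """Apply one causal dilated convolution per dilation, in order (weights shared for simplicity)."""
    h = list(x)
    for d in dilations:
        h = causal_dilated_conv1d(h, weights, d)
    return h


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input samples that can influence a single output sample of the stack."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

With a kernel size of 2 and dilations 1, 2, 4, three layers cover 8 past-and-present samples. Doubling the dilation at each layer is a common way to get a long receptive field from few layers without reading any future input.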
