Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each of the target sound classes, along with one or more corresponding spatial trajectories while the track indicates activity. This results in a spatio-temporal characterization of the acoustic scene that can be used in a wide range of machine cognition tasks, such as inference on the type of environment, self-localization, navigation without visual input or with occluded targets, tracking of specific types of sound sources, smart-home applications, scene visualization systems, and audio surveillance, among others.
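The output described above can be sketched as two aligned arrays: a frame-wise activation track per class and a direction-of-arrival (DOA) trajectory that is meaningful only where the track is active. The following is a minimal illustrative sketch; the array names, shapes, and the choice of Cartesian unit-vector DOAs are assumptions for illustration, not a fixed SELD standard.

```python
import numpy as np

# Illustrative shapes (assumptions): 100 temporal frames, 13 target classes.
N_FRAMES, N_CLASSES = 100, 13

# Temporal activation track: activity[t, c] == 1 when class c is active at frame t.
activity = np.zeros((N_FRAMES, N_CLASSES), dtype=np.int8)

# Spatial trajectory: one unit DOA vector per frame and class, meaningful
# only where the corresponding activation track indicates activity.
doa = np.zeros((N_FRAMES, N_CLASSES, 3), dtype=np.float32)

def report_event(frame, cls, azimuth_deg, elevation_deg):
    """Mark class `cls` active at `frame` with the given DOA angles
    (hypothetical helper for this sketch)."""
    az, el = np.deg2rad(azimuth_deg), np.deg2rad(elevation_deg)
    activity[frame, cls] = 1
    # Spherical angles converted to a Cartesian unit vector.
    doa[frame, cls] = [np.cos(el) * np.cos(az),
                       np.cos(el) * np.sin(az),
                       np.sin(el)]

# Example: class 3 heard at azimuth 45 deg, elevation 0 deg, frames 10-20.
for t in range(10, 21):
    report_event(t, 3, 45.0, 0.0)
```

Frame-wise pairs of this kind are what SELD evaluation compares against the reference annotations: detection quality on the activation tracks, localization error on the DOA vectors of correctly detected frames.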