Active Speaker Localization (ASL) is the task of estimating where in an environment the person currently speaking (the active speaker) is located, using audio, vision, or both.
(Image credit: Papersgraph)
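As a rough illustration of the audio-only case, the sketch below estimates a speaker's direction from the arrival-time difference between two microphones. It is a minimal example under stated assumptions, not a method taken from any particular ASL paper: the function names, the simulated signals, the 0.2 m microphone spacing, the 16 kHz sample rate, and the far-field, single-speaker, echo-free setting are all hypothetical choices made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay_samples(sig_a, sig_b):
    """Return the lag (in samples) by which sig_b trails sig_a.

    Uses plain cross-correlation; a positive result means the sound
    reached microphone A first.
    """
    corr = np.correlate(sig_b, sig_a, mode="full")
    # Index len(sig_a) - 1 of the full correlation corresponds to zero lag.
    return int(np.argmax(corr)) - (len(sig_a) - 1)


def delay_to_angle_deg(delay_samples, sample_rate, mic_spacing_m):
    """Convert a delay to an angle from broadside (far-field assumption)."""
    ratio = delay_samples / sample_rate * SPEED_OF_SOUND / mic_spacing_m
    # Clip to guard against |ratio| > 1 caused by noise or rounding.
    return float(np.degrees(np.arcsin(np.clip(ratio, -1.0, 1.0))))


# Simulated example (hypothetical data): white noise stands in for speech,
# and microphone B receives it 5 samples later than microphone A.
rng = np.random.default_rng(0)
source = rng.standard_normal(2048)
true_delay = 5
mic_a = source
mic_b = np.concatenate([np.zeros(true_delay), source[:-true_delay]])

delay = estimate_delay_samples(mic_a, mic_b)
angle = delay_to_angle_deg(delay, sample_rate=16000, mic_spacing_m=0.2)
```

Real recordings contain reverberation, noise, and overlapping talkers, so practical systems typically use more robust delay estimators, larger microphone arrays, or visual cues such as face and lip motion in addition to this kind of time-difference estimate.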
These leaderboards are used to track progress in Active Speaker Localization
Use these libraries to find Active Speaker Localization models and implementations