3260 papers • 126 benchmarks • 313 datasets
The goal of acoustic scene classification is to classify a test recording into one of a set of predefined classes characterizing the environment in which it was recorded. Source: DCASE 2019, DCASE 2018
The receptive field (RF) of CNNs is analysed and the importance of the RF to the generalization capability of the models is demonstrated, showing that very small or very large RFs can cause performance degradation, but deep models can be made to generalize well by carefully choosing an appropriate RF size within a certain range.
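The receptive-field arithmetic behind this analysis can be sketched directly: for a stack of convolution/pooling layers, the RF grows by (kernel − 1) times the cumulative stride at each layer. The layer stack below is an illustrative VGG-like example, not a specific model from the paper.

```python
# Compute the receptive field of a stack of conv/pool layers.
# Each layer is (kernel_size, stride); dilation is assumed to be 1.

def receptive_field(layers):
    """Returns the RF in input samples for the given layer stack."""
    rf, jump = 1, 1  # jump = cumulative stride so far
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Two 3x3 convs (stride 1), a stride-2 pooling, then another 3x3 conv:
print(receptive_field([(3, 1), (3, 1), (2, 2), (3, 1)]))  # → 10
```

Sweeping kernel sizes or strides through this function reproduces the kind of RF-size range the paper tunes over.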
Paper presented at the 15th Sound and Music Computing Conference (SMC2018) "Sonic crossing", held in Limassol, Cyprus, 4–7 July 2018.
The acoustic scene classification task of DCASE 2018 Challenge and the TUT Urban Acoustic Scenes 2018 dataset provided for the task are introduced, and the performance of a baseline system in the task is evaluated.
Results indicate that transfer learning is a powerful strategy in such scenarios, but prototypical networks show promising results when no external or validation data are available.
This paper performs a systematic investigation of different RF configurations for various CNN architectures on the DCASE 2019 Task 1.A dataset, introduces Frequency-Aware CNNs to compensate for the lack of frequency information caused by the restricted RF, and investigates whether, and in which RF ranges, they yield additional improvements.
This work proposes a novel method to optimize and regularize transformers on audio spectrograms that achieves a new state-of-the-art performance on Audioset and can be trained on a single consumer-grade GPU.
The proposed framework (SELD-TCN) outperforms the state-of-the-art SELDnet performance on four different datasets and achieves 4x faster training time per epoch and 40x faster inference time on an ordinary graphics processing unit (GPU).
The Qwen-Audio model is trained with a multi-task framework and achieves impressive performance across diverse benchmark tasks without requiring any task-specific fine-tuning, surpassing its counterparts.
A deep all-convolutional neural network with masked global pooling performs single-label classification for acoustic scene classification and multi-label classification for domestic audio tagging in the DCASE-2016 contest, improving on the baselines by relative margins of 17% and 19%, respectively.
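The masked-pooling idea above can be sketched in a few lines: clip-level predictions are averaged only over time frames marked valid, so padded frames do not dilute the result. This is a pure-Python stand-in for the convolutional feature maps used in the actual network, with hypothetical toy values.

```python
# Masked global average pooling over per-frame feature vectors.
# mask[i] = 1 marks a valid frame, 0 marks padding to be ignored.

def masked_global_avg_pool(frames, mask):
    """frames: list of per-frame feature vectors; returns pooled vector."""
    valid = [f for f, m in zip(frames, mask) if m]
    return [sum(xs) / len(valid) for xs in zip(*valid)]

frames = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]  # last frame is padding
print(masked_global_avg_pool(frames, [1, 1, 0]))  # → [2.0, 3.0]
```

Without the mask, the all-zero padded frame would pull both pooled values down, which is exactly the artifact masked pooling avoids on variable-length clips.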
The first method of unsupervised adversarial domain adaptation for acoustic scene classification is presented: a model pre-trained on data from one set of recording conditions is adapted, using data from another set of conditions, so that its output cannot be used to determine which set of conditions the input data belong to.