3260 papers • 126 benchmarks • 313 datasets
Scene Classification is the task of assigning a categorical scene label to a photograph. Unlike object classification, which focuses on classifying prominent objects in the foreground, Scene Classification uses the layout of objects within the scene, in addition to its ambient context, for classification. Source: Scene classification with Convolutional Neural Networks
(Image credit: Papersgraph)
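As a minimal sketch of the idea above (toy random features and hypothetical class names, not from any cited paper): a scene classifier pools CNN activations over the whole image, so the prediction reflects the overall layout and ambient context rather than a single foreground object, and then applies a linear softmax head.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify_scene(feature_map, W, b, classes):
    # feature_map: (H, W, C) CNN activations covering the whole scene.
    # Global average pooling aggregates layout and context from the entire
    # image, which is the key contrast with object-centric classification.
    pooled = feature_map.mean(axis=(0, 1))   # (C,)
    probs = softmax(pooled @ W + b)          # (num_classes,)
    return classes[int(np.argmax(probs))], probs

# Toy example: random activations, hypothetical scene classes.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((7, 7, 16))
W = rng.standard_normal((16, 3))
b = np.zeros(3)
label, probs = classify_scene(fmap, W, b, ["beach", "forest", "kitchen"])
```

In practice the feature map would come from a deep backbone trained on a scene dataset; the sketch only shows how pooled scene-level features feed a classifier.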
These leaderboards are used to track progress in Scene Classification.
Use these libraries to find Scene Classification models and implementations.
No subtasks available.
A large-scale data set, termed “NWPU-RESISC45,” is proposed, which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU).
The proposed MPN-COV amounts to a robust covariance estimator, well suited to scenarios of high dimension and small sample size, and can be regarded as a Power-Euclidean metric between covariances, effectively exploiting their geometry.
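The core operation behind matrix power normalized covariance (MPN-COV) pooling can be sketched in a few lines: form the sample covariance of local CNN features, then raise its eigenvalues to a power (0.5 gives the matrix square root). This is a simplified numpy sketch, not the authors' implementation; the `eps` ridge term is an assumption added for numerical stability.

```python
import numpy as np

def mpn_cov(X, alpha=0.5, eps=1e-6):
    # X: (n, d) matrix of n local CNN features of dimension d.
    # Sample covariance: the second-order representation of the features.
    Xc = X - X.mean(axis=0, keepdims=True)
    cov = Xc.T @ Xc / X.shape[0]
    # Matrix power normalization: raise the eigenvalues to the power alpha.
    # alpha = 0.5 is the matrix square root used by MPN-COV.
    w, V = np.linalg.eigh(cov + eps * np.eye(X.shape[1]))
    w = np.clip(w, 0.0, None) ** alpha
    return (V * w) @ V.T

# Small-sample regime: more feature dimensions than one would like
# relative to the number of samples, where power normalization helps.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 4))
P = mpn_cov(X)
```

Squaring the result recovers the (ridge-regularized) covariance, which is how the matrix square root can be checked.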
The receptive field (RF) of CNNs is analysed and the importance of the RF to the generalization capability of the models is demonstrated, showing that very small or very large RFs can cause performance degradation, but deep models can be made to generalize well by carefully choosing an appropriate RF size within a certain range.
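The receptive field of a stack of convolution and pooling layers can be computed with a standard recurrence: each layer grows the RF by (kernel − 1) times the accumulated stride ("jump") of the grid it operates on. A small sketch of that recurrence (the layer configurations below are illustrative, not from the cited analysis):

```python
def receptive_field(layers):
    # layers: list of (kernel_size, stride) pairs, in forward order.
    # rf is the RF size in input pixels; jump is the effective stride
    # of the current feature grid with respect to the input.
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Two stacked 3x3 stride-1 convolutions see a 5x5 input patch,
# the classic observation behind stacking small kernels.
vgg_style = [(3, 1), (3, 1)]
```

Sweeping this calculation over candidate architectures is one way to choose an RF size in the "appropriate range" the summary refers to.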
A novel data-driven approach is proposed that uses scene-graphs as intermediate representations for modeling the subjective risk of driving maneuvers; it combines a Multi-Relation Graph Convolution Network, a Long Short-Term Memory network, and attention layers.
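The graph-convolution building block in that pipeline can be sketched for a single relation type (the multi-relation, LSTM, and attention components are omitted; node names and sizes below are hypothetical): aggregate each node's neighbors through a symmetrically normalized adjacency, project, and apply a ReLU.

```python
import numpy as np

def gcn_layer(A, H, W):
    # One graph convolution: mix neighbor features via the normalized
    # adjacency with self-loops, then project and apply ReLU.
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU

# Toy scene-graph: 3 nodes (ego vehicle, pedestrian, lane),
# edges connecting the ego vehicle to the other two entities.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
rng = np.random.default_rng(0)
H = rng.standard_normal((3, 2))   # 2 input features per node
W = rng.standard_normal((2, 4))   # project to 4 hidden features
H1 = gcn_layer(A, H, W)
```

In the full model, per-frame graph embeddings like `H1` would be pooled and fed through the temporal (LSTM) and attention stages.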
A large-scale aerial image data set is constructed for remote sensing image captioning, and extensive experiments demonstrate that the content of a remote sensing image can be completely described by the generated language descriptions.
Paper presented at: 15th Sound and Music Computing Conference (SMC2018) Sonic Crossings, held in Limassol, Cyprus, 4–7 July 2018.
A multi-resolution CNN architecture that captures visual content and structure at multiple levels is proposed, and two knowledge-guided disambiguation techniques are designed to deal with the problem of label ambiguity.
Results indicate that transfer learning is a powerful strategy in such scenarios, but prototypical networks show promising results when no external or validation data is available.
The acoustic scene classification task of DCASE 2018 Challenge and the TUT Urban Acoustic Scenes 2018 dataset provided for the task are introduced, and the performance of a baseline system in the task is evaluated.