SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation (2022-11-13)

TL;DR

SSL4EO-S12 is shown to support self-supervised pre-training for EO applications with a set of methods (MoCo-v2, DINO, MAE, and data2vec), and the resulting models reach downstream performance close to, or surpassing, that of supervised learning.

Abstract

Self-supervised pre-training has the potential to generate expressive representations without human annotation. Most pre-training in Earth observation (EO) is based on ImageNet or medium-size, labeled remote sensing (RS) datasets. We share an unlabeled RS dataset, SSL4EO-S12 (Self-Supervised Learning for Earth Observation - Sentinel-1/2), which assembles a large-scale, global, multimodal, and multi-seasonal corpus of satellite imagery from the ESA Sentinel-1 & -2 satellite missions. For EO applications, we demonstrate that SSL4EO-S12 succeeds in self-supervised pre-training for a set of methods: MoCo-v2, DINO, MAE, and data2vec. The resulting models yield downstream performance close to, or surpassing, the accuracy of supervised learning. In addition, pre-training on SSL4EO-S12 compares favorably with pre-training on existing datasets. We make the dataset, related source code, and pre-trained models openly available at https://github.com/zhu-xlab/SSL4EO-S12.

Authors

Yi Wang

C. Albrecht

Nassim Ait Ali Braham

For a 30-day interval around four reference dates (Mar 20, Jun 21, Sep 22, Dec 21)

Sample one location from a Gaussian distribution with a standard deviation of 50 km around the city center

Check if a 2640 m × 2640 m image patch centered around that location has significant overlap with previous patches
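
The two location-sampling steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: coordinates are planar offsets in meters from the city center, the Gaussian is applied independently to each axis, patches are axis-aligned squares, and the names `overlap_fraction`, `sample_patch_centers`, `max_overlap`, and `max_tries` are hypothetical. The source does not say what counts as "significant" overlap, so the threshold is left as a parameter.

```python
import random

PATCH_SIZE_M = 2640.0   # patch edge length, from the text
SIGMA_M = 50_000.0      # standard deviation of 50 km, from the text


def overlap_fraction(a, b, size=PATCH_SIZE_M):
    """Shared area of two equal, axis-aligned square patches, as a fraction of one patch."""
    dx = max(0.0, size - abs(a[0] - b[0]))
    dy = max(0.0, size - abs(a[1] - b[1]))
    return (dx * dy) / (size * size)


def sample_patch_centers(n, max_overlap=0.0, seed=0, max_tries=10_000):
    """Draw up to n patch centers around (0, 0), rejecting overlapping candidates.

    max_overlap is an assumed threshold: a candidate is kept only if its
    overlap with every previously accepted patch is <= max_overlap.
    """
    rng = random.Random(seed)
    centers = []
    tries = 0
    while len(centers) < n and tries < max_tries:
        tries += 1
        # Offset in meters (east, north) from the city center.
        candidate = (rng.gauss(0.0, SIGMA_M), rng.gauss(0.0, SIGMA_M))
        if all(overlap_fraction(candidate, c) <= max_overlap for c in centers):
            centers.append(candidate)
    return centers
```

Calling `sample_patch_centers(5)` returns five (east, north) offsets whose patches do not overlap. Converting these offsets to geographic coordinates would need a projection step that is outside this sketch.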

Direction of the orbit ('ASCENDING' or 'DESCENDING') for the oldest image data in the product (the start of the product)
