WorldStrat – An Open-source High Resolution Earth Image Dataset

By Michael Horton •  Updated: 11/19/22

Experts have created a global open-source dataset of high-resolution images of Earth that is the most comprehensive and in-depth of its kind.

The WorldStrat dataset contains nearly 10,000 km² of free satellite imagery, covering agricultural, grassland, and forest land uses, cities of all sizes, and polar ice caps.

Locations in the Global South and those in need of humanitarian aid are included in the dataset. These regions are frequently underrepresented in satellite imagery, which is typically gathered for commercial gain and disproportionately highlights wealthier regions.

According to the scientists, the collection enables global terrain analysis to address global challenges such as responding to natural and man-made disasters, managing natural resources, and urban planning.

Unlocking High-resolution Earth Imagery

Construction and classes of the WorldStrat dataset.
Credit: arXiv (2022). DOI: 10.48550/arxiv.2207.06418

WorldStrat development began in 2021, and it has been downloaded over 3,000 times since its release in June 2022.

“The combination of high-resolution commercial imagery and machine learning has huge potential to enable planetwide analyses, which could help to tackle all kinds of global challenges. The problem is that commercial data are often locked behind a paywall,” said project lead Dr. Julien Cornebise, from University College London’s Computer Science department.

The project was made possible by the Third Party Missions (TPM) program of the European Space Agency, which made data that would have otherwise been very expensive available for free. TPM agreements now in force cover over 60 instruments on more than 50 missions.

SPOT 6 and SPOT 7 Satellites

The team used data from the ESA-commissioned Airbus SPOT 6 and SPOT 7 satellites, which were launched in 2012 and 2014, respectively. Their imagery reaches a resolution of up to 1.5 m per pixel, meaning that each pixel corresponds to a 1.5 m by 1.5 m area on the ground.
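To make the 1.5 m per pixel figure concrete, the short sketch below converts a pixel size on the ground into the area one pixel covers and the number of pixels needed to tile one square kilometre. The function and constant names are hypothetical and chosen only for illustration; the calculation assumes perfectly square pixels.

```python
def pixel_area_m2(pixel_size_m: float) -> float:
    """Ground area covered by one square pixel, in square metres."""
    return pixel_size_m * pixel_size_m


def pixels_per_km2(pixel_size_m: float) -> float:
    """Number of pixels needed to tile one square kilometre."""
    square_metres_per_km2 = 1_000_000
    return square_metres_per_km2 / pixel_area_m2(pixel_size_m)


SPOT_PIXEL_SIZE_M = 1.5  # resolution quoted in the article

print(pixel_area_m2(SPOT_PIXEL_SIZE_M))          # 2.25
print(round(pixels_per_km2(SPOT_PIXEL_SIZE_M)))  # 444444
```

In other words, a single square kilometre at this resolution takes roughly 444,000 pixels to cover.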

Around 4,000 highly detailed images from the SPOT satellites were used by the scientists.

Although these images have a high spatial resolution, they have a low temporal resolution: the satellites do not revisit and recapture each site on a regular basis. This is because the imagery was acquired for specific commercial applications rather than for long-term analyses.

To counteract this, the team also used freely available, lower-resolution Copernicus Sentinel-2 satellite images. These have a higher temporal resolution, meaning they are captured at regular intervals, every five days. The team matched each SPOT image with 16 Copernicus Sentinel-2 images, totalling around 64,000.
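The pairing described above can be pictured as one high-resolution image linked to a fixed number of low-resolution revisits. The sketch below is only an illustration of that idea and of the arithmetic behind the 64,000 figure. The class name, the ID format, and the helper functions are hypothetical and do not reflect the dataset's actual file layout, and the count of exactly 4,000 SPOT images is a rounding of the article's "around 4,000".

```python
from dataclasses import dataclass, field
from typing import List

REVISITS_PER_SITE = 16  # Sentinel-2 images matched to each SPOT image (from the article)


@dataclass
class PairedSample:
    """One high-resolution SPOT image with its low-resolution revisits."""
    spot_image_id: str
    sentinel2_image_ids: List[str] = field(default_factory=list)


def build_pairs(spot_ids: List[str]) -> List[PairedSample]:
    """Attach placeholder Sentinel-2 revisit IDs to each SPOT image ID."""
    return [
        PairedSample(sid, [f"{sid}_s2_{i:02d}" for i in range(REVISITS_PER_SITE)])
        for sid in spot_ids
    ]


def total_low_res_images(pairs: List[PairedSample]) -> int:
    """Count every Sentinel-2 image across all pairs."""
    return sum(len(p.sentinel2_image_ids) for p in pairs)


# Hypothetical IDs standing in for roughly 4,000 SPOT images.
pairs = build_pairs([f"spot_{n:04d}" for n in range(4000)])
print(total_low_res_images(pairs))  # 64000
```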

Machine Learning Toolbox

The researchers created the dataset to support the development of machine learning applications that extend and improve it, for example by enhancing image resolution. To encourage further applications, they also released an artificial intelligence toolbox along with the full source code, so that developers can reproduce, extend, and adapt the work.
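To give a feel for the resolution-enhancement task, here is a deliberately naive baseline: average several low-resolution views of the same site, then enlarge the result by repeating pixels. This is not the researchers' toolbox or method, and it is not a learned model. It is a minimal sketch that assumes the frames are already aligned and the same size, with hypothetical function names and toy pixel values.

```python
from typing import List

Image = List[List[float]]  # rows of pixel intensities


def temporal_mean(frames: List[Image]) -> Image:
    """Average aligned, equally sized frames pixel by pixel."""
    if not frames:
        raise ValueError("need at least one frame")
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[r][c] for f in frames) / len(frames) for c in range(cols)]
        for r in range(rows)
    ]


def upscale_nearest(image: Image, factor: int) -> Image:
    """Enlarge an image by repeating each pixel factor x factor times."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    rows, cols = len(image), len(image[0])
    return [
        [image[r // factor][c // factor] for c in range(cols * factor)]
        for r in range(rows * factor)
    ]


# Two toy 2x2 "revisits" of the same site.
frames = [
    [[0.0, 2.0], [4.0, 6.0]],
    [[2.0, 4.0], [6.0, 8.0]],
]
fused = temporal_mean(frames)         # [[1.0, 3.0], [5.0, 7.0]]
upscaled = upscale_nearest(fused, 2)  # 4x4 image with each pixel repeated
print(upscaled)
```

A baseline like this adds no real detail. It only shows the shape of the problem that a machine learning model would try to solve better, by learning from pairs of low-resolution and high-resolution images.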

“Thousands of data users from around the world have already downloaded WorldStrat, and we look forward to seeing the ways in which they extend and improve it, using machine learning techniques,” said Dr. Cornebise.

References:

Julien Cornebise, Ivan Oršolić, & Freddie Kalaitzis. Open High-Resolution Satellite Imagery: The WorldStrat Dataset — With Application to Super-Resolution. arXiv (2022) doi: 10.48550/ARXIV.2207.06418

Julien Cornebise, Ivan Oršolić, & Freddie Kalaitzis. (2022). The WorldStrat Dataset: Open High-Resolution Satellite Imagery With Paired Multi-Temporal Low-Resolution [Data set]. Zenodo. 6810792