Day 2 – Cloud-Native EO Data

Cloud technology is driving the future of EO and Earth System Data. Geospatial data continues to grow in size and significance, which makes managing and analyzing it effectively increasingly resource-intensive.

Cloud-native geospatial formats are designed for seamless, on-demand access—enabling users to search, filter, and process data directly in the cloud without needing to download it locally. This shift is powered by a thriving ecosystem of open-source tools that support scalable, remote analysis of geospatial information.
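On-demand access of this kind usually rests on HTTP range requests: a client asks for a specific span of bytes instead of the whole file. The following is a minimal sketch that only builds such a request without sending it. The helper name `build_range_request` and the URL are hypothetical and used for illustration only.

```python
from urllib.request import Request


def build_range_request(url: str, start: int, length: int) -> Request:
    """Build (but do not send) a GET request for `length` bytes starting at byte `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    end = start + length - 1  # HTTP byte ranges are inclusive on both ends
    return Request(url, headers={"Range": f"bytes={start}-{end}"})


# Hypothetical object URL; replace it with a real, publicly readable object.
request = build_range_request("https://example.org/data/scene.tif", start=0, length=16384)
print(request.get_header("Range"))  # bytes=0-16383
```

Passing the request to `urllib.request.urlopen` would fetch only those bytes, provided the server honours the `Range` header (it then answers with status 206, Partial Content).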

Cloud Object Storage

In a cloud-native approach, geospatial data is best hosted in object storage systems that are accessible over the web—ideally through publicly accessible URLs. This model promotes open, scalable, and resilient data sharing. Major commercial providers of cloud object storage include:

Beyond the commercial space, many institutions and organizations also deploy private cloud infrastructure using S3-compatible storage solutions such as:

These systems allow users to implement cloud-native storage principles within local or consortium-based environments, combining the flexibility of object storage with data sovereignty and control.
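To illustrate what S3-compatible means for the URL of an object, the sketch below builds the two common addressing styles: path-style, often used with self-hosted S3-compatible endpoints, and the virtual-hosted style used by AWS S3. The endpoint, bucket, region, and key are hypothetical, and both helper functions are illustrative names rather than part of any library.

```python
from urllib.parse import quote


def path_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Path-style addressing: <endpoint>/<bucket>/<key>."""
    return f"{endpoint.rstrip('/')}/{bucket}/{quote(key)}"


def virtual_hosted_url(bucket: str, region: str, key: str) -> str:
    """Virtual-hosted-style addressing as used by AWS S3."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


# Hypothetical values, for illustration only.
key = "sentinel-2/2024/tile 01.tif"
print(path_style_url("https://s3.example.org/", "eo-data", key))
print(virtual_hosted_url("eo-data", "eu-central-1", key))
```

`quote` percent-encodes characters such as spaces while leaving the `/` separators in the key intact, so the same object key maps to a valid URL under either style.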

Leveraging existing cloud storage infrastructure:

Reduces the burden on data providers to build and maintain their own hosting environments or custom APIs.
Enables them to focus on curating high-quality datasets while ensuring those datasets are reliably accessible and easily shared.
Helps mitigate the risks of hardware failures and data loss, offering a durable and cost-effective alternative to traditional local storage.

Traditional vs. Cloud-Native Geospatial Formats

Category	Traditional Geospatial Formats	Cloud-Native Geospatial Formats
Raster Data	GeoTIFF, IMG, HDF, NetCDF	Cloud Optimized GeoTIFF (COG), Zarr
Vector Data	Shapefile, GeoJSON, KML	FlatGeobuf, GeoParquet, GeoArrow
Storage Requirements	Typically stored locally	Optimized for cloud object storage (e.g., S3)
Access Pattern	Requires full download before access	Supports partial reads and streaming access
Metadata Handling	Embedded or separate sidecar files	Designed for embedded metadata and efficient discovery
Performance	Slower in distributed or remote environments	Tuned for high-performance access in cloud workflows
Compatibility	Widely supported in legacy desktop tools	Increasing support in modern cloud-based tools
Scalability	Limited by local machine capabilities	Scalable with cloud compute and distributed frameworks
Examples of Usage	QGIS, ArcGIS Desktop	STAC, Dask, Xarray, Rasterio, GeoPandas (cloud configs)
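The "partial reads" row can be made concrete with a little arithmetic. Chunked formats such as COG (internal tiles) and Zarr (chunks) split an array into fixed-size blocks, so a reader only needs the blocks that overlap the requested window. The sketch below is a simplified 2D model of that idea, not the reading logic of any particular library. The array size, chunk size, and window are assumed values chosen for illustration.

```python
from math import ceil


def chunks_for_window(shape, chunk_shape, window):
    """Return (row, col) indices of the chunks overlapping a pixel window.

    `window` is ((row_start, row_stop), (col_start, col_stop)) with exclusive stops.
    """
    (r0, r1), (c0, c1) = window
    rows, cols = shape
    chunk_rows, chunk_cols = chunk_shape
    if not (0 <= r0 < r1 <= rows and 0 <= c0 < c1 <= cols):
        raise ValueError("window must lie inside the array and be non-empty")
    # The last pixel of the window is at stop - 1, hence the "- 1" before dividing.
    row_ids = range(r0 // chunk_rows, (r1 - 1) // chunk_rows + 1)
    col_ids = range(c0 // chunk_cols, (c1 - 1) // chunk_cols + 1)
    return [(i, j) for i in row_ids for j in col_ids]


# Assumed example: a 10980 x 10980 raster stored in 512 x 512 chunks.
shape = (10980, 10980)
chunk_shape = (512, 512)
window = ((1000, 1500), (2000, 2600))

needed = chunks_for_window(shape, chunk_shape, window)
total = ceil(shape[0] / chunk_shape[0]) * ceil(shape[1] / chunk_shape[1])
print(f"{len(needed)} of {total} chunks")  # 6 of 484 chunks
```

In this example, reading the window touches 6 of 484 chunks, whereas a format without internal chunking would generally require fetching the whole file first.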
