GSA GtCoP November 2024: Designing and maintaining STAC infrastructure for large‐scale applications
2024-11-19
The goal is for all providers of spatiotemporal assets (Imagery, SAR, Point Clouds, Data Cubes, Full Motion Video, etc) to expose their data as SpatioTemporal Asset Catalogs (STAC), so that new code doesn’t need to be written whenever a new data set or API is released.


{
"collections": [
{
"id": "sentinel1-grd",
"type": "Collection",
"links": [
{
"rel": "items",
"type": "application/geo+json",
"href": "https://dev.asdi-catalog.org/collections/sentinel1-grd/items"
},
{
"rel": "parent",
"type": "application/json",
"href": "https://dev.asdi-catalog.org/"
}
],
"title": "Sentinel-1 GRD",
"extent": {
"spatial": {
"bbox": [
[-180, -90, 180, 90]
]
},
"temporal": {
"interval": [
[
"2014-10-10T00:00:00Z",
null
]
]
}
}
}
]
}
pgstac origin storyHow can we optimize for lightning fast queries?
pgstac was born 
pgstacPostgreSQL schema and functions for Spatio-Temporal Asset Catalog (STAC)
strengths
weaknesses
titiler-pgstacpgstac database

The stactools-packages github organization is a home for datasets and tool packages using the SpatioTemporal Asset Catalog (STAC) specification.
href -> STAC itemeoAPIState-of-the-art services to enable data discovery, search, visualization, and access.
pystac_client for API operations and odc_stac for file i/ocrs = pyproj.CRS.from_string("epsg:5070")
# bounding box that surrounds the North Shore of Lake Superior
bbox = (373926, 2744693, 406338, 2765304)
aoi = box(*bbox)
# STAC items **always** store bounding box info in epsg:4326
transformer_4326 = pyproj.Transformer.from_crs(
crs_from=crs,
crs_to="epsg:4326",
always_xy=True,
)
bbox_4326 = transform(transformer_4326.transform, aoi).bounds
print(bbox_4326)(-91.05816506751593, 47.592675062538454, -90.61526491213358, 47.79503898873347)
client = pystac_client.Client.open(
"https://dev.asdi-catalog.org/"
)
collection_search = client.collection_search(q="DEM", bbox=bbox_4326)
search_results = {
collection.id: collection
for collection in collection_search.collections()
}
for collection_id, collection in search_results.items():
print(f"{collection_id}: {collection.description}\n")cop-dem-glo-30: The Copernicus DEM is a Digital Surface Model (DSM) which represents the surface of the Earth including buildings, infrastructure and vegetation. We provide two instances of Copernicus DEM named GLO-30 Public and GLO-90. GLO-90 provides worldwide coverage at 90 meters. GLO-30 Public provides limited worldwide coverage at 30 meters because a small subset of tiles covering specific countries are not yet released to the public by the Copernicus Programme. Note that in both cases ocean areas do not have tiles, there one can assume height values equal to zero. Data is provided as Cloud Optimized GeoTIFFs and comes from Copernicus DEM 2021 release.
odc.stac.loadxarray.Dataset with odc.stac.load<xarray.Dataset> Size: 3MB
Dimensions: (y: 688, x: 1081, time: 1)
Coordinates:
* y (y) float64 6kB 2.765e+06 2.765e+06 ... 2.745e+06 2.745e+06
* x (x) float64 9kB 3.739e+05 3.74e+05 ... 4.063e+05 4.063e+05
spatial_ref int32 4B 5070
* time (time) datetime64[ns] 8B 2021-04-22
Data variables:
data (time, y, x) float32 3MB dask.array<chunksize=(1, 688, 1081), meta=np.ndarray>
xarray APIxarray API for visualization and analysisclient = pystac_client.Client.open(
"https://services.terrascope.be/stac"
)
# sometimes necessary before running a collection search
client.add_conforms_to("COLLECTIONS")
collection_search = client.collection_search(
q="WorldCover", bbox=bbox_4326
)
stac_items = client.search(
collections=next(collection_search.collections()).id,
bbox=bbox_4326,
).item_collection()<xarray.Dataset> Size: 758kB
Dimensions: (y: 688, x: 1081, time: 1)
Coordinates:
* y (y) float64 6kB 2.765e+06 2.765e+06 ... 2.745e+06
* x (x) float64 9kB 3.739e+05 3.74e+05 ... 4.063e+05
spatial_ref int32 4B 5070
* time (time) datetime64[ns] 8B 2020-12-31T23:59:59
Data variables:
ESA_WORLDCOVER_10M_MAP (time, y, x) uint8 744kB dask.array<chunksize=(1, 688, 1081), meta=np.ndarray>
xarray.Dataset<xarray.Dataset> Size: 4MB
Dimensions: (y: 688, x: 1081)
Coordinates:
* y (y) float64 6kB 2.765e+06 2.765e+06 ... 2.745e+06 2.745e+06
* x (x) float64 9kB 3.739e+05 3.74e+05 ... 4.063e+05 4.063e+05
spatial_ref int32 4B 5070
time datetime64[ns] 8B 2020-12-31T23:59:59
Data variables:
land_cover (y, x) uint8 744kB dask.array<chunksize=(688, 1081), meta=np.ndarray>
elevation_m (y, x) float32 3MB dask.array<chunksize=(688, 1081), meta=np.ndarray>
