Skip to content

Zarr

Zarr-Python is a Python library for reading and writing the Zarr file format for N-dimensional arrays. Zarr-Python is often used in conjunction with Xarray.

Zarr datasets are often very large and thus stored in object storage for cost effectiveness. As of Zarr-Python version 3.0.7 and later, you can use Obstore as a backend for Zarr-Python. For large queries this can be significantly faster than the default fsspec-based backend.

Example

Note

This example is also available on Github if you'd like to test it out locally.

import matplotlib.pyplot as plt
import pystac_client
import xarray as xr
from zarr.storage import ObjectStore

from obstore.auth.planetary_computer import PlanetaryComputerCredentialProvider
from obstore.store import AzureStore

# These first lines are specific to Zarr stored in the Microsoft Planetary
# Computer. We use pystac-client to find the metadata for this specific Zarr
# store.
catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1/",
)
collection = catalog.get_collection("daymet-daily-hi")
asset = collection.assets["zarr-abfs"]

# We construct an AzureStore because this Zarr dataset is stored in Azure
# storage
azure_store = AzureStore(
    credential_provider=PlanetaryComputerCredentialProvider.from_asset(asset),
)

# Next we use the Zarr ObjectStorage adapter and pass it to xarray.
zarr_store = ObjectStore(azure_store, read_only=True)
ds = xr.open_dataset(zarr_store, consolidated=True, engine="zarr")

# And plot with matplotlib
fig, ax = plt.subplots(figsize=(12, 12))
ds.sel(time="2009")["tmax"].mean(dim="time").plot.imshow(ax=ax, cmap="inferno")
fig.savefig("zarr-example.png")

This plots: