ingest_stac_search
zarr_datafusion_search.ingest_stac_search
ingest_stac_search(url: str, *, store: Any | None = None, session: Any | None = None, intersects: str | dict | None = None, ids: str | list[str] | None = None, collections: str | list[str] | None = None, max_items: int | None = None, limit: int | None = None, bbox: list[float] | None = None, datetime: str | None = None, include: str | list[str] | None = None, exclude: str | list[str] | None = None, sortby: str | list[str] | None = None, filter: str | dict | None = None, query: dict | None = None, chunk_size: int = 1000, asset_hrefs: list[str] | None = None) -> Awaitable[int]
Ingest STAC API search results into a Zarr store.
Queries a STAC API, converts matching items to Arrow, and writes them as 1-D Zarr arrays under the /meta group. Supports both Zarr group-backed stores and Icechunk sessions.
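A minimal usage sketch of the call described above, using only parameters from the signature. The collection ID, bounding box, item cap, and the wrapper name `ingest_example` are assumed example values rather than anything this reference prescribes; `store` is expected to be an obstore store pointing at the root of the Zarr store.

```python
from typing import Any

# Assumed example values; only the URL appears in the reference itself.
STAC_URL = "https://earth-search.aws.element84.com/v1"
SEARCH = {
    "collections": ["sentinel-2-l2a"],      # assumed collection ID
    "bbox": [-105.3, 39.9, -105.1, 40.1],   # [west, south, east, north]
    "datetime": "2024-01-01/2024-06-01",    # "/"-separated range
    "max_items": 500,                       # cap on ingested items
    "asset_hrefs": ["thumbnail"],           # written as /meta/asset_thumbnail
}


async def ingest_example(store: Any) -> int:
    """Ingest matching items into `store`; returns the number of rows written."""
    # Imported here so this module loads even where the package is not installed.
    from zarr_datafusion_search import ingest_stac_search

    return await ingest_stac_search(STAC_URL, store=store, **SEARCH)
```

From synchronous code this could be driven with `asyncio.run(ingest_example(store))`.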
Parameters:
- url (str) – Base URL of the STAC API (e.g. "https://earth-search.aws.element84.com/v1").
- store (Any | None, default: None) – An obstore object store (e.g. obstore.store.LocalStore, obstore.store.S3Store) pointing at the root of the Zarr store. Mutually exclusive with session.
- session (Any | None, default: None) – An Icechunk writable session to write into. Mutually exclusive with store.
- intersects (str | dict | None, default: None) – GeoJSON geometry (as a string or dict) to filter items by spatial intersection.
- ids (str | list[str] | None, default: None) – One or more STAC item IDs to fetch.
- collections (str | list[str] | None, default: None) – One or more collection IDs to search within.
- max_items (int | None, default: None) – Maximum number of items to ingest. When None, all matching items are fetched.
- limit (int | None, default: None) – Page size for the STAC API search request.
- bbox (list[float] | None, default: None) – Bounding box filter as [west, south, east, north].
- datetime (str | None, default: None) – Datetime filter as a single datetime or a /-separated range (e.g. "2024-01-01/2024-06-01").
- include (str | list[str] | None, default: None) – Fields to include in the response (STAC API Fields extension).
- exclude (str | list[str] | None, default: None) – Fields to exclude from the response (STAC API Fields extension).
- sortby (str | list[str] | None, default: None) – Sort order (STAC API Sort extension), e.g. "+datetime" or "-eo:cloud_cover".
- filter (str | dict | None, default: None) – CQL2 filter as a text string or a CQL2-JSON dict (STAC API Filter extension).
- query (dict | None, default: None) – Legacy STAC API query parameters.
- chunk_size (int, default: 1000) – Number of rows per Zarr chunk for newly created arrays. Ignored when appending to an existing store.
- asset_hrefs (list[str] | None, default: None) – Asset keys (e.g. ["B01", "thumbnail"]) whose href values should be extracted and written as /meta/asset_{key} string arrays.
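Several of the parameters above carry format rules that can be checked before a request is sent. The helpers below are hypothetical and not part of zarr_datafusion_search: they sketch one way to build the datetime range string, sanity-check a 2-D bounding box, reject passing both store and session, and predict the array paths produced by asset_hrefs. How the library itself reacts to invalid input is not documented here.

```python
from datetime import date
from typing import Any


def datetime_range(start: date, end: date) -> str:
    """Format two dates as the "/"-separated range used by the datetime filter."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return f"{start.isoformat()}/{end.isoformat()}"


def check_bbox(bbox: list[float]) -> list[float]:
    """Check a 2-D [west, south, east, north] box (the only form this sketch handles).

    west > east is allowed, since a box may cross the antimeridian.
    """
    if len(bbox) != 4:
        raise ValueError("expected [west, south, east, north]")
    _west, south, _east, north = bbox
    if south > north:
        raise ValueError("south must not exceed north")
    return bbox


def build_kwargs(*, store: Any = None, session: Any = None, **search: Any) -> dict[str, Any]:
    """Assemble keyword arguments, refusing store and session together."""
    if store is not None and session is not None:
        raise ValueError("store and session are mutually exclusive")
    # Drop unset options so the function's own defaults apply.
    kwargs = {key: value for key, value in search.items() if value is not None}
    if "bbox" in kwargs:
        check_bbox(kwargs["bbox"])
    if store is not None:
        kwargs["store"] = store
    if session is not None:
        kwargs["session"] = session
    return kwargs


def asset_array_paths(asset_hrefs: list[str]) -> list[str]:
    """Array paths that asset_hrefs is documented to produce."""
    return [f"/meta/asset_{key}" for key in asset_hrefs]
```

The result can then be splatted into the call, for example `await ingest_stac_search(url, **build_kwargs(store=store, datetime=datetime_range(start, end)))`.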
Returns:
- Awaitable[int] – An awaitable that resolves to the number of rows written.