open

lazycogs.open

open(
    href: str,
    *,
    datetime: str | None = None,
    bbox: tuple[float, float, float, float],
    crs: str | CRS,
    resolution: float,
    filter: str | dict[str, Any] | None = None,
    ids: list[str] | None = None,
    bands: list[str] | None = None,
    chunks: dict[str, int] | None = None,
    sortby: str | list[str | dict[str, str]] | None = None,
    nodata: float | None = None,
    dtype: str | dtype | None = None,
    mosaic_method: type[MosaicMethodBase] | None = None,
    time_period: str = "P1D",
    store: Store | None = None,
    max_concurrent_reads: int = 32,
    path_from_href: Callable[[str], str] | None = None,
    duckdb_client: DuckdbClient | None = None,
) -> DataArray

Open a mosaic of STAC items as a lazy (band, time, y, x) DataArray.

`href` must be a path to a GeoParquet file (`.parquet` or `.geoparquet`) or, when a `duckdb_client` configured with `use_hive_partitioning=True` is provided, a path to a hive-partitioned Parquet directory.
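A minimal call might look like the sketch below. The file name `items.parquet`, the bounding box, the CRS, and the band names are placeholder values chosen for illustration, not values from any real dataset. The `lazycogs` import is deferred into the helper so the snippet can be loaded without the package installed.

```python
from typing import Any


def build_open_kwargs() -> dict[str, Any]:
    """Collect keyword arguments for lazycogs.open (placeholder values)."""
    return {
        # Required: bbox is (minx, miny, maxx, maxy) in the target CRS.
        "bbox": (500_000.0, 5_200_000.0, 510_000.0, 5_210_000.0),
        "crs": "EPSG:32632",
        "resolution": 10.0,  # pixel size in CRS units (metres for this CRS)
        # Optional pre-filters applied to the items in the parquet file.
        "datetime": "2023-01-01/2023-12-31",
        "filter": "eo:cloud_cover < 20",
        "bands": ["red", "nir"],  # hypothetical asset keys
    }


def open_mosaic(href: str = "items.parquet"):
    """Open the mosaic lazily; requires lazycogs to be installed."""
    import lazycogs  # deferred so this snippet loads without the package

    return lazycogs.open(href, **build_open_kwargs())
```

Calling `open_mosaic()` returns the lazy `(band, time, y, x)` DataArray described above. Because `chunks` is left at `None`, pixels are only fetched when a selection is actually read.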

Parameters:

Name	Type	Description	Default
`href`	`str`	Path to a GeoParquet file (`.parquet` or `.geoparquet`) or, when a `duckdb_client` created with `use_hive_partitioning=True` is provided, to a hive-partitioned Parquet directory.	required
`datetime`	`str \| None`	RFC 3339 datetime or range (e.g. `"2023-01-01/2023-12-31"`) used to pre-filter items from the parquet.	`None`
`bbox`	`tuple[float, float, float, float]`	`(minx, miny, maxx, maxy)` in the target `crs`.	required
`crs`	`str \| CRS`	Target output CRS.	required
`resolution`	`float`	Output pixel size in `crs` units.	required
`filter`	`str \| dict[str, Any] \| None`	CQL2 filter expression (text string or JSON dict) forwarded to DuckDB queries, e.g. `"eo:cloud_cover < 20"`.	`None`
`ids`	`list[str] \| None`	STAC item IDs to restrict the search to.	`None`
`bands`	`list[str] \| None`	Asset keys to include. If `None`, auto-detected from the first matching item.	`None`
`chunks`	`dict[str, int] \| None`	Chunk sizes passed to `DataArray.chunk()`. If `None` (default), returns a `LazilyIndexedArray`-backed DataArray where only the requested pixels are fetched on each access — ideal for point or small-region queries. Pass an explicit dict to convert to a dask-backed array for parallel computation over larger regions.	`None`
`sortby`	`str \| list[str \| dict[str, str]] \| None`	Sort keys forwarded to DuckDB queries.	`None`
`nodata`	`float \| None`	No-data fill value for output arrays.	`None`
`dtype`	`str \| dtype \| None`	Output array dtype. Defaults to `float32`.	`None`
`mosaic_method`	`type[MosaicMethodBase] \| None`	Mosaic method class (not instance) to use. Defaults to `lazycogs._mosaic_methods.FirstMethod`.	`None`
`time_period`	`str`	ISO 8601 duration string controlling how items are grouped into time steps. Supported forms: `PnD` (days), `P1W` (ISO calendar week), `P1M` (calendar month), `P1Y` (calendar year). Defaults to `"P1D"` (one step per calendar day), which preserves the previous behaviour. Multi-day windows such as `"P16D"` are aligned to an epoch of 2000-01-01.	`'P1D'`
`store`	`Store \| None`	Pre-configured `async_geotiff.Store` accepted by `GeoTIFF.open` to use for all asset reads. Useful when credentials, custom endpoints, or non-default options are needed without relying on automatic store resolution from each HREF. When `None` (default), each asset URL is parsed to create or reuse a per-thread cached obstore-backed store.	`None`
`max_concurrent_reads`	`int`	Maximum number of COG reads to run concurrently per chunk. Items are processed in batches of this size, which bounds peak in-flight memory when a chunk overlaps many files. Methods that support early exit (e.g. the default `lazycogs._mosaic_methods.FirstMethod`) will stop reading once every output pixel is filled, so lower values also reduce unnecessary I/O on dense datasets. Defaults to 32.	`32`
`path_from_href`	`Callable[[str], str] \| None`	Optional callable `(href: str) -> str` that extracts the object path from an asset HREF. When provided, it replaces the default `urlparse`-based extraction used in `lazycogs._store.resolve`. Most useful when combined with a custom `store` whose root does not align with the URL path structure of the asset HREFs. Example (NASA LPDAAC HTTPS proxy URL for an S3 asset): with `store = S3Store(bucket="lp-prod-protected", ...)` from `obstore.store`, an HREF such as `https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/path/to/file.tif` must map to `path/to/file.tif`, because the store is rooted at the bucket. A callable `strip_bucket` that returns `urlparse(href).path.lstrip("/").removeprefix("lp-prod-protected/")` does this and is passed as `lazycogs.open("items.parquet", ..., store=store, path_from_href=strip_bucket)`.	`None`
`duckdb_client`	`DuckdbClient \| None`	Optional `DuckdbClient` instance. When `None` (default), a plain `DuckdbClient()` is created, which is equivalent to the previous `rustac.search_sync` behaviour. Pass a custom client to enable features such as hive-partitioned datasets, for example `client = rustac.DuckdbClient(use_hive_partitioning=True)` followed by `da = lazycogs.open("s3://bucket/stac/", duckdb_client=client, bbox=..., crs=..., resolution=...)`.	`None`
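The `path_from_href` callable from the table can be written with the standard library alone. The sketch below uses the bucket name and URL shape from the `path_from_href` description; the constant name `BUCKET` is introduced here for illustration only.

```python
from urllib.parse import urlparse

BUCKET = "lp-prod-protected"  # bucket the custom store is rooted at


def strip_bucket(href: str) -> str:
    """Return the object key relative to a store rooted at BUCKET."""
    # For the proxy URL below, urlparse(href).path is
    # "/lp-prod-protected/path/to/file.tif".
    path = urlparse(href).path.lstrip("/")
    # Drop the leading bucket segment so only the key inside the bucket remains.
    return path.removeprefix(f"{BUCKET}/")


href = (
    "https://data.lpdaac.earthdatacloud.nasa.gov/"
    "lp-prod-protected/path/to/file.tif"
)
print(strip_bucket(href))  # path/to/file.tif
```

The function is then passed as `path_from_href=strip_bucket` together with a `store` rooted at the same bucket. `str.removeprefix` requires Python 3.9 or later.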

Returns:

Type	Description
`DataArray`	Lazy `xr.DataArray` with dimensions `(band, time, y, x)`.

Raises:

Type	Description
`ValueError`	If `href` is not a `.parquet` or `.geoparquet` file and no `duckdb_client` is provided, if no matching items are found, or if `time_period` is not a recognised ISO 8601 duration.
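To make the `time_period` rules concrete, the sketch below shows one plausible way to map a date to the start of its time step, including the 2000-01-01 epoch alignment for multi-day windows and a `ValueError` for unsupported strings. It is an illustration of the documented behaviour under stated assumptions, not the library's implementation. The names `window_start` and `EPOCH` are hypothetical, and rejecting `P0D` is an assumption of this sketch.

```python
import re
from datetime import date, timedelta

EPOCH = date(2000, 1, 1)  # alignment epoch documented for multi-day windows
_PND = re.compile(r"P(\d+)D")


def window_start(day: date, time_period: str = "P1D") -> date:
    """Return the first day of the time step that contains `day`."""
    if time_period == "P1W":
        # ISO calendar weeks start on Monday (weekday() == 0).
        return day - timedelta(days=day.weekday())
    if time_period == "P1M":
        return day.replace(day=1)
    if time_period == "P1Y":
        return day.replace(month=1, day=1)
    match = _PND.fullmatch(time_period)
    if match is None or int(match.group(1)) == 0:
        raise ValueError(f"unrecognised time_period: {time_period!r}")
    n = int(match.group(1))
    # Floor division keeps windows aligned to EPOCH, also for earlier dates.
    index = (day - EPOCH).days // n
    return EPOCH + timedelta(days=index * n)


print(window_start(date(2000, 1, 16), "P16D"))  # 2000-01-01
print(window_start(date(2000, 1, 17), "P16D"))  # 2000-01-17
```

With `"P16D"`, the first window covers 2000-01-01 through 2000-01-16, so 2000-01-17 opens the second window regardless of which dates appear in the data.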