Source Discovery

lazymerge.scan_store

scan_store(root: Group) -> ScanIndex

lazymerge.ScanIndex

lazymerge.SourceEntry dataclass

lazymerge.query_datafusion_sources

query_datafusion_sources(
    store: Any,
    bbox_4326: tuple[float, float, float, float],
    sortby: str | None = None,
    sql_filter: str | None = None,
) -> list[SourceEntry]

Query the /meta group via DataFusion for sources intersecting bbox_4326.

Issues a spatial SQL query against the /meta columnar arrays. Only sources whose bbox (stored in EPSG:4326) intersects the query bbox are returned.

When the DataFusion schema includes the transform_0..5 and shape_x/y columns, spatial metadata is read directly from the query results. Otherwise, the function falls back to reading the zarr group conventions of each matched source.
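The intersection test and the way sql_filter is combined with it can be pictured with a small sketch. This is not the library's actual query: the column names (bbox_xmin, bbox_ymin, bbox_xmax, bbox_ymax) and both helper functions are hypothetical, and treating boxes that merely touch as intersecting is an assumption.

```python
def bboxes_intersect(a, b):
    """Return True when two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return a[0] <= b[2] and a[2] >= b[0] and a[1] <= b[3] and a[3] >= b[1]


def build_where_clause(bbox_4326, sql_filter=None):
    """Build a SQL WHERE body equivalent to bboxes_intersect.

    Column names are hypothetical; the real /meta schema may differ.
    """
    xmin, ymin, xmax, ymax = bbox_4326
    clause = (
        f"bbox_xmin <= {xmax} AND bbox_xmax >= {xmin} "
        f"AND bbox_ymin <= {ymax} AND bbox_ymax >= {ymin}"
    )
    if sql_filter:
        # Parenthesise the user filter so an OR inside it cannot
        # escape the spatial condition.
        clause += f" AND ({sql_filter})"
    return clause
```

Wrapping the user-supplied expression in parentheses is a design choice in this sketch, not something the source states.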

Args:
    store: An obstore-compatible object store (e.g. LocalStore, S3Store), an Icechunk Session, or a zarr Store backed by Icechunk.
    bbox_4326: Query bounding box in EPSG:4326 (xmin, ymin, xmax, ymax).
    sortby: Optional column name to sort results by (e.g. a datetime field). Controls the order in which sources are composited: earlier entries take priority for filling NaN pixels.
    sql_filter: Optional SQL expression appended as an AND clause to the spatial intersection query. For example: '"eo:cloud_cover" < 20' or '"datetime" > \'2024-01-01\''.

Returns:
    List of SourceEntry objects for matching sources. chunk_shape is set to a placeholder (0, 0) because the real value is resolved from the actual array.
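To illustrate why the order produced by sortby matters, the sketch below composites a few one-dimensional "rasters" so that earlier layers win and later layers only fill pixels that are still NaN. It is a simplified illustration in plain Python, not lazymerge's compositing code; the function name composite_first_valid is hypothetical.

```python
import math


def composite_first_valid(layers):
    """Fill each pixel from the first layer (in list order) that is not NaN."""
    out = [math.nan] * len(layers[0])
    for layer in layers:
        for i, value in enumerate(layer):
            # Only fill pixels that no earlier layer has supplied.
            if math.isnan(out[i]) and not math.isnan(value):
                out[i] = value
    return out


nan = math.nan
first = [1.0, nan, nan]   # highest priority (sorted first)
second = [9.0, 2.0, nan]  # only used where `first` is NaN
result = composite_first_valid([first, second])
```

Here the first pixel keeps the value from the first layer, the second pixel is filled from the second layer, and the third stays NaN because no layer covers it.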

lazymerge.query_temporal_groups

query_temporal_groups(
    store: Any,
    bbox_4326: tuple[float, float, float, float],
    grouper: Any,
    sql_filter: str | None = None,
) -> list[str]

Query distinct datetime values from /meta and bucket them into temporal groups.

Args:
    store: An obstore-compatible object store, Icechunk Session, or zarr Store.
    bbox_4326: Query bounding box in EPSG:4326 (xmin, ymin, xmax, ymax).
    grouper: A TemporalGrouper instance used to bucket datetime strings.
    sql_filter: Optional SQL expression appended as an AND clause.

Returns:
    Sorted list of unique group keys.
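The bucketing step can be sketched as follows. The TemporalGrouper interface is not shown in this reference, so the example substitutes a plain function, month_key, as a hypothetical grouper that maps an ISO 8601 datetime string to a 'YYYY-MM' key; temporal_group_keys is likewise an illustrative name.

```python
from datetime import datetime


def month_key(dt_string):
    """Hypothetical grouper: map an ISO 8601 datetime string to 'YYYY-MM'."""
    return datetime.fromisoformat(dt_string).strftime("%Y-%m")


def temporal_group_keys(datetimes, key_fn):
    """Bucket datetime strings with key_fn and return sorted unique keys."""
    return sorted({key_fn(d) for d in datetimes})


keys = temporal_group_keys(
    ["2024-02-03T00:00:00", "2024-01-15T10:30:00", "2024-01-20T08:00:00"],
    month_key,
)
```

With these three inputs, the two January datetimes collapse into a single key, so keys holds one entry per month.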

lazymerge.select_overview

select_overview(
    overviews: list[OverviewLevel], target_res: float, native_res: float
) -> OverviewLevel | None

Choose the coarsest overview whose pixel size is <= target_res.

Picks the coarsest source data that avoids upsampling: the selected overview's pixel size is no larger than the output pixel size, so each output pixel samples at least as much original detail as it represents.

Args:
    overviews: Overview levels ordered finest to coarsest.
    target_res: Target pixel size in the source's native CRS units.
    native_res: Full-resolution pixel size.

Returns:
    An OverviewLevel, or None to use full resolution.
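A minimal sketch of the selection rule described above. The OverviewLevel fields are not documented here, so the dataclass below is a hypothetical stand-in: it assumes each level stores a scale factor and that its pixel size is native_res * scale. The function name select_overview_sketch is also illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverviewLevel:
    # Hypothetical stand-in; the real lazymerge.OverviewLevel may differ.
    path: str
    scale: float  # pixel size as a multiple of native_res (assumption)


def select_overview_sketch(
    overviews: list[OverviewLevel], target_res: float, native_res: float
) -> Optional[OverviewLevel]:
    """Return the coarsest level with pixel size <= target_res, else None."""
    chosen = None
    for level in overviews:  # ordered finest to coarsest
        if native_res * level.scale <= target_res:
            chosen = level  # keep going: a coarser level may also qualify
        else:
            break  # every later level is coarser still
    return chosen


levels = [OverviewLevel("0", 2), OverviewLevel("1", 4), OverviewLevel("2", 8)]
```

With native_res = 10, these levels have pixel sizes 20, 40, and 80. A target of 50 selects level "1" (40 <= 50 < 80), while a target of 15 returns None because even the finest overview is coarser than the target.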