
Utilities

lazycogs.align_bbox

align_bbox(
    affine: Affine | Sequence[float], bbox: tuple[float, float, float, float]
) -> tuple[float, float, float, float]

Snap a bounding box to the pixel grid defined by an affine transform.

Expands the bbox outward so that all four edges fall exactly on a grid line. Useful for aligning an AOI to the native grid of a COG collection (e.g. from a STAC item's proj:transform property) before calling lazycogs.open.

Parameters:

Name Type Description Default
affine Affine | Sequence[float]

Affine transform in row-major order, either 6-element (pixel_w, 0, x_origin, 0, pixel_h, y_origin) or 9-element (pixel_w, 0, x_origin, 0, pixel_h, y_origin, 0, 0, 1). Accepts an affine.Affine object or the list stored in a STAC item's proj:transform property.

required
bbox tuple[float, float, float, float]

(minx, miny, maxx, maxy) in the same CRS as the transform.

required

Returns:

Type Description
tuple[float, float, float, float]

(minx, miny, maxx, maxy) snapped to the nearest enclosing grid lines.
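
The outward-snapping rule described above can be sketched in plain Python. This is a minimal illustration, not the lazycogs implementation: snap_bbox_to_grid is a hypothetical helper, and it assumes a north-up transform with no rotation or shear terms.

```python
import math
from typing import Sequence


def snap_bbox_to_grid(
    transform: Sequence[float],
    bbox: tuple[float, float, float, float],
) -> tuple[float, float, float, float]:
    """Expand bbox outward to the grid lines of a north-up affine transform.

    Hypothetical helper for illustration; assumes no rotation or shear.
    """
    # Row-major order: (pixel_w, 0, x_origin, 0, pixel_h, y_origin[, 0, 0, 1]).
    pixel_w, _, x_origin, _, pixel_h, y_origin = transform[:6]
    # pixel_h is usually negative for north-up rasters; only its size matters here.
    step_x, step_y = abs(pixel_w), abs(pixel_h)
    minx, miny, maxx, maxy = bbox

    def snap_down(value: float, origin: float, step: float) -> float:
        return origin + math.floor((value - origin) / step) * step

    def snap_up(value: float, origin: float, step: float) -> float:
        return origin + math.ceil((value - origin) / step) * step

    return (
        snap_down(minx, x_origin, step_x),
        snap_down(miny, y_origin, step_y),
        snap_up(maxx, x_origin, step_x),
        snap_up(maxy, y_origin, step_y),
    )


# A 10-unit grid anchored at (500000, 4000000), as in a typical proj:transform.
transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0)
snapped = snap_bbox_to_grid(transform, (500003.0, 3999871.0, 500127.0, 3999995.0))
print(snapped)  # (500000.0, 3999870.0, 500130.0, 4000000.0)
```

Minimum edges are floored and maximum edges are ceiled relative to the grid origin, so the result always encloses the input bbox.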

lazycogs.store_for

store_for(
    href: str,
    *,
    asset: str | None = None,
    duckdb_client: DuckdbClient | None = None,
    **kwargs: object,
) -> ObjectStore

Construct an ObjectStore by inspecting a geoparquet STAC items file.

Reads one sample item from href, derives the store root URL from a data asset HREF, and constructs an ObjectStore with obstore's own environment-based credential discovery. If the item carries STAC Storage Extension metadata (v1.0.0 or v2.0.0), region and requester_pays are also inferred automatically.

Caller-supplied kwargs override all inferred values; pass skip_signature=True for public buckets that do not require signed requests, or supply explicit credentials.

Parameters:

Name Type Description Default
href str

Path to a geoparquet file or hive-partitioned parquet directory.

required
asset str | None

Asset key to inspect when choosing a representative asset. Defaults to the first data asset (role "data" or media type "image/tiff"), falling back to the first asset in the item.

None
duckdb_client DuckdbClient | None

Optional DuckdbClient instance. When None (default), a plain DuckdbClient() is used. Pass a custom client to query hive-partitioned datasets.

None
**kwargs object

Forwarded to obstore.store.from_url, overriding any inferred values.

{}

Returns:

Type Description
ObjectStore

A freshly constructed ObjectStore (not cached).

Raises:

Type Description
ValueError

If no STAC items are found in href.

KeyError

If asset is specified but not present in the item.
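
The asset-selection rule described for the asset parameter can be sketched as follows. This is an illustration under stated assumptions, not the library's code: pick_asset_key is a hypothetical helper, the item is a STAC item represented as a plain dict, and media types are assumed to match by prefix.

```python
from typing import Any, Optional


def pick_asset_key(item: dict[str, Any], asset: Optional[str] = None) -> str:
    """Choose a representative asset key from a STAC item dict.

    Hypothetical helper for illustration only.
    """
    assets = item["assets"]
    if asset is not None:
        # An explicit key must exist in the item.
        if asset not in assets:
            raise KeyError(asset)
        return asset
    for key, meta in assets.items():
        roles = meta.get("roles") or []
        media_type = meta.get("type") or ""
        # Assumption: prefix matching, so that
        # "image/tiff; application=geotiff; profile=cloud-optimized" qualifies.
        if "data" in roles or media_type.startswith("image/tiff"):
            return key
    # Fall back to the first asset in the item.
    return next(iter(assets))


item = {
    "assets": {
        "thumbnail": {
            "href": "s3://example-bucket/scene/thumb.png",
            "type": "image/png",
            "roles": ["thumbnail"],
        },
        "red": {
            "href": "s3://example-bucket/scene/B04.tif",
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
            "roles": ["data"],
        },
    }
}
print(pick_asset_key(item))  # red
```

The bucket name and asset keys are made up for the example. The HREF of the chosen asset is what the store root URL would then be derived from.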

lazycogs.set_reproject_workers

set_reproject_workers(n: int) -> None

Set the number of threads each chunk's event loop uses for reprojection.

Each chunk read creates a fresh asyncio event loop with its own dedicated ThreadPoolExecutor bounded to n workers. Dask tasks do not compete for a shared pool; each task gets n independent reprojection threads. The total number of reprojection threads at any moment is at most n × dask_worker_count.

Reprojection is memory-bandwidth-bound rather than compute-bound, so values above 4 typically offer no benefit and can hurt throughput due to memory contention. The default is min(os.cpu_count(), 4).

To improve overall throughput, prefer adding time or band parallelism via dask (chunks={"time": 1}) over raising this value.

Parameters:

Name Type Description Default
n int

Number of worker threads per event loop. Must be >= 1.

required

Raises:

Type Description
ValueError

If n is less than 1.
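
The threading model described above can be sketched with the standard library alone. This is not the lazycogs implementation: default_reproject_workers, max_reproject_threads, and read_chunk are hypothetical names, and the reprojection work is replaced by a placeholder.

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor


def default_reproject_workers() -> int:
    # os.cpu_count() can return None; falling back to 1 is an assumption here.
    return min(os.cpu_count() or 1, 4)


def max_reproject_threads(n: int, dask_worker_count: int) -> int:
    """Upper bound on concurrent reprojection threads."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return n * dask_worker_count


def read_chunk(tiles: list[int], n: int) -> list[int]:
    """Stand-in for one chunk read: fresh event loop, dedicated pool of n threads."""

    def reproject(tile: int) -> int:
        return tile * 2  # placeholder for real reprojection work

    async def run() -> list[int]:
        loop = asyncio.get_running_loop()
        # The pool lives only for this chunk, so chunks never share threads.
        with ThreadPoolExecutor(max_workers=n) as pool:
            jobs = [loop.run_in_executor(pool, reproject, t) for t in tiles]
            return list(await asyncio.gather(*jobs))

    # asyncio.run creates a new event loop and closes it when done.
    return asyncio.run(run())


print(max_reproject_threads(4, 8))  # 32
print(read_chunk([1, 2, 3], n=2))  # [2, 4, 6]
```

With 8 dask workers and n = 4, up to 32 reprojection threads can be active at once, which is why raising n multiplies memory pressure quickly.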

lazycogs.ExplainPlan dataclass

Complete dry-run read plan for a lazycogs query.

Attributes:

Name Type Description
href str

Path to the source geoparquet file.

crs str

String representation of the output CRS.

resolution float

Output pixel size in CRS units.

bands list[str]

Ordered list of band names included in the plan.

time_coords list[datetime64]

Time coordinate values for all explained time steps.

dst_width int

Output grid width in pixels (for the current DataArray extent).

dst_height int

Output grid height in pixels (for the current DataArray extent).

chunk_width int

Spatial chunk width in pixels.

chunk_height int

Spatial chunk height in pixels.

chunk_reads list[ChunkRead]

One entry per (band, time step, spatial tile).

fetch_headers bool

Whether COG headers were opened to populate overview and window fields on each CogRead.

empty_chunk_count property

empty_chunk_count: int

Number of chunks with zero matching COG files.

total_chunk_reads property

total_chunk_reads: int

Total number of (band, time, spatial) chunk reads.

total_cog_reads property

total_cog_reads: int

Total number of COG file reads across all chunks.
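
How the three count properties relate to chunk_reads can be shown with a small sketch. It is an illustration only: plan_counts is a hypothetical function, and each chunk is reduced to a list of COG HREF strings rather than full ChunkRead objects.

```python
def plan_counts(cog_reads_per_chunk: list[list[str]]) -> dict[str, int]:
    """Aggregate counts over a list of per-chunk COG read lists."""
    return {
        # One entry per (band, time step, spatial tile).
        "total_chunk_reads": len(cog_reads_per_chunk),
        # Sum of matched COG files across all chunks.
        "total_cog_reads": sum(len(reads) for reads in cog_reads_per_chunk),
        # Chunks for which no COG file matched.
        "empty_chunk_count": sum(1 for reads in cog_reads_per_chunk if not reads),
    }


counts = plan_counts([["a.tif", "b.tif"], [], ["c.tif"]])
print(counts)
# {'total_chunk_reads': 3, 'total_cog_reads': 3, 'empty_chunk_count': 1}
```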

__repr__

__repr__() -> str

Return a compact single-line summary.

summary

summary() -> str

Return a multi-line human-readable summary of the explain plan.

to_dataframe

to_dataframe() -> DataFrame

Return a DataFrame with one row per (chunk x item) combination.

Empty chunks contribute one row with item fields set to None. When fetch_headers=False, the overview and window columns are all None.

Returns:

Type Description
DataFrame

A pandas.DataFrame with columns for chunk metadata, item metadata, and (when available) COG header details.

Raises:

Type Description
ImportError

If pandas is not installed.
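
The row layout described for to_dataframe (one row per chunk and item pair, with a single placeholder row for empty chunks) can be sketched without pandas. plan_rows is a hypothetical helper, and the column names are assumptions borrowed from the documented ChunkRead and CogRead attributes rather than the actual DataFrame schema.

```python
from typing import Any


def plan_rows(chunks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten chunks into one row per (chunk, item) combination."""
    rows = []
    for chunk in chunks:
        chunk_meta = {k: v for k, v in chunk.items() if k != "cog_reads"}
        # An empty chunk still contributes one row, with item fields set to None.
        reads = chunk["cog_reads"] or [{"item_id": None, "href": None}]
        for read in reads:
            rows.append({**chunk_meta, **read})
    return rows


chunks = [
    {
        "band": "red",
        "time_index": 0,
        "cog_reads": [
            {"item_id": "scene-1", "href": "s3://example-bucket/scene-1/B04.tif"},
            {"item_id": "scene-2", "href": "s3://example-bucket/scene-2/B04.tif"},
        ],
    },
    {"band": "red", "time_index": 1, "cog_reads": []},
]
rows = plan_rows(chunks)
print(len(rows))  # 3
```

A list of dicts like this can be passed directly to pandas.DataFrame(rows) when pandas is available.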

lazycogs.ChunkRead dataclass

All reads required for one (band, time step, spatial tile).

Attributes:

Name Type Description
band str

Asset key for this chunk.

time_index int

Index of this time step in the full time axis.

date_filter str

rustac-compatible datetime filter string for this time step.

time_coord datetime64

Coordinate value for this time step.

chunk_row int

Tile row index within the spatial grid (0-indexed).

chunk_col int

Tile column index within the spatial grid (0-indexed).

chunk_affine Affine

Affine transform of the tile (top-left origin).

chunk_width int

Tile width in pixels.

chunk_height int

Tile height in pixels.

cog_reads list[CogRead]

Per-COG read details.

n_cog_reads int

Number of COG files matched (derived from cog_reads).

__post_init__

__post_init__() -> None

Derive n_cog_reads from the cog_reads list.
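
The derived-field pattern can be sketched with a reduced dataclass. ChunkReadSketch is a hypothetical stand-in with only two of the documented attributes, and the use of field(init=False) is an assumption about how such a field might be declared.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkReadSketch:
    """Reduced stand-in for ChunkRead, for illustration only."""

    band: str
    cog_reads: list[str]
    # Not accepted by __init__; filled in by __post_init__ (an assumption).
    n_cog_reads: int = field(init=False)

    def __post_init__(self) -> None:
        # Keep the count consistent with the list it is derived from.
        self.n_cog_reads = len(self.cog_reads)


chunk = ChunkReadSketch(band="red", cog_reads=["a.tif", "b.tif"])
print(chunk.n_cog_reads)  # 2
```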

lazycogs.CogRead dataclass

Read details for one COG file within one chunk.

Attributes:

Name Type Description
item_id str

STAC item ID.

asset_key str

Asset key (band name) that would be read.

href str

Asset HREF.

overview_level int | None

Overview level that would be read. None means full resolution. Only populated when fetch_headers=True.

overview_resolution float | None

Pixel size of the selected level in source CRS units. Only populated when fetch_headers=True.

window_col_off int | None

Column offset of the read window in source pixels. Only populated when fetch_headers=True.

window_row_off int | None

Row offset of the read window in source pixels. Only populated when fetch_headers=True.

window_width int | None

Width of the read window in source pixels. Only populated when fetch_headers=True.

window_height int | None

Height of the read window in source pixels. Only populated when fetch_headers=True.

lazycogs.StacCogAccessor

xarray accessor adding explain functionality to lazycogs DataArrays.

Registered as the stac_cog namespace on all xr.DataArray objects. The explain method is only useful on DataArrays produced by lazycogs.open.

__init__

__init__(da: DataArray) -> None

Initialise the accessor.

Parameters:

Name Type Description Default
da DataArray

The DataArray this accessor is attached to.

required

explain

explain(*, fetch_headers: bool = False) -> ExplainPlan

Return a dry-run read plan without fetching any pixel data.

Runs the same DuckDB spatial queries that would fire during .compute(), but stops before any COG pixel I/O. With fetch_headers=True the COG IFD headers are also fetched (one small HTTP range request per matched item) to determine which overview level and pixel window would be read.

Parameters:

Name Type Description Default
fetch_headers bool

When True, open each matched COG header to populate CogRead.overview_level and the window fields. Requires network I/O. Defaults to False.

False

Returns:

Type Description
ExplainPlan

An ExplainPlan describing all (band, time step, spatial tile) reads for the current DataArray extent and chunking.

Raises:

Type Description
ValueError

If the DataArray was not produced by lazycogs.open() (missing explain metadata in attrs).
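
A typical call pattern might look like the sketch below. plan_report is a hypothetical wrapper; it relies only on the stac_cog accessor, the explain method, and the ExplainPlan properties documented on this page, and it expects da to be a DataArray produced by lazycogs.open.

```python
from typing import Any


def plan_report(da: Any, *, fetch_headers: bool = False) -> dict[str, int]:
    """Summarise a dry-run read plan for a lazycogs DataArray.

    Hypothetical wrapper for illustration only.
    """
    # No pixel data is read; with fetch_headers=True, COG headers are fetched too.
    plan = da.stac_cog.explain(fetch_headers=fetch_headers)
    return {
        "chunk_reads": plan.total_chunk_reads,
        "cog_reads": plan.total_cog_reads,
        "empty_chunks": plan.empty_chunk_count,
    }
```

Checking these counts before calling .compute() is a cheap way to spot a chunking scheme that produces many empty chunks or an unexpectedly large number of COG reads.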