Utilities
lazycogs.align_bbox
align_bbox(
affine: Affine | Sequence[float], bbox: tuple[float, float, float, float]
) -> tuple[float, float, float, float]
Snap a bounding box to the pixel grid defined by an affine transform.
Expands the bbox outward so that all four edges fall exactly on a grid
line. Useful for aligning an AOI to the native grid of a COG collection
(e.g. from a STAC item's proj:transform property) before calling
lazycogs.open.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| affine | Affine \| Sequence[float] | Affine transform in row-major order, either 6-element | required |
| bbox | tuple[float, float, float, float] | | required |
Returns:
| Type | Description |
|---|---|
| tuple[float, float, float, float] | The bbox expanded outward so that all four edges fall on grid lines. |
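
The outward snapping described above can be illustrated with a small standalone sketch. It does not call lazycogs and is not the library's implementation: `snap_bbox_to_grid` is a hypothetical helper, and it assumes a north-up, unrotated grid given as a 6-element row-major tuple `(a, b, c, d, e, f)` and a bbox ordered `(minx, miny, maxx, maxy)`.

```python
import math


def snap_bbox_to_grid(affine, bbox):
    """Expand bbox outward to the grid lines of a north-up affine.

    Hypothetical helper for illustration only; assumes the row-major
    layout (a, b, c, d, e, f) with b == d == 0, a > 0 and e < 0.
    """
    a, b, c, d, e, f = affine[:6]
    if b != 0 or d != 0 or a <= 0 or e >= 0:
        raise ValueError("this sketch only handles north-up, unrotated grids")
    minx, miny, maxx, maxy = bbox
    # Columns grow to the right: floor the left edge, ceil the right edge.
    col_min = math.floor((minx - c) / a)
    col_max = math.ceil((maxx - c) / a)
    # Rows grow downward (e < 0): floor the top edge, ceil the bottom edge.
    row_min = math.floor((maxy - f) / e)
    row_max = math.ceil((miny - f) / e)
    return (c + col_min * a, f + row_max * e, c + col_max * a, f + row_min * e)


grid = (10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0)  # 10 m pixels
aoi = (500003.0, 3999905.0, 500027.0, 3999998.0)
snapped = snap_bbox_to_grid(grid, aoi)
print(snapped)
```

Because the edges only ever move outward, the snapped box always contains the requested AOI. Inputs that sit within floating-point error of a grid line may snap one pixel further out than expected.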
lazycogs.store_for
store_for(
href: str,
*,
asset: str | None = None,
duckdb_client: DuckdbClient | None = None,
**kwargs: object,
) -> ObjectStore
Construct an ObjectStore by inspecting a geoparquet STAC items file.
Reads one sample item from href, derives the store root URL from a data
asset HREF, and constructs an ObjectStore with obstore's own
environment-based credential discovery. If the item carries STAC Storage
Extension metadata (v1.0.0 or v2.0.0), region and requester_pays
are also inferred automatically.
Caller-supplied kwargs override all inferred values; pass
skip_signature=True for public buckets that do not require signed
requests, or supply explicit credentials.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| href | str | Path to a geoparquet file or hive-partitioned parquet directory. | required |
| asset | str \| None | Asset key to inspect when choosing a representative asset. Defaults to the first data asset (role | None |
| duckdb_client | DuckdbClient \| None | Optional | None |
| **kwargs | object | Forwarded to :func: | {} |
Returns:
| Type | Description |
|---|---|
| ObjectStore | A freshly constructed ObjectStore. |
Raises:
| Type | Description |
|---|---|
| ValueError | If no STAC items are found in href. |
| KeyError | If asset is specified but not present in the item. |
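
Two ideas from the description can be sketched without the library: deriving a store root from an asset HREF, and letting caller-supplied kwargs override inferred values. Both helpers below are hypothetical. In particular, treating "store root" as `scheme://netloc` is an assumption of this sketch, and the bucket name and option values are made up for illustration.

```python
from urllib.parse import urlsplit


def store_root(asset_href: str) -> str:
    """Return scheme://netloc for an asset HREF.

    Assumption: the store root is the bucket-level URL. The real
    derivation in lazycogs may differ (for example for HTTPS endpoints).
    """
    parts = urlsplit(asset_href)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"cannot derive a store root from {asset_href!r}")
    return f"{parts.scheme}://{parts.netloc}"


def merge_options(inferred: dict, **kwargs: object) -> dict:
    """Combine inferred options with caller kwargs; caller kwargs win."""
    return {**inferred, **kwargs}


root = store_root("s3://example-bucket/collection/item/B04.tif")
options = merge_options(
    {"region": "us-west-2", "requester_pays": True},  # pretend these were inferred
    region="eu-central-1",  # explicit override
    skip_signature=True,  # extra option supplied by the caller
)
print(root, options)
```

The dictionary merge places the caller's values last, which is what makes them take precedence over anything inferred from item metadata.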
lazycogs.set_reproject_workers
Set the number of threads each chunk's event loop uses for reprojection.
Each chunk read creates a fresh asyncio event loop with its own dedicated
ThreadPoolExecutor bounded to n workers. Dask tasks do not compete
for a shared pool; each task gets n independent reprojection threads.
The total number of reprojection threads at any moment is therefore at most
n x dask_worker_count.
Reprojection is memory-bandwidth-bound rather than compute-bound, so values
above 4 typically offer no benefit and can hurt throughput due to memory
contention. The default is min(os.cpu_count(), 4).
To improve overall throughput, prefer adding time or band parallelism via
dask (chunks={"time": 1}) over raising this value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n | int | Number of worker threads per event loop. Must be >= 1. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If n is less than 1. |
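
The threading model described above (a fresh event loop per chunk, each with its own bounded executor) can be sketched with the standard library alone. This is an illustration of the pattern, not lazycogs code: `run_chunk`, `default_workers`, and `max_reprojection_threads` are hypothetical names, and the fallback to 1 when `os.cpu_count()` returns `None` is an assumption of this sketch.

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor


def default_workers() -> int:
    # Mirrors the documented default min(os.cpu_count(), 4); the `or 1`
    # fallback for an unknown CPU count is this sketch's own choice.
    return min(os.cpu_count() or 1, 4)


def max_reprojection_threads(n: int, dask_worker_count: int) -> int:
    # Upper bound stated above: every dask task owns n threads.
    return n * dask_worker_count


def run_chunk(jobs, n: int):
    """Run blocking jobs on a fresh event loop with a dedicated pool of n threads."""
    if n < 1:
        raise ValueError("n must be >= 1")
    loop = asyncio.new_event_loop()
    executor = ThreadPoolExecutor(max_workers=n)

    async def gather_jobs():
        futures = [loop.run_in_executor(executor, job) for job in jobs]
        return await asyncio.gather(*futures)

    try:
        return loop.run_until_complete(gather_jobs())
    finally:
        executor.shutdown(wait=True)
        loop.close()


results = run_chunk([lambda: 1 + 1, lambda: 2 * 3], n=2)
print(results, max_reprojection_threads(4, 8))
```

Since each call builds and tears down its own pool, concurrent callers never queue behind one another, at the cost of a thread count that scales with the number of simultaneous tasks.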
lazycogs.ExplainPlan dataclass
Complete dry-run read plan for a lazycogs query.
Attributes:
| Name | Type | Description |
|---|---|---|
| href | str | Path to the source geoparquet file. |
| crs | str | String representation of the output CRS. |
| resolution | float | Output pixel size in CRS units. |
| bands | list[str] | Ordered list of band names included in the plan. |
| time_coords | list[datetime64] | Time coordinate values for all explained time steps. |
| dst_width | int | Output grid width in pixels (for the current DataArray extent). |
| dst_height | int | Output grid height in pixels (for the current DataArray extent). |
| chunk_width | int | Spatial chunk width in pixels. |
| chunk_height | int | Spatial chunk height in pixels. |
| chunk_reads | list[ChunkRead] | One entry per (band, time step, spatial tile). |
| fetch_headers | bool | Whether COG headers were opened to populate overview and window fields on each CogRead. |
total_chunk_reads property
Total number of (band, time, spatial) chunk reads.
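
As a rough cross-check, the count can be estimated from the plan's documented attributes. The sketch below is not the property's implementation; `expected_chunk_reads` is a hypothetical helper, and it assumes that partial tiles at the right and bottom edges each count as one tile.

```python
def expected_chunk_reads(
    n_bands: int,
    n_times: int,
    dst_width: int,
    dst_height: int,
    chunk_width: int,
    chunk_height: int,
) -> int:
    """Estimate bands x time steps x spatial tiles for a regular chunk grid."""
    # Integer ceiling division: a partial edge tile still needs a read.
    tiles_x = (dst_width + chunk_width - 1) // chunk_width
    tiles_y = (dst_height + chunk_height - 1) // chunk_height
    return n_bands * n_times * tiles_x * tiles_y


# 2 bands, 3 time steps, a 1000 x 600 pixel grid split into 512 x 512 tiles.
count = expected_chunk_reads(2, 3, 1000, 600, 512, 512)
print(count)
```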
to_dataframe
Return a DataFrame with one row per (chunk x item) combination.
Empty chunks contribute one row with item fields set to None.
When fetch_headers=False, the overview and window columns are
all None.
Returns:
| Type | Description |
|---|---|
| DataFrame | A DataFrame with one row per (chunk x item) combination, carrying metadata and (when available) COG header details. |
Raises:
| Type | Description |
|---|---|
| ImportError | If |
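
The row layout described above (one row per chunk and item pair, plus a single placeholder row for empty chunks) can be sketched with plain dataclasses and dictionaries. The `SketchChunkRead` and `SketchCogRead` classes are hypothetical stand-ins that carry only a few of the documented attributes, and the column set is illustrative rather than the method's actual output.

```python
from dataclasses import dataclass, field


@dataclass
class SketchCogRead:
    # Stand-in for CogRead with a subset of its documented attributes.
    item_id: str
    href: str


@dataclass
class SketchChunkRead:
    # Stand-in for ChunkRead with a subset of its documented attributes.
    band: str
    chunk_row: int
    chunk_col: int
    cog_reads: list = field(default_factory=list)


def flatten(chunk_reads):
    """Return one dict per (chunk x item); empty chunks yield one row of Nones."""
    rows = []
    for chunk in chunk_reads:
        base = {
            "band": chunk.band,
            "chunk_row": chunk.chunk_row,
            "chunk_col": chunk.chunk_col,
        }
        if not chunk.cog_reads:
            rows.append({**base, "item_id": None, "href": None})
            continue
        for cog in chunk.cog_reads:
            rows.append({**base, "item_id": cog.item_id, "href": cog.href})
    return rows


plan = [
    SketchChunkRead(
        "red",
        0,
        0,
        [
            SketchCogRead("item-a", "s3://example-bucket/a.tif"),
            SketchCogRead("item-b", "s3://example-bucket/b.tif"),
        ],
    ),
    SketchChunkRead("red", 0, 1),  # no matching items
]
rows = flatten(plan)
print(len(rows))
```

Keeping a placeholder row for empty chunks means every chunk appears in the result, so grouping by chunk does not silently drop tiles with no matching items.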
lazycogs.ChunkRead dataclass
All reads required for one (band, time step, spatial tile).
Attributes:
| Name | Type | Description |
|---|---|---|
| band | str | Asset key for this chunk. |
| time_index | int | Index of this time step in the full time axis. |
| date_filter | str | |
| time_coord | datetime64 | Coordinate value for this time step. |
| chunk_row | int | Tile row index within the spatial grid (0-indexed). |
| chunk_col | int | Tile column index within the spatial grid (0-indexed). |
| chunk_affine | Affine | Affine transform of the tile (top-left origin). |
| chunk_width | int | Tile width in pixels. |
| chunk_height | int | Tile height in pixels. |
| cog_reads | list[CogRead] | Per-COG read details. |
| n_cog_reads | int | Number of COG files matched (derived from cog_reads). |
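
The relationship between a tile's row and column indices and its top-left-origin transform can be sketched with plain affine arithmetic. This is an illustration, not how lazycogs builds chunk_affine: `tile_affine` is a hypothetical helper, transforms are 6-element row-major tuples `(a, b, c, d, e, f)`, and every tile before the one in question is assumed to have the full chunk size.

```python
def tile_affine(grid_affine, chunk_row, chunk_col, chunk_width, chunk_height):
    """Shift the grid origin to the top-left pixel of the given tile."""
    a, b, c, d, e, f = grid_affine
    # Pixel offsets of the tile's top-left corner within the full grid.
    col_off = chunk_col * chunk_width
    row_off = chunk_row * chunk_height
    # Apply the grid transform to (col_off, row_off) to get the new origin.
    new_c = a * col_off + b * row_off + c
    new_f = d * col_off + e * row_off + f
    return (a, b, new_c, d, e, new_f)


grid = (10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0)  # 10 m, north-up
tile = tile_affine(grid, chunk_row=1, chunk_col=2, chunk_width=512, chunk_height=512)
print(tile)
```

Only the translation terms change; pixel size and rotation are shared by every tile in the grid.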
lazycogs.CogRead dataclass
Read details for one COG file within one chunk.
Attributes:
| Name | Type | Description |
|---|---|---|
| item_id | str | STAC item ID. |
| asset_key | str | Asset key (band name) that would be read. |
| href | str | Asset HREF. |
| overview_level | int \| None | Overview level that would be read. |
| overview_resolution | float \| None | Pixel size of the selected level in source CRS units. Only populated when fetch_headers=True. |
| window_col_off | int \| None | Column offset of the read window in source pixels. Only populated when fetch_headers=True. |
| window_row_off | int \| None | Row offset of the read window in source pixels. Only populated when fetch_headers=True. |
| window_width | int \| None | Width of the read window in source pixels. Only populated when fetch_headers=True. |
| window_height | int \| None | Height of the read window in source pixels. Only populated when fetch_headers=True. |
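
To make the overview fields more concrete, here is one common selection rule: pick the coarsest level whose pixel size does not exceed the requested output resolution. Whether lazycogs uses this exact rule is not stated in this reference, so treat it as an assumption. `pick_overview` is a hypothetical helper, and its index convention (0 for full resolution) belongs to this sketch and may not match overview_level.

```python
def pick_overview(level_resolutions, target_resolution):
    """Pick the coarsest level that is still at least as fine as the target.

    level_resolutions is assumed to be sorted from finest to coarsest,
    with index 0 holding the full-resolution pixel size.
    """
    chosen = 0  # fall back to full resolution
    for level, resolution in enumerate(level_resolutions):
        if resolution <= target_resolution:
            chosen = level
    return chosen, level_resolutions[chosen]


levels = [10.0, 20.0, 40.0, 80.0]  # pixel sizes in source CRS units
coarse = pick_overview(levels, 60.0)
fine = pick_overview(levels, 5.0)
print(coarse, fine)
```

Reading from a coarser level reduces the number of bytes fetched while still providing at least the requested detail.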
lazycogs.StacCogAccessor
xarray accessor adding explain functionality to lazycogs DataArrays.
Registered as the stac_cog namespace on all xr.DataArray objects.
The explain method is only useful on DataArrays produced by
lazycogs.open.
__init__
Initialise the accessor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| da | DataArray | The DataArray this accessor is attached to. | required |
explain
explain(*, fetch_headers: bool = False) -> ExplainPlan
Return a dry-run read plan without fetching any pixel data.
Runs the same DuckDB spatial queries that would fire during
.compute(), but stops before any COG pixel I/O. With
fetch_headers=True the COG IFD headers are also fetched (one
small HTTP range request per matched item) to determine which overview
level and pixel window would be read.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
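| fetch_headers | bool | When True, also fetch the COG IFD headers to determine which overview level and pixel window would be read. | False |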
Returns:
| Type | Description |
|---|---|
| ExplainPlan | An ExplainPlan describing the (band, time step, spatial tile) reads for the current DataArray extent and chunking. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the DataArray was not produced by lazycogs.open. |