Utilities
lazycogs.align_bbox
align_bbox(
affine: Affine | Sequence[float], bbox: tuple[float, float, float, float]
) -> tuple[float, float, float, float]
Snap a bounding box to the pixel grid defined by an affine transform.
Expands the bbox outward so that all four edges fall exactly on grid
lines. Useful for aligning an AOI to the native grid of a COG collection
(e.g. from a STAC item's proj:transform property) before calling
lazycogs.open.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| affine | Affine \| Sequence[float] | Affine transform in row-major order, either 6-element … | required |
| bbox | tuple[float, float, float, float] | | required |
Returns:
| Type | Description |
|---|---|
| tuple[float, float, float, float] | … lines. |
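The outward snapping described above can be sketched in plain Python. This is a minimal illustration, not the lazycogs implementation: the helper name snap_bbox_outward is hypothetical, and the sketch assumes a north-up grid given as a 6-element tuple (a, b, c, d, e, f) with b = d = 0, a > 0 and e < 0.

```python
import math


def snap_bbox_outward(affine, bbox):
    """Expand bbox outward to the grid lines of a north-up affine.

    Hypothetical helper for illustration only. Assumes `affine` is
    (a, b, c, d, e, f) in row-major order with no rotation or shear.
    """
    a, b, c, d, e, f = affine
    if b != 0 or d != 0 or a <= 0 or e >= 0:
        raise ValueError("this sketch only handles north-up grids")

    minx, miny, maxx, maxy = bbox

    # Column indices: floor on the left edge, ceil on the right edge.
    col_min = math.floor((minx - c) / a)
    col_max = math.ceil((maxx - c) / a)

    # Row indices grow downward because e is negative, so the top edge
    # (maxy) takes the floor and the bottom edge (miny) takes the ceil.
    row_min = math.floor((maxy - f) / e)
    row_max = math.ceil((miny - f) / e)

    return (
        c + col_min * a,  # snapped minx
        f + row_max * e,  # snapped miny
        c + col_max * a,  # snapped maxx
        f + row_min * e,  # snapped maxy
    )


# 10-unit pixels with the grid origin at (0, 100).
grid = (10.0, 0.0, 0.0, 0.0, -10.0, 100.0)
snapped = snap_bbox_outward(grid, (12.0, 33.0, 47.0, 68.0))
print(snapped)  # (10.0, 30.0, 50.0, 70.0)
```

Every edge moves away from the bbox centre, so the snapped box always contains the original one.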
lazycogs.store_for
store_for(
href: str,
*,
asset: str | None = None,
duckdb_client: DuckdbClient | None = None,
**kwargs: object,
) -> ObjectStore
Construct an ObjectStore by inspecting a stac-geoparquet sample asset.
Reads one sample item from href, derives the store root URL from a data
asset HREF, and constructs an ObjectStore with obstore's own
environment-based credential discovery. If the item carries STAC Storage
Extension metadata (v1.0.0 or v2.0.0), region and requester_pays
are also inferred automatically.
Caller-supplied kwargs override all inferred values; pass
skip_signature=True for public buckets that do not require signed
requests, or supply explicit credentials.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| href | str | Path to a geoparquet file or hive-partitioned parquet directory. | required |
| asset | str \| None | Asset key to inspect when choosing a representative asset. Defaults to the first data asset (role … | None |
| duckdb_client | DuckdbClient \| None | Optional … | None |
| **kwargs | object | Forwarded to … | {} |
Returns:
| Type | Description |
|---|---|
| ObjectStore | A freshly constructed … |
Raises:
| Type | Description |
|---|---|
| ValueError | If no STAC items are found in href. |
| KeyError | If asset is specified but not present in the item. |
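Two ideas from the description above, deriving a store root from an asset HREF and letting caller kwargs win over inferred values, can be illustrated with the standard library alone. This is only a sketch under assumptions: both helper names are hypothetical, the bucket name is made up, and treating "scheme://bucket" as the store root is a guess at what store_for does, not a statement of its behaviour.

```python
from urllib.parse import urlsplit


def store_root_from_href(href: str) -> str:
    """Return 'scheme://netloc' for an asset HREF (hypothetical helper)."""
    parts = urlsplit(href)
    return f"{parts.scheme}://{parts.netloc}"


def merge_store_options(inferred: dict, overrides: dict) -> dict:
    """Combine options so that caller-supplied values take precedence."""
    # Later keys win in a dict merge, which gives the override semantics.
    return {**inferred, **overrides}


# Made-up asset HREF and option values, for illustration only.
root = store_root_from_href("s3://example-bucket/collection/item/B04.tif")
options = merge_store_options(
    {"region": "us-west-2", "requester_pays": True},
    {"region": "eu-central-1", "skip_signature": True},
)
print(root)
print(options)
```

The merge order is the important design point: values inferred from Storage Extension metadata act as defaults, and anything the caller passes explicitly replaces them.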
lazycogs.set_reproject_workers
Set the number of threads each thread's event loop uses for reprojection.
Each thread (dask worker, Jupyter kernel callback thread, etc.) gets one
persistent background event loop with one bounded reprojection
ThreadPoolExecutor. All chunk reads on that thread share the same loop
and executor. Dask tasks on different threads do not compete for a shared
pool. The total number of reprojection threads at any moment is at most
n × active_thread_count.
Reprojection is memory-bandwidth-bound rather than compute-bound, so values
above 4 typically offer no benefit and can hurt throughput due to memory
contention. The default is min(os.cpu_count(), 4).
To improve overall throughput, prefer adding time or band parallelism via
dask (chunks={"time": 1}) over raising this value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n | int | Number of worker threads per event loop. Must be >= 1. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If … |
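The sizing rules above reduce to simple arithmetic, sketched here without importing lazycogs. Both function names are hypothetical. The fallback to 1 when os.cpu_count() returns None and the exact ValueError condition are assumptions, since the source only states that n must be >= 1.

```python
import os


def default_reproject_workers() -> int:
    """Mirror the documented default of min(os.cpu_count(), 4)."""
    # os.cpu_count() may return None; falling back to 1 is an assumption.
    return min(os.cpu_count() or 1, 4)


def max_reprojection_threads(n: int, active_thread_count: int) -> int:
    """Upper bound on concurrent reprojection threads: n x active threads."""
    if n < 1:
        # Assumed to match the documented "Must be >= 1" constraint.
        raise ValueError("n must be >= 1")
    return n * active_thread_count


# With 4 workers per loop and 8 dask worker threads, at most 32
# reprojection threads exist at any moment.
print(default_reproject_workers())
print(max_reprojection_threads(4, 8))
```

Because the bound scales with the number of active threads, raising n on a machine that already runs many dask threads multiplies memory contention quickly.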
lazycogs.ExplainPlan (dataclass)
Complete dry-run read plan for a lazycogs query.
Attributes:
| Name | Type | Description |
|---|---|---|
| href | str | Path to the source geoparquet file. |
| crs | str | String representation of the output CRS. |
| resolution | float | Output pixel size in CRS units. |
| bands | list[str] | Ordered list of band names included in the plan. |
| time_coords | list[datetime64] | Time coordinate values for all explained time steps. |
| dst_width | int | Output grid width in pixels (for the current DataArray extent). |
| dst_height | int | Output grid height in pixels (for the current DataArray extent). |
| chunk_width | int | Spatial chunk width in pixels. |
| chunk_height | int | Spatial chunk height in pixels. |
| chunk_reads | list[ChunkRead] | One entry per (band, time step, spatial tile). |
| fetch_headers | bool | Whether COG headers were opened to populate overview and window fields on each … |
total_chunk_reads (property)
Total number of (band, time, spatial) chunk reads.
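Since chunk_reads holds one entry per (band, time step, spatial tile), the expected total can be estimated from the plan's own fields. The sketch below is an assumption-laden illustration: the function name is hypothetical, and it assumes the spatial grid is tiled with ceiling division so that partial edge tiles count as full tiles.

```python
import math


def expected_chunk_reads(
    n_bands: int,
    n_times: int,
    dst_width: int,
    dst_height: int,
    chunk_width: int,
    chunk_height: int,
) -> int:
    """Estimate the number of (band, time, spatial tile) combinations."""
    # Assumption: edge tiles that only partly overlap the grid still count.
    tiles_x = math.ceil(dst_width / chunk_width)
    tiles_y = math.ceil(dst_height / chunk_height)
    return n_bands * n_times * tiles_x * tiles_y


# 3 bands, 5 time steps, a 1000 x 600 grid in 512 x 512 chunks:
# 2 x 2 spatial tiles, so 3 * 5 * 4 = 60 chunk reads.
print(expected_chunk_reads(3, 5, 1000, 600, 512, 512))  # 60
```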
to_dataframe
Return a DataFrame with one row per (chunk x item) combination.
Empty chunks contribute one row with item fields set to None.
When fetch_headers=False, the overview and window columns are
all None.
Returns:
| Type | Description |
|---|---|
| DataFrame | A … metadata, and (when available) COG header details. |
Raises:
| Type | Description |
|---|---|
| ImportError | If … |
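The row layout described for to_dataframe (one row per chunk and item pair, plus a single None-filled row for an empty chunk) can be mimicked with stand-in dataclasses. MiniCogRead, MiniChunkRead and plan_rows are hypothetical names for this sketch. They carry only a few of the documented fields and are not the lazycogs classes or its actual column set.

```python
from dataclasses import dataclass, field


@dataclass
class MiniCogRead:
    """Stand-in for CogRead with two of its documented fields."""
    item_id: str
    href: str


@dataclass
class MiniChunkRead:
    """Stand-in for ChunkRead with a subset of its documented fields."""
    band: str
    time_index: int
    chunk_row: int
    chunk_col: int
    cog_reads: list = field(default_factory=list)


def plan_rows(chunk_reads):
    """Flatten chunks into one dict per (chunk x item) combination."""
    rows = []
    for chunk in chunk_reads:
        base = {
            "band": chunk.band,
            "time_index": chunk.time_index,
            "chunk_row": chunk.chunk_row,
            "chunk_col": chunk.chunk_col,
        }
        if not chunk.cog_reads:
            # Empty chunks still contribute one row, with item fields None.
            rows.append({**base, "item_id": None, "href": None})
            continue
        for cog in chunk.cog_reads:
            rows.append({**base, "item_id": cog.item_id, "href": cog.href})
    return rows


chunks = [
    MiniChunkRead("red", 0, 0, 0, [
        MiniCogRead("item-a", "s3://example-bucket/a.tif"),
        MiniCogRead("item-b", "s3://example-bucket/b.tif"),
    ]),
    MiniChunkRead("red", 0, 0, 1),  # no matching items
]
rows = plan_rows(chunks)
print(len(rows))  # 3
```

A list of dicts in this shape can be passed directly to a DataFrame constructor, which is one plausible reason the real method needs an optional dependency.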
lazycogs.ChunkRead (dataclass)
All reads required for one (band, time step, spatial tile).
Attributes:
| Name | Type | Description |
|---|---|---|
| band | str | Asset key for this chunk. |
| time_index | int | Index of this time step in the full time axis. |
| date_filter | str | |
| time_coord | datetime64 | Coordinate value for this time step. |
| chunk_row | int | Tile row index within the spatial grid (0-indexed). |
| chunk_col | int | Tile column index within the spatial grid (0-indexed). |
| chunk_affine | Affine | Affine transform of the tile (top-left origin). |
| chunk_width | int | Tile width in pixels. |
| chunk_height | int | Tile height in pixels. |
| cog_reads | list[CogRead] | Per-COG read details. |
| n_cog_reads | int | Number of COG files matched (derived from … |
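The relationship between chunk_row, chunk_col and chunk_affine can be sketched as a translation of the full-grid transform to the tile's top-left corner. This is an illustration under assumptions rather than the library's code: tile_affine is a hypothetical helper, transforms are plain 6-element tuples (a, b, c, d, e, f), and all tiles are assumed to share the nominal chunk size.

```python
def tile_affine(grid_affine, chunk_row, chunk_col, chunk_width, chunk_height):
    """Shift a grid transform to the top-left corner of one tile.

    Hypothetical helper. `grid_affine` is (a, b, c, d, e, f), mapping
    (col, row) to (x, y) as x = a*col + b*row + c, y = d*col + e*row + f.
    """
    a, b, c, d, e, f = grid_affine
    col_off = chunk_col * chunk_width   # pixel column of the tile origin
    row_off = chunk_row * chunk_height  # pixel row of the tile origin
    # Only the translation terms change; pixel size and rotation are kept.
    return (
        a, b, c + a * col_off + b * row_off,
        d, e, f + d * col_off + e * row_off,
    )


# 10 m pixels, grid origin at (500000, 4000000), 256 x 256 tiles.
grid = (10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0)
print(tile_affine(grid, chunk_row=2, chunk_col=3, chunk_width=256, chunk_height=256))
# (10.0, 0.0, 507680.0, 0.0, -10.0, 3994880.0)
```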
lazycogs.CogRead (dataclass)
Read details for one COG file within one chunk.
Attributes:
| Name | Type | Description |
|---|---|---|
| item_id | str | STAC item ID. |
| asset_key | str | Asset key (band name) that would be read. |
| href | str | Asset HREF. |
| overview_level | int \| None | Overview level that would be read. |
| overview_resolution | float \| None | Pixel size of the selected level in source CRS units. Only populated when … |
| window_col_off | int \| None | Column offset of the read window in source pixels. Only populated when … |
| window_row_off | int \| None | Row offset of the read window in source pixels. Only populated when … |
| window_width | int \| None | Width of the read window in source pixels. Only populated when … |
| window_height | int \| None | Height of the read window in source pixels. Only populated when … |
lazycogs.StacCogAccessor
xarray accessor adding explain functionality to lazycogs DataArrays.
Registered as the stac_cog namespace on all xr.DataArray objects.
The explain method is only useful on DataArrays produced by
lazycogs.open.
__init__
Initialise the accessor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| da | DataArray | The DataArray this accessor is attached to. | required |
explain
explain(*, fetch_headers: bool = False) -> ExplainPlan
Return a dry-run read plan without fetching any pixel data.
Runs the same DuckDB spatial queries that would fire during
.compute(), but stops before any COG pixel I/O. With
fetch_headers=True the COG IFD headers are also fetched (one
small HTTP range request per matched item) to determine which overview
level and pixel window would be read.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fetch_headers | bool | When … | False |
Returns:
| Type | Description |
|---|---|
| ExplainPlan | An ExplainPlan … tile) reads for the current DataArray extent and chunking. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the DataArray was not produced by … |
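A small wrapper shows how the accessor and the plan attributes documented above might be used together. It is a sketch only: summarize_plan is a hypothetical helper, and it assumes da is a DataArray produced by lazycogs.open with the stac_cog accessor registered. It relies solely on names listed on this page (stac_cog, explain, fetch_headers, bands, total_chunk_reads, chunk_reads, n_cog_reads).

```python
def summarize_plan(da, *, fetch_headers: bool = False) -> dict:
    """Condense a dry-run read plan into a few headline numbers.

    Hypothetical helper: `da` is assumed to be a DataArray produced by
    lazycogs.open, so that `da.stac_cog.explain` is available.
    """
    plan = da.stac_cog.explain(fetch_headers=fetch_headers)

    # Chunks with no matching COGs still appear in the plan.
    empty_chunks = sum(1 for chunk in plan.chunk_reads if chunk.n_cog_reads == 0)
    cog_reads = sum(chunk.n_cog_reads for chunk in plan.chunk_reads)

    return {
        "bands": list(plan.bands),
        "total_chunk_reads": plan.total_chunk_reads,
        "empty_chunks": empty_chunks,
        "cog_reads": cog_reads,
    }
```

Calling it with fetch_headers=False keeps the check cheap, since no COG headers are requested. A large cog_reads count relative to total_chunk_reads suggests that many items overlap each tile.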