ZarrTable
zarr_datafusion_search.ZarrTable
A DataFusion table provider that exposes a Zarr metadata store as a SQL-queryable table.
ZarrTable implements the DataFusion TableProvider protocol via the FFI boundary,
so it can be registered directly with a SessionContext using
register_table_provider.
Use one of the async class methods to construct an instance:
- from_icechunk: from an Icechunk session
- from_obstore: from an obstore object store
Example
import asyncio

from datafusion import SessionContext
from zarr_datafusion_search import ZarrTable


async def main():
    # `store` is an existing obstore object store (e.g. an obstore.store.LocalStore).
    table = await ZarrTable.from_obstore(store, "/meta")
    ctx = SessionContext()
    ctx.register_table_provider("items", table)
    df = ctx.sql("SELECT date, collection FROM items LIMIT 10")
    df.show()


asyncio.run(main())
__datafusion_table_provider__
Return the FFI TableProvider capsule for DataFusion registration.
This is called automatically by SessionContext.register_table_provider.
You do not need to call it directly.
from_icechunk (classmethod)
from_icechunk(session: Any, group_path: str) -> Awaitable[ZarrTable]
Create a ZarrTable from an Icechunk session.
Parameters:
- session (Any) – An open icechunk.Session pointing to the store.
- group_path (str) – Absolute path to the Zarr group containing the metadata arrays (e.g. "/meta").
Returns:
- Awaitable[ZarrTable] – An awaitable that resolves to a ZarrTable instance.
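As a rough illustration of how a session might be obtained and passed in, the sketch below opens a repository on the local filesystem and takes a read-only session on the `main` branch. The helper name `open_icechunk_table`, the local-filesystem storage, and the branch name are all assumptions made for this example. The Icechunk calls reflect that library's Python API as commonly documented and should be checked against the installed version.

```python
async def open_icechunk_table(repo_path: str, group_path: str = "/meta"):
    """Open a local Icechunk repository and wrap one group as a ZarrTable."""
    # Imports are deferred so the function can be defined without the
    # optional packages being importable at module load time.
    import icechunk
    from zarr_datafusion_search import ZarrTable

    # Assumption: the repository lives on the local filesystem.
    storage = icechunk.local_filesystem_storage(repo_path)
    repo = icechunk.Repository.open(storage)

    # Assumption: a read-only session on the "main" branch is enough for querying.
    session = repo.readonly_session("main")

    # from_icechunk returns an awaitable, so it must be awaited.
    return await ZarrTable.from_icechunk(session, group_path)
```

The resulting table can then be registered with a `SessionContext` in the same way as in the example at the top of this page.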
from_obstore (classmethod)
from_obstore(store: Any, group_path: str) -> Awaitable[ZarrTable]
Create a ZarrTable from an obstore object store.
Parameters:
- store (Any) – Any obstore-compatible object store (e.g. obstore.store.S3Store, obstore.store.LocalStore).
- group_path (str) – Absolute path to the Zarr group containing the metadata arrays (e.g. "/meta").
Returns:
- Awaitable[ZarrTable] – An awaitable that resolves to a ZarrTable instance.
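Because both constructors return awaitables, several tables can in principle be opened concurrently rather than one after another. The helper below is a sketch of that idea using only the standard library. `open_tables` is a hypothetical name, not part of `zarr_datafusion_search`, and it accepts any mapping from a SQL table name to an awaitable.

```python
import asyncio
from typing import Any, Awaitable, Mapping


async def open_tables(pending: Mapping[str, Awaitable[Any]]) -> dict[str, Any]:
    """Await several table constructors concurrently, keyed by table name."""
    names = list(pending)
    # asyncio.gather returns results in argument order, so they line up with names.
    tables = await asyncio.gather(*(pending[name] for name in names))
    return dict(zip(names, tables))
```

Inside an async function, this might be called as `await open_tables({"items": ZarrTable.from_obstore(store, "/meta")})`, followed by one `ctx.register_table_provider(name, table)` call per entry. If any constructor raises, `gather` propagates the first exception to the caller.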