Local
obstore.store.LocalStore
An ObjectStore interface to local filesystem storage.
Can optionally be created with a directory prefix.
from pathlib import Path
from obstore.store import LocalStore
store = LocalStore()
store = LocalStore(prefix="/path/to/directory")
store = LocalStore(prefix=Path("."))
prefix
property
prefix: Path | None
Get the prefix applied to all operations in this store, if any.
__init__
__init__(
prefix: str | Path | None = None,
*,
automatic_cleanup: bool = False,
mkdir: bool = False,
) -> None
Create a new LocalStore.
Parameters:
- prefix (str | Path | None, default: None) – Use the specified prefix applied to all paths. Defaults to None.
Other Parameters:
- automatic_cleanup (bool) – If True, enables automatic cleanup of empty directories when deleting files. Defaults to False.
- mkdir (bool) – If True and prefix is not None, an attempt is made to create the directory at prefix. Note that this root directory will not be cleaned up, even if automatic_cleanup is True. Defaults to False.
copy
Copy an object from one path to another in the same object store.
Refer to the documentation for copy.
from_url
classmethod
Construct a new LocalStore from a file:// URL.
Examples:
Construct a new store pointing to the root of your filesystem:
url = "file:///"
store = LocalStore.from_url(url)
Construct a new store with a directory prefix:
url = "file:///Users/kyle/"
store = LocalStore.from_url(url)
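When the directory is only known as a filesystem path, the URL can be built rather than typed by hand. The following is a minimal sketch, assuming a POSIX-style absolute path. `directory_url` is a hypothetical helper, not part of obstore, and it relies only on the standard library's `Path.as_uri()`. Note that `as_uri()` drops any trailing slash. Whether `from_url` treats the two forms identically is an assumption here.

```python
from pathlib import Path


def directory_url(directory: str) -> str:
    """Return a file:// URL for an absolute directory path.

    Hypothetical helper (not part of obstore). Path.as_uri() raises
    ValueError for relative paths and percent-encodes special characters.
    """
    return Path(directory).as_uri()


url = directory_url("/Users/kyle")
print(url)

# With obstore installed, the URL could then be handed to the classmethod:
# store = LocalStore.from_url(url)
```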
get
get(path: str, *, options: GetOptions | None = None) -> GetResult
Return the bytes that are stored at the specified location.
Refer to the documentation for get.
get_async
async
get_async(path: str, *, options: GetOptions | None = None) -> GetResult
Call get asynchronously.
Refer to the documentation for get.
get_range
Return the bytes stored at the specified location in the given byte range.
Refer to the documentation for get_range.
get_range_async
async
get_range_async(
path: str, *, start: int, end: int | None = None, length: int | None = None
) -> Bytes
Call get_range asynchronously.
Refer to the documentation for get_range.
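The signature accepts both `end` and `length`, which describe the same range in two ways. The sketch below shows how they relate, under the assumptions that exactly one of the two is supplied and that `end` is exclusive. The get_range documentation is the authority on both points. `resolve_range` is a hypothetical helper written for illustration, not an obstore function.

```python
from typing import Optional


def resolve_range(
    start: int, end: Optional[int] = None, length: Optional[int] = None
) -> tuple[int, int]:
    """Normalize (start, end) or (start, length) to (start, exclusive end).

    Hypothetical helper; assumes exactly one of `end` / `length` is given.
    """
    if (end is None) == (length is None):
        raise ValueError("pass exactly one of end or length")
    if end is None:
        end = start + length
    if start < 0 or end < start:
        raise ValueError("range must satisfy 0 <= start <= end")
    return start, end


# Both calls describe the same three bytes: offsets 5, 6 and 7.
print(resolve_range(5, end=8))
print(resolve_range(5, length=3))
```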
get_ranges
get_ranges(
path: str,
*,
starts: Sequence[int],
ends: Sequence[int] | None = None,
lengths: Sequence[int] | None = None,
) -> list[Bytes]
Return the bytes stored at the specified location in the given byte ranges.
Refer to the documentation for get_ranges.
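A common use of `starts` and `ends` is to read a file in fixed-size pieces. The sketch below computes the two sequences for a file of known size, assuming `ends` are exclusive offsets. `chunk_ranges` is a hypothetical helper, and the commented call at the end uses a placeholder path.

```python
def chunk_ranges(size: int, chunk: int) -> tuple[list[int], list[int]]:
    """Split [0, size) into consecutive ranges of at most `chunk` bytes.

    Hypothetical helper; returns (starts, ends) with exclusive ends.
    """
    if size < 0 or chunk <= 0:
        raise ValueError("size must be >= 0 and chunk must be > 0")
    starts = list(range(0, size, chunk))
    # The final range is clipped so it never runs past the end of the file.
    ends = [min(start + chunk, size) for start in starts]
    return starts, ends


starts, ends = chunk_ranges(10, 4)
print(starts, ends)

# With a store in hand, the sequences could be passed straight through:
# buffers = store.get_ranges("data.bin", starts=starts, ends=ends)
```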
get_ranges_async
async
get_ranges_async(
path: str,
*,
starts: Sequence[int],
ends: Sequence[int] | None = None,
lengths: Sequence[int] | None = None,
) -> list[Bytes]
Call get_ranges asynchronously.
Refer to the documentation for get_ranges.
head
head(path: str) -> ObjectMeta
Return the metadata for the specified location.
Refer to the documentation for head.
head_async
async
head_async(path: str) -> ObjectMeta
Call head asynchronously.
Refer to the documentation for head_async.
list
list(
prefix: str | None = None,
*,
offset: str | None = None,
chunk_size: int = 50,
return_arrow: Literal[True],
) -> ListStream[RecordBatch]
list(
prefix: str | None = None,
*,
offset: str | None = None,
chunk_size: int = 50,
return_arrow: Literal[False] = False,
) -> ListStream[Sequence[ObjectMeta]]
list(
prefix: str | None = None,
*,
offset: str | None = None,
chunk_size: int = 50,
return_arrow: bool = False,
) -> ListStream[RecordBatch] | ListStream[Sequence[ObjectMeta]]
List all the objects with the given prefix.
Refer to the documentation for list.
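With `return_arrow=False`, the return type is a stream of batches, each a sequence of `ObjectMeta`, rather than a flat sequence of objects. The sketch below flattens such a stream and sums object sizes. It assumes each `ObjectMeta` is dict-like with `"path"` and `"size"` keys, and it uses a plain list of lists as a stand-in for the stream so that it runs without a store. `iter_objects` and `total_size` are hypothetical helpers.

```python
from typing import Any, Iterable, Iterator, Mapping, Sequence

Meta = Mapping[str, Any]


def iter_objects(stream: Iterable[Sequence[Meta]]) -> Iterator[Meta]:
    """Yield every metadata entry from a stream of batches."""
    for batch in stream:
        yield from batch


def total_size(stream: Iterable[Sequence[Meta]]) -> int:
    """Sum the "size" field over all entries in the stream."""
    return sum(meta["size"] for meta in iter_objects(stream))


# Stand-in for something like store.list(prefix="data/", chunk_size=2).
fake_stream = [
    [{"path": "data/a.bin", "size": 3}, {"path": "data/b.bin", "size": 5}],
    [{"path": "data/c.bin", "size": 2}],
]
print(total_size(fake_stream))
```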
list_async
list_async(
prefix: str | None = None,
*,
offset: str | None = None,
chunk_size: int = 50,
return_arrow: Literal[True],
) -> ListStream[RecordBatch]
list_async(
prefix: str | None = None,
*,
offset: str | None = None,
chunk_size: int = 50,
return_arrow: Literal[False] = False,
) -> ListStream[Sequence[ObjectMeta]]
list_async(
prefix: str | None = None,
*,
offset: str | None = None,
chunk_size: int = 50,
return_arrow: bool = False,
) -> ListStream[RecordBatch] | ListStream[Sequence[ObjectMeta]]
List all the objects with the given prefix.
Refer to the documentation for list.
Note
This is an alias for list, provided to match the ListAsync protocol in
obspec. There is no difference in functionality between this and the list
method.
list_with_delimiter
list_with_delimiter(
prefix: str | None = None, *, return_arrow: Literal[True]
) -> ListResult[Table]
list_with_delimiter(
prefix: str | None = None, *, return_arrow: Literal[False] = False
) -> ListResult[Sequence[ObjectMeta]]
list_with_delimiter(
prefix: str | None = None, *, return_arrow: bool = False
) -> ListResult[Table] | ListResult[Sequence[ObjectMeta]]
List objects with the given prefix and an implementation-specific delimiter.
Refer to the documentation for list_with_delimiter.
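Unlike `list`, this call returns a single result that separates "directories" from objects. The sketch below splits such a result into the two groups. It assumes `ListResult` is dict-like with `"common_prefixes"` and `"objects"` keys and that `return_arrow=False`. A plain dict stands in for the real result, and `split_listing` is a hypothetical helper.

```python
from typing import Any, Mapping


def split_listing(result: Mapping[str, Any]) -> tuple[list[str], list[str]]:
    """Return (directory prefixes, object paths) from a listing result."""
    directories = list(result["common_prefixes"])
    files = [meta["path"] for meta in result["objects"]]
    return directories, files


# Stand-in for something like store.list_with_delimiter("data").
fake_result = {
    "common_prefixes": ["data/raw", "data/clean"],
    "objects": [{"path": "data/README.md", "size": 120}],
}
directories, files = split_listing(fake_result)
print(directories, files)
```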
list_with_delimiter_async
async
list_with_delimiter_async(
prefix: str | None = None, *, return_arrow: Literal[True]
) -> ListResult[Table]
list_with_delimiter_async(
prefix: str | None = None, *, return_arrow: Literal[False] = False
) -> ListResult[Sequence[ObjectMeta]]
list_with_delimiter_async(
prefix: str | None = None, *, return_arrow: bool = False
) -> ListResult[Table] | ListResult[Sequence[ObjectMeta]]
Call list_with_delimiter asynchronously.
Refer to the documentation for list_with_delimiter.
put
put(
path: str,
file: IO[bytes]
| Path
| bytes
| Buffer
| Iterator[Buffer]
| Iterable[Buffer],
*,
attributes: Attributes | None = None,
tags: dict[str, str] | None = None,
mode: PutMode | None = None,
use_multipart: bool | None = None,
chunk_size: int = 5 * 1024 * 1024,
max_concurrency: int = 12,
) -> PutResult
Save the provided bytes to the specified location.
Refer to the documentation for put.
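Because `file` accepts an iterator of buffers, data can be handed over piece by piece instead of as one `bytes` object. The sketch below produces such an iterator from an in-memory payload using `memoryview` slices, which implement the buffer protocol without copying. `iter_chunks` is a hypothetical helper, and its default mirrors the `chunk_size` default in the signature above only for illustration.

```python
from typing import Iterator


def iter_chunks(data: bytes, chunk_size: int = 5 * 1024 * 1024) -> Iterator[memoryview]:
    """Yield zero-copy slices of `data`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    view = memoryview(data)
    for offset in range(0, len(view), chunk_size):
        yield view[offset : offset + chunk_size]


chunks = list(iter_chunks(b"abcdefghij", chunk_size=4))
print([bytes(chunk) for chunk in chunks])

# With a store in hand, the iterator could be passed as the `file` argument:
# store.put("letters.txt", iter_chunks(payload))
```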
put_async
async
put_async(
path: str,
file: IO[bytes]
| Path
| bytes
| Buffer
| AsyncIterator[Buffer]
| AsyncIterable[Buffer]
| Iterator[Buffer]
| Iterable[Buffer],
*,
attributes: Attributes | None = None,
tags: dict[str, str] | None = None,
mode: PutMode | None = None,
use_multipart: bool | None = None,
chunk_size: int = 5 * 1024 * 1024,
max_concurrency: int = 12,
) -> PutResult
Call put asynchronously.
Refer to the documentation for put. In addition to what the
synchronous put allows for the file parameter, this also supports an async
iterator or iterable of objects implementing the Python buffer protocol.
This means, for example, you can pass the result of get_async directly to
put_async, and the request will be streamed through Python during the put
operation:
import obstore as obs
# This only constructs the stream; it doesn't materialize the data in memory
resp = await obs.get_async(store1, path1)
# A streaming upload is created to copy the file to path2
await obs.put_async(store2, path2, resp)
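An async generator is another way to supply the async iterator of buffers that `put_async` accepts. The following is a minimal sketch that runs without a store: `aiter_chunks` and `collect` are hypothetical helpers, and `collect` merely stands in for the consumer so the output can be inspected.

```python
import asyncio
from typing import AsyncIterator


async def aiter_chunks(data: bytes, chunk_size: int) -> AsyncIterator[bytes]:
    """Asynchronously yield `data` in pieces of at most `chunk_size` bytes."""
    for offset in range(0, len(data), chunk_size):
        yield data[offset : offset + chunk_size]
        # Hand control back to the event loop between chunks.
        await asyncio.sleep(0)


async def collect(source: AsyncIterator[bytes]) -> list[bytes]:
    """Drain an async iterator into a list (stand-in for a real consumer)."""
    return [bytes(chunk) async for chunk in source]


parts = asyncio.run(collect(aiter_chunks(b"abcdefghij", 4)))
print(parts)

# Inside a coroutine, the generator could be passed as the `file` argument:
# await store.put_async("letters.txt", aiter_chunks(payload, 4))
```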