Get

obstore.get

get(
    store: ObjectStore, path: str, *, options: GetOptions | None = None
) -> GetResult

Return the bytes that are stored at the specified location.

Parameters:

  • store (ObjectStore) –

    The ObjectStore instance to use.

  • path (str) –

    The path within ObjectStore to retrieve.

  • options (GetOptions | None, default: None ) –

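    Options for the get request. Defaults to None.

Returns:

  • GetResult

    A GetResult, which can be materialized with bytes or bytes_async, or consumed in chunks with stream.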

obstore.get_async async

get_async(
    store: ObjectStore, path: str, *, options: GetOptions | None = None
) -> GetResult

Call get asynchronously.

Refer to the documentation for get.

obstore.get_range

get_range(store: ObjectStore, path: str, start: int, end: int) -> Buffer

Return the bytes that are stored at the specified location in the given byte range.

If the given range is zero-length or starts after the end of the object, an error will be returned. Additionally, if the range ends after the end of the object, the entire remainder of the object will be returned. Otherwise, the exact requested range will be returned.

Parameters:

  • store (ObjectStore) –

    The ObjectStore instance to use.

  • path (str) –

    The path within ObjectStore to retrieve.

  • start (int) –

    The start of the byte range.

  • end (int) –

    The end of the byte range (exclusive).

Returns:

  • Buffer

    A Buffer object implementing the Python buffer protocol, allowing zero-copy access to the underlying memory provided by Rust.
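
The range rules above can be modelled in a few lines of plain Python. This is a minimal sketch of the documented behaviour only, not obstore's implementation: `resolve_range` is a hypothetical helper, and treating a start equal to the object size as out of bounds is an assumption.

```python
from typing import Tuple


def resolve_range(start: int, end: int, size: int) -> Tuple[int, int]:
    """Return the (start, end) byte range that would actually be served.

    Hypothetical helper that models the documented rules; `end` is exclusive.
    """
    if end <= start:
        raise ValueError("range is zero-length")
    if start >= size:
        # Assumption: a start exactly at the object's size is also rejected.
        raise ValueError("range starts after the end of the object")
    # A range that runs past the end is truncated to the remainder of the object.
    return start, min(end, size)


# For a 100-byte object, asking for bytes 90..200 yields only the last 10 bytes.
served = resolve_range(90, 200, size=100)
```

Because `end` is exclusive, the number of bytes returned is `end - start` after truncation.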

obstore.get_range_async async

get_range_async(store: ObjectStore, path: str, start: int, end: int) -> Buffer

Call get_range asynchronously.

Refer to the documentation for get_range.

obstore.get_ranges

get_ranges(
    store: ObjectStore, path: str, starts: Sequence[int], ends: Sequence[int]
) -> List[Buffer]

Return the bytes that are stored at the specified location in the given byte ranges.

To improve performance this will:

  • Combine ranges less than 10MB apart into a single call to fetch
  • Make multiple fetch requests in parallel (up to maximum of 10)

Parameters:

  • store (ObjectStore) –

    The ObjectStore instance to use.

  • path (str) –

    The path within ObjectStore to retrieve.

  • starts (Sequence[int]) –

    A sequence of int giving the start offset of each range.

  • ends (Sequence[int]) –

    A sequence of int giving the end offset (exclusive) of each range.

Returns:

  • List[Buffer]

    A sequence of Buffer, one for each range. This Buffer object implements the Python buffer protocol, allowing zero-copy access to the underlying memory provided by Rust.
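
To illustrate the range-combining step described above, here is a minimal pure-Python sketch. It is not obstore's implementation: `coalesce_ranges` and `max_gap` are hypothetical names, the 10MB threshold is taken from the description above, and the parallel fetch limit is not modelled.

```python
from typing import List, Sequence, Tuple

TEN_MB = 10 * 1024 * 1024


def coalesce_ranges(
    starts: Sequence[int], ends: Sequence[int], max_gap: int = TEN_MB
) -> List[Tuple[int, int]]:
    """Merge (start, end) ranges whose gap is smaller than `max_gap` bytes."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(zip(starts, ends)):
        if merged and start - merged[-1][1] < max_gap:
            # Close enough to the previous range: extend it instead of
            # issuing a separate fetch.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Two nearby ranges collapse into one fetch; the distant one stays separate.
fetches = coalesce_ranges([0, 100, 50_000_000], [10, 200, 50_000_100])
```

In a scheme like this, each requested range would then be sliced back out of the larger merged fetch, so the caller still receives one result per requested range.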

obstore.get_ranges_async async

get_ranges_async(
    store: ObjectStore, path: str, starts: Sequence[int], ends: Sequence[int]
) -> List[Buffer]

Call get_ranges asynchronously.

Refer to the documentation for get_ranges.

obstore.GetOptions

Bases: TypedDict

Options for a get request.

All options are optional.

head instance-attribute

head: bool

Request transfer of no content.

datatracker.ietf.org/doc/html/rfc9110#name-head

if_match instance-attribute

if_match: str | None

Request will succeed if the ObjectMeta::e_tag matches, otherwise returning PreconditionError.

See datatracker.ietf.org/doc/html/rfc9110#name-if-match

Examples:

If-Match: "xyzzy"
If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-Match: *

if_modified_since instance-attribute

if_modified_since: datetime | None

Request will succeed if the object has been modified since the given timestamp, otherwise returning NotModifiedError.

Some stores, such as S3, will only return NotModified for exact timestamp matches, instead of for any timestamp greater than or equal.

datatracker.ietf.org/doc/html/rfc9110#section-13.1.3

if_none_match instance-attribute

if_none_match: str | None

Request will succeed if the ObjectMeta::e_tag does not match, otherwise returning NotModifiedError.

See datatracker.ietf.org/doc/html/rfc9110#section-13.1.2

Examples:

If-None-Match: "xyzzy"
If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-None-Match: *

if_unmodified_since instance-attribute

if_unmodified_since: datetime | None

Request will succeed if the object has not been modified since the given timestamp, otherwise returning PreconditionError.

datatracker.ietf.org/doc/html/rfc9110#section-13.1.4
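
As a rough illustration of the two timestamp conditions, the sketch below follows the If-Modified-Since and If-Unmodified-Since semantics of RFC 9110 (sections 13.1.3 and 13.1.4). `evaluate_time_conditions` is a hypothetical helper that returns the name of the outcome as a string. It does not call obstore, and individual stores may behave differently.

```python
from datetime import datetime, timezone
from typing import Optional


def evaluate_time_conditions(
    last_modified: datetime,
    if_modified_since: Optional[datetime] = None,
    if_unmodified_since: Optional[datetime] = None,
) -> str:
    """Return "ok", "NotModifiedError", or "PreconditionError"."""
    # If-Unmodified-Since fails when the object changed after the timestamp.
    if if_unmodified_since is not None and last_modified > if_unmodified_since:
        return "PreconditionError"
    # If-Modified-Since fails when the object did not change after the timestamp.
    if if_modified_since is not None and last_modified <= if_modified_since:
        return "NotModifiedError"
    return "ok"


modified_at = datetime(2024, 6, 1, tzinfo=timezone.utc)
earlier = datetime(2024, 1, 1, tzinfo=timezone.utc)
later = datetime(2024, 12, 1, tzinfo=timezone.utc)

changed_since_january = evaluate_time_conditions(modified_at, if_modified_since=earlier)
unchanged_since_december = evaluate_time_conditions(modified_at, if_modified_since=later)
```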

range instance-attribute

Request transfer of only the specified range of bytes, otherwise returning NotModifiedError.

The semantics of this tuple are:

  • (int, int): Request a specific range of bytes (start, end).

    If the given range is zero-length or starts after the end of the object, an error will be returned. Additionally, if the range ends after the end of the object, the entire remainder of the object will be returned. Otherwise, the exact requested range will be returned.

    The end offset is exclusive.

  • {"offset": int}: Request all bytes starting from a given byte offset.

    This is equivalent to bytes={int}- as an HTTP header.

  • {"suffix": int}: Request the last int bytes. Note that here, int is the size of the request, not the byte offset.

    This is equivalent to bytes=-{int} as an HTTP header.

datatracker.ietf.org/doc/html/rfc9110#name-range
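
The three forms of range above map onto HTTP Range header values as sketched below. `to_http_range` is a hypothetical helper written for illustration and is not part of obstore. The one detail that needs care is that the (start, end) tuple uses an exclusive end, while the HTTP byte range is inclusive on both sides.

```python
from typing import Dict, Sequence, Union

RangeOption = Union[Sequence[int], Dict[str, int]]


def to_http_range(range_option: RangeOption) -> str:
    """Translate a GetOptions-style range into an HTTP Range header value."""
    if isinstance(range_option, dict):
        if "offset" in range_option:
            # All bytes starting from the given offset.
            return f"bytes={range_option['offset']}-"
        if "suffix" in range_option:
            # The last N bytes; N is a length, not an offset.
            return f"bytes=-{range_option['suffix']}"
        raise ValueError("expected an 'offset' or 'suffix' key")
    start, end = range_option
    # HTTP ranges are inclusive, so the exclusive end is reduced by one.
    return f"bytes={start}-{end - 1}"


first_ten = to_http_range((0, 10))
from_offset = to_http_range({"offset": 5})
last_hundred = to_http_range({"suffix": 100})
```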

version instance-attribute

version: str | None

Request a particular object version.

obstore.GetResult

Result for a get request.

You can materialize the entire buffer by using either bytes or bytes_async, or you can stream the result using stream. __iter__ and __aiter__ are implemented as aliases to stream, so you can alternatively call iter() or aiter() on GetResult to start an iterator.

Using as an async iterator:

import obstore as obs

# `store` is an existing ObjectStore and `path` is the object's path.
resp = await obs.get_async(store, path)
# 5MB chunk size in stream
stream = resp.stream(min_chunk_size=5 * 1024 * 1024)
async for buf in stream:
    print(len(buf))

Using as a sync iterator:

import obstore as obs

# `store` is an existing ObjectStore and `path` is the object's path.
resp = obs.get(store, path)
# 20MB chunk size in stream
stream = resp.stream(min_chunk_size=20 * 1024 * 1024)
for buf in stream:
    print(len(buf))

Note that after calling bytes, bytes_async, or stream, you can no longer use any other method or attribute of this object, such as meta.

attributes property

attributes: Attributes

Additional object attributes.

This must be accessed before calling stream, bytes, or bytes_async.

meta property

meta: ObjectMeta

The ObjectMeta for this object.

This must be accessed before calling stream, bytes, or bytes_async.

range property

range: Tuple[int, int]

The range of bytes returned by this request.

This must be accessed before calling stream, bytes, or bytes_async.

__aiter__

__aiter__() -> BytesStream

Return a chunked stream over the result's bytes with the default (10MB) chunk size.

__iter__

__iter__() -> BytesStream

Return a chunked stream over the result's bytes with the default (10MB) chunk size.

bytes

bytes() -> bytes

Collect the data into bytes.

bytes_async async

bytes_async() -> bytes

Collect the data into bytes asynchronously.

stream

stream(min_chunk_size: int = 10 * 1024 * 1024) -> BytesStream

Return a chunked stream over the result's bytes.

Parameters:

  • min_chunk_size (int, default: 10 * 1024 * 1024 ) –

    The minimum size in bytes for each chunk in the returned BytesStream. All chunks except for the last chunk will be at least this size. Defaults to 10 * 1024 * 1024 (10MB).

Returns:

  • BytesStream

    A chunked stream
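
The min_chunk_size guarantee (every chunk except the last is at least that large) can be pictured as a buffering step over smaller incoming pieces. The generator below is a pure-Python sketch of that idea under stated assumptions: `rechunk` is a hypothetical name, and this is not how BytesStream is implemented.

```python
from typing import Iterable, Iterator


def rechunk(chunks: Iterable[bytes], min_chunk_size: int) -> Iterator[bytes]:
    """Yield chunks of at least `min_chunk_size` bytes, except possibly the last."""
    pending = bytearray()
    for chunk in chunks:
        pending.extend(chunk)
        if len(pending) >= min_chunk_size:
            # Enough data has accumulated: emit it and start a new chunk.
            yield bytes(pending)
            pending = bytearray()
    if pending:
        # Whatever is left over becomes the final, possibly smaller, chunk.
        yield bytes(pending)


pieces = list(rechunk([b"ab", b"c", b"defg", b"h"], min_chunk_size=3))
```

A larger min_chunk_size means fewer, larger chunks per iteration, at the cost of holding more data in memory at once.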

obstore.Buffer

Bases: Buffer

A buffer implementing the Python buffer protocol, allowing zero-copy access to the underlying memory provided by Rust.

You can pass this to memoryview for a zero-copy view into the underlying data.

as_bytes

as_bytes() -> bytes

Copy this buffer into a Python bytes object.
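
The difference between a zero-copy view and a copy can be shown with the standard library alone. In this sketch a bytearray stands in for the Buffer (an assumption made so the sharing is observable by mutating the backing memory); the same memoryview calls apply to any object implementing the buffer protocol.

```python
# Stand-in for a Buffer: any object exposing the Python buffer protocol.
backing = bytearray(b"hello world")

# memoryview does not copy; slicing a memoryview does not copy either.
view = memoryview(backing)
window = view[6:11]

# A change to the backing memory is visible through the view...
backing[6] = ord("W")
first_byte_seen = bytes(window[:1])

# ...whereas converting to bytes takes a copy that no longer tracks the source,
# which is the same trade-off as calling as_bytes on a Buffer.
snapshot = bytes(view[:5])
backing[0] = ord("J")
```

After the final assignment, `snapshot` still holds the original first five bytes while `view` reflects the change.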

obstore.OffsetRange

Bases: TypedDict

Request all bytes starting from a given byte offset.

offset instance-attribute

offset: int

The byte offset for the offset range request.

obstore.SuffixRange

Bases: TypedDict

Request up to the last n bytes.

suffix instance-attribute

suffix: int

The number of bytes to request from the end of the object.