Get¶

obstore.get ¶

get(
    store: ObjectStore, path: str, *, options: GetOptions | None = None
) -> GetResult

Return the bytes that are stored at the specified location.

Parameters:

store (ObjectStore) –

The ObjectStore instance to use.
path (str) –

The path within ObjectStore to retrieve.
options (GetOptions | None, default: None ) –

Options for accessing the file. Defaults to None.

Returns:

GetResult –

GetResult

obstore.get_async `async` ¶

get_async(
    store: ObjectStore, path: str, *, options: GetOptions | None = None
) -> GetResult

Call get asynchronously.

Refer to the documentation for get.

obstore.get_range ¶

get_range(
    store: ObjectStore,
    path: str,
    *,
    start: int,
    end: int | None = None,
    length: int | None = None,
) -> Bytes

Return the bytes that are stored at the specified location in the given byte range.

If the given range is zero-length or starts after the end of the object, an error will be returned. Additionally, if the range ends after the end of the object, the entire remainder of the object will be returned. Otherwise, the exact requested range will be returned.

Parameters:

store (ObjectStore) –

The ObjectStore instance to use.
path (str) –

The path within ObjectStore to retrieve.

Other Parameters:

start (int) –

The start of the byte range.
end (int | None) –

The end of the byte range (exclusive). Either end or length must be non-None.
length (int | None) –

The number of bytes of the byte range. Either end or length must be non-None.

Returns:

Bytes –

A Bytes object implementing the Python buffer protocol, allowing zero-copy access to the underlying memory provided by Rust.
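
The rules for start, end, and length described above can be sketched in plain Python. This is only an illustration of the documented behavior, not obstore's implementation: the helper name resolve_range, the object_size argument, and the use of ValueError are assumptions made for the example.

```python
from typing import Optional, Tuple


def resolve_range(
    start: int,
    object_size: int,
    end: Optional[int] = None,
    length: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the (start, stop) byte range that would be read; stop is exclusive."""
    if end is None and length is None:
        raise ValueError("either end or length must be provided")
    if end is None:
        # Only length was given, so derive the exclusive end offset.
        # (If both are given, this sketch simply prefers end.)
        end = start + length
    if end <= start:
        raise ValueError("range is empty")
    if start >= object_size:
        raise ValueError("range starts after the end of the object")
    # A range that runs past the end is truncated to the remainder of the object.
    return start, min(end, object_size)
```

For a 100-byte object, resolve_range(90, 100, length=50) yields (90, 100): the request is cut short at the end of the object rather than rejected.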

obstore.get_range_async `async` ¶

get_range_async(
    store: ObjectStore,
    path: str,
    *,
    start: int,
    end: int | None = None,
    length: int | None = None,
) -> Bytes

Call get_range asynchronously.

Refer to the documentation for get_range.

obstore.get_ranges ¶

get_ranges(
    store: ObjectStore,
    path: str,
    *,
    starts: Sequence[int],
    ends: Sequence[int] | None = None,
    lengths: Sequence[int] | None = None,
) -> list[Bytes]

Return the bytes stored at the specified location in the given byte ranges.

To improve performance, this will:

- Transparently combine ranges less than 1MB apart into a single underlying request
- Make multiple fetch requests in parallel (up to a maximum of 10)
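
The range-combining step can be pictured with a small sketch. It is not the library's code: the helper name coalesce_ranges is hypothetical, whether the 1MB boundary is inclusive is an assumption, and the parallel fetching is left out entirely.

```python
from typing import List, Sequence, Tuple

COALESCE_GAP = 1024 * 1024  # 1MB, the threshold named in the text above


def coalesce_ranges(
    ranges: Sequence[Tuple[int, int]], gap: int = COALESCE_GAP
) -> List[Tuple[int, int]]:
    """Merge (start, end) ranges that lie less than `gap` bytes apart."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] < gap:
            # Close enough to the previous range: extend it instead of
            # issuing a separate request.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

With this sketch, the ranges (0, 100) and (200, 300) collapse into one request for (0, 300), while a range starting at byte 5,000,000 stays separate. Each caller still receives only the bytes it asked for, one buffer per requested range.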

Parameters:

store (ObjectStore) –

The ObjectStore instance to use.
path (str) –

The path within ObjectStore to retrieve.

Other Parameters:

starts (Sequence[int]) –

A sequence of int giving the start offset of each byte range.
ends (Sequence[int] | None) –

A sequence of int giving the end offset of each byte range (exclusive). Either ends or lengths must be non-None.
lengths (Sequence[int] | None) –

A sequence of int with the number of bytes of each byte range. Either ends or lengths must be non-None.

Returns:

list[Bytes] –

A sequence of Bytes, one for each range. This Bytes object implements the Python buffer protocol, allowing zero-copy access to the underlying memory provided by Rust.

obstore.get_ranges_async `async` ¶

get_ranges_async(
    store: ObjectStore,
    path: str,
    *,
    starts: Sequence[int],
    ends: Sequence[int] | None = None,
    lengths: Sequence[int] | None = None,
) -> list[Bytes]

Call get_ranges asynchronously.

Refer to the documentation for get_ranges.

obstore.GetOptions ¶

Bases: TypedDict

Options for a get request.

All options are optional.

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore import GetOptions
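
Because GetOptions is a TypedDict, an ordinary dict literal satisfies it at runtime, and the import is needed only for the annotation. A minimal sketch, assuming a hypothetical helper named revalidation_options and an ETag obtained from an earlier request:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers; this import never runs.
    from obstore import GetOptions


def revalidation_options(etag: str) -> "GetOptions":
    """Build options that ask the store to skip the transfer if the ETag still matches."""
    # The quoted annotation is not evaluated at runtime, so no runtime import is needed.
    return {"if_none_match": etag}


options = revalidation_options('"xyzzy"')
# The dict would then be passed as get(store, path, options=options).
```

Every key is optional, so the dict only needs to contain the options that apply to the request.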

head `instance-attribute` ¶

head: bool

Request transfer of no content. See datatracker.ietf.org/doc/html/rfc9110#name-head

if_match `instance-attribute` ¶

if_match: str | None

Request will succeed if the ObjectMeta::e_tag matches, otherwise returning PreconditionError. See datatracker.ietf.org/doc/html/rfc9110#name-if-match

Examples:

If-Match: "xyzzy"
If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-Match: *

if_modified_since `instance-attribute` ¶

if_modified_since: datetime | None

Request will succeed if the object has been modified since the given time. See datatracker.ietf.org/doc/html/rfc9110#section-13.1.3

if_none_match `instance-attribute` ¶

if_none_match: str | None

Request will succeed if the ObjectMeta::e_tag does not match, otherwise returning NotModifiedError. See datatracker.ietf.org/doc/html/rfc9110#section-13.1.2

Examples:

If-None-Match: "xyzzy"
If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-None-Match: *

if_unmodified_since `instance-attribute` ¶

if_unmodified_since: datetime | None

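Request will succeed if the object has not been modified since the given time, otherwise returning PreconditionError. Some stores, such as S3, will only return NotModified for exact timestamp matches, instead of for any timestamp greater than or equal. See datatracker.ietf.org/doc/html/rfc9110#section-13.1.4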

range `instance-attribute` ¶

range: tuple[int, int] | Sequence[int] | OffsetRange | SuffixRange

Request transfer of only the specified range of bytes, otherwise returning NotModifiedError. The semantics of this value are:

- (int, int): Request a specific range of bytes (start, end). If the given range is zero-length or starts after the end of the object, an error will be returned. Additionally, if the range ends after the end of the object, the entire remainder of the object will be returned. Otherwise, the exact requested range will be returned. The end offset is exclusive.
- {"offset": int}: Request all bytes starting from a given byte offset. This is equivalent to bytes={int}- as an HTTP header.
- {"suffix": int}: Request the last int bytes. Note that here, int is the size of the request, not the byte offset. This is equivalent to bytes=-{int} as an HTTP header.

See datatracker.ietf.org/doc/html/rfc9110#name-range
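
The correspondence between the three range forms and an HTTP Range header can be sketched as follows. The helper name to_http_range is hypothetical and this is not how the library builds requests; it only illustrates that the tuple's end offset is exclusive while HTTP byte ranges are inclusive.

```python
from typing import Dict, Tuple, Union

RangeSpec = Union[Tuple[int, int], Dict[str, int]]


def to_http_range(spec: RangeSpec) -> str:
    """Render a range option as the value of an HTTP Range header."""
    if isinstance(spec, dict):
        if "offset" in spec:
            return f"bytes={spec['offset']}-"
        if "suffix" in spec:
            return f"bytes=-{spec['suffix']}"
        raise ValueError("expected an 'offset' or 'suffix' key")
    start, end = spec
    if end <= start:
        raise ValueError("range is empty")
    # HTTP byte ranges are inclusive, while the tuple's end offset is exclusive.
    return f"bytes={start}-{end - 1}"
```

So a range of (0, 100) covers the first 100 bytes and corresponds to bytes=0-99.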

version `instance-attribute` ¶

version: str | None

Request a particular object version

obstore.GetResult ¶

Result for a get request.

You can materialize the entire buffer by using either bytes or bytes_async, or you can stream the result using stream. __iter__ and __aiter__ are implemented as aliases to stream, so you can alternatively call iter() or aiter() on GetResult to start an iterator.

Using as an async iterator:

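import obstore as obs

# Run inside an async function; store and path are defined elsewhere.
resp = await obs.get_async(store, path)
# Stream in chunks of at least 5MB
stream = resp.stream(min_chunk_size=5 * 1024 * 1024)
async for buf in stream:
    print(len(buf))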

Using as a sync iterator:

import obstore as obs

# store and path are defined elsewhere.
resp = obs.get(store, path)
# Stream in chunks of at least 20MB
stream = resp.stream(min_chunk_size=20 * 1024 * 1024)
for buf in stream:
    print(len(buf))

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore import GetResult

attributes `property` ¶

attributes: Attributes

Additional object attributes.

meta `property` ¶

meta: ObjectMeta

The ObjectMeta for this object.

range `property` ¶

range: tuple[int, int]

The range of bytes returned by this request.

Note that this is (start, stop) not (start, length).

__aiter__ ¶

__aiter__() -> BytesStream

Return a chunked stream over the result's bytes.

Uses the default (10MB) chunk size.

__iter__ ¶

__iter__() -> BytesStream

Return a chunked stream over the result's bytes.

Uses the default (10MB) chunk size.

buffer ¶

buffer() -> Bytes

Collect the data into a Bytes object.

This is an alias of the bytes() method to comply with the obspec.Get protocol.

buffer_async `async` ¶

buffer_async() -> Bytes

Collect the data into a Bytes object.

This is an alias of the bytes_async() method to comply with the obspec.GetAsync protocol.

bytes ¶

bytes() -> Bytes

Collect the data into a Bytes object.

The returned object implements the Python buffer protocol. You can copy the buffer into Python memory by passing it to the built-in bytes.

bytes_async `async` ¶

bytes_async() -> Bytes

Collect the data into a Bytes object.

The returned object implements the Python buffer protocol. You can copy the buffer into Python memory by passing it to the built-in bytes.

stream ¶

stream(min_chunk_size: int = 10 * 1024 * 1024) -> BytesStream

Return a chunked stream over the result's bytes.

Parameters:

min_chunk_size (int, default: 10 * 1024 * 1024 ) –

The minimum size in bytes for each chunk in the returned BytesStream. All chunks except for the last chunk will be at least this size. Defaults to 10*1024*1024 (10MB).

Returns:

BytesStream –

A chunked stream
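
The effect of min_chunk_size can be pictured with a plain-Python sketch that regroups small incoming pieces into chunks of at least the requested size. The function name rechunk is hypothetical and the real stream is produced on the Rust side; this only illustrates the "all chunks except the last" guarantee.

```python
from typing import Iterable, Iterator


def rechunk(chunks: Iterable[bytes], min_chunk_size: int) -> Iterator[bytes]:
    """Yield chunks of at least min_chunk_size bytes, except possibly the last."""
    pending = bytearray()
    for chunk in chunks:
        pending.extend(chunk)
        if len(pending) >= min_chunk_size:
            # Enough data has accumulated: emit a copy and start over.
            yield bytes(pending)
            pending.clear()
    if pending:
        # Whatever is left over becomes the final, possibly smaller, chunk.
        yield bytes(pending)
```

Chunks can be larger than the minimum, since incoming pieces are never split; only the final chunk may be smaller.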

obstore.BytesStream ¶

A stream of bytes that supports both sync and async iteration.

Request timeouts

The underlying stream needs to stay alive until the last chunk is polled. If the file is large, it may exceed the default timeout of 30 seconds. In this case, you may see an error like:

GenericError: Generic {
    store: "HTTP",
    source: reqwest::Error {
        kind: Decode,
        source: reqwest::Error {
            kind: Body,
            source: TimedOut,
        },
    },
}

To fix this, set the timeout parameter in the client_options passed when creating the store.
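
A configuration fragment along these lines is one way to raise the limit. The two-minute value is a placeholder, and the assumption here is that the timeout key accepts a duration string such as "120s"; check the client options reference for the accepted formats.

```python
# Placeholder value: allow each request up to two minutes before timing out.
client_options = {"timeout": "120s"}

# The dict is then supplied through the client_options argument
# when the store is constructed.
```

Pick a value that comfortably covers the time needed to stream the largest object you expect to read.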

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore import BytesStream

__aiter__ ¶

__aiter__() -> BytesStream

Return Self as an async iterator.

__anext__ `async` ¶

__anext__() -> bytes

Return the next chunk of bytes in the stream.

__iter__ ¶

__iter__() -> BytesStream

Return Self as a sync iterator.

__next__ ¶

__next__() -> bytes

Return the next chunk of bytes in the stream.

obstore.Bytes ¶

Bases: Buffer

A bytes-like buffer.

This implements the Python buffer protocol, allowing zero-copy access to underlying Rust memory.

You can pass this to memoryview for a zero-copy view into the underlying data or to bytes to copy the underlying data into a Python bytes.
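
The difference between a zero-copy view and a copy can be demonstrated with standard-library types alone. In this sketch a bytearray stands in for obstore.Bytes, purely because it is mutable and so makes the memory sharing visible; a Bytes object would be passed to memoryview and bytes in the same way.

```python
# bytearray is a stand-in for obstore.Bytes: both expose the buffer protocol.
buf = bytearray(b"hello world")

view = memoryview(buf)  # zero-copy: shares memory with buf
copy = bytes(buf)       # copies the data into a new Python bytes object

# Mutate the underlying memory to show which object shares it.
buf[0] = ord("j")

first_word_from_view = bytes(view[:5])  # b"jello": the view sees the change
first_word_from_copy = copy[:5]         # b"hello": the copy does not
```

Slicing a memoryview is also zero-copy, so a large buffer can be handed to parsers piece by piece without duplicating it.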

Many methods from the Python bytes class are implemented on this class.

__init__ ¶

__init__(buf: Buffer = b'') -> None

Construct a new Bytes object.

This will be a zero-copy view on the Python byte slice.

isalnum ¶

isalnum() -> bool

Return True if all bytes in the sequence are alphabetical ASCII characters or ASCII decimal digits and the sequence is not empty, False otherwise.

Alphabetic ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'. ASCII decimal digits are those byte values in the sequence b'0123456789'.

isalpha ¶

isalpha() -> bool

Return True if all bytes in the sequence are alphabetic ASCII characters and the sequence is not empty, False otherwise.

Alphabetic ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'.

isascii ¶

isascii() -> bool

Return True if the sequence is empty or all bytes in the sequence are ASCII, False otherwise.

ASCII bytes are in the range 0-0x7F.

isdigit ¶

isdigit() -> bool

Return True if all bytes in the sequence are ASCII decimal digits and the sequence is not empty, False otherwise.

ASCII decimal digits are those byte values in the sequence b'0123456789'.

islower ¶

islower() -> bool

Return True if there is at least one lowercase ASCII character in the sequence and no uppercase ASCII characters, False otherwise.

isspace ¶

isspace() -> bool

Return True if all bytes in the sequence are ASCII whitespace and the sequence is not empty, False otherwise.

ASCII whitespace characters are those byte values in the sequence b' \t\n\r\x0b\f' (space, tab, newline, carriage return, vertical tab, form feed).

isupper ¶

isupper() -> bool

Return True if there is at least one uppercase alphabetic ASCII character in the sequence and no lowercase ASCII characters, False otherwise.

lower ¶

lower() -> Bytes

Return a copy of the sequence with all the uppercase ASCII characters converted to their corresponding lowercase counterpart.

removeprefix ¶

removeprefix(prefix: Buffer) -> Bytes

If the binary data starts with the prefix string, return bytes[len(prefix):]. Otherwise, return the original binary data.

removesuffix ¶

removesuffix(suffix: Buffer) -> Bytes

If the binary data ends with the suffix string and that suffix is not empty, return bytes[:-len(suffix)]. Otherwise, return the original binary data.

to_bytes ¶

to_bytes() -> bytes

Copy this buffer's contents into a Python bytes object.

upper ¶

upper() -> Bytes

Return a copy of the sequence with all the lowercase ASCII characters converted to their corresponding uppercase counterpart.
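
The methods above are described in the same terms as their counterparts on the built-in bytes type. The sketch below uses built-in bytes as a stand-in, on the assumption that obstore.Bytes behaves the same way for these calls (requires Python 3.9+ for removeprefix and removesuffix).

```python
# Built-in bytes is used here as a stand-in for obstore.Bytes.
data = b"prefix-Hello123"

stripped = data.removeprefix(b"prefix-")  # b"Hello123"
word = stripped.removesuffix(b"123")      # b"Hello"

checks = {
    "isalnum": stripped.isalnum(),        # letters and digits only -> True
    "isalpha": stripped.isalpha(),        # contains digits -> False
    "isdigit": b"123".isdigit(),          # True
    "islower": stripped.lower().islower(),  # b"hello123" -> True
    "empty_isascii": b"".isascii(),       # empty sequence counts as ASCII -> True
    "empty_isalnum": b"".isalnum(),       # empty sequence -> False
}
```

Note the asymmetry for empty input: isascii returns True for an empty sequence, while the other predicates return False.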

obstore.OffsetRange ¶

Bases: TypedDict

Request all bytes starting from a given byte offset.

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore import OffsetRange

offset `instance-attribute` ¶

offset: int

The byte offset for the offset range request.

obstore.SuffixRange ¶

Bases: TypedDict

Request up to the last n bytes.

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore import SuffixRange

suffix `instance-attribute` ¶

suffix: int

The number of bytes from the suffix to request.
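
Which bytes an offset or suffix request selects can be sketched with ordinary slicing on a local bytes object. The helper names apply_offset and apply_suffix are hypothetical, and the sketch only covers positive suffix sizes; how a store treats a zero-sized suffix is not described here.

```python
def apply_offset(data: bytes, offset: int) -> bytes:
    """Bytes selected by {"offset": offset}: everything from offset onward."""
    return data[offset:]


def apply_suffix(data: bytes, suffix: int) -> bytes:
    """Bytes selected by {"suffix": suffix}: up to the last `suffix` bytes."""
    if suffix <= 0:
        raise ValueError("this sketch only covers positive suffix sizes")
    # When suffix exceeds the object size, the slice returns the whole object,
    # which is why the request is described as "up to" the last n bytes.
    return data[-suffix:]
```

For a 10-byte object, {"offset": 7} and {"suffix": 3} select the same three trailing bytes; the difference is that the suffix form does not require knowing the object's size in advance.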
