
Put

obstore.put

put(
    store: ObjectStore,
    path: str,
    file: IO[bytes]
    | Path
    | bytes
    | Buffer
    | Iterator[Buffer]
    | Iterable[Buffer],
    *,
    attributes: Attributes | None = None,
    tags: Dict[str, str] | None = None,
    mode: PutMode | None = None,
    use_multipart: bool | None = None,
    chunk_size: int = 5 * 1024 * 1024,
    max_concurrency: int = 12,
) -> PutResult

Save the provided bytes to the specified location

The operation is guaranteed to be atomic: it will either successfully write the entirety of file to the given path, or fail. No clients will be able to observe a partially written object.

Aborted multipart uploads

This function will automatically use multipart uploads under the hood for large file objects (whenever the length of the file is greater than chunk_size) or for iterable or async iterable input.

Multipart uploads have a variety of advantages, including performance and reliability.

However, aborted or incomplete multipart uploads can leave partial content in a hidden state in your bucket, silently adding to your storage costs. It's recommended to configure lifecycle rules to automatically delete aborted multipart uploads. See, for example, the AWS S3 documentation on lifecycle configuration for incomplete multipart uploads.

You can turn off multipart uploads by passing use_multipart=False.

Parameters:

  • store (ObjectStore) –

    The ObjectStore instance to use.

  • path (str) –

    The path within ObjectStore for where to save the file.

  • file (IO[bytes] | Path | bytes | Buffer | Iterator[Buffer] | Iterable[Buffer]) –

    The object to upload. Supports various input:

    • A file-like object opened in binary read mode
    • A Path to a local file
    • A bytes object.
    • Any object implementing the Python buffer protocol (includes bytes but also memoryview, numpy arrays, and more).
    • An iterator or iterable of objects implementing the Python buffer protocol.
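    The accepted input types can be sketched as follows (against obstore's in-memory MemoryStore; paths and payloads are illustrative):

    ```python
    import io
    import tempfile
    from pathlib import Path

    import obstore as obs
    from obstore.store import MemoryStore

    store = MemoryStore()

    # A bytes object
    obs.put(store, "from-bytes", b"raw bytes")

    # Any object implementing the buffer protocol, e.g. a memoryview
    obs.put(store, "from-buffer", memoryview(b"buffered"))

    # A file-like object opened in binary mode
    obs.put(store, "from-file-like", io.BytesIO(b"file-like"))

    # A Path to a local file
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / "local.bin"
        local.write_bytes(b"from disk")
        obs.put(store, "from-path", local)

    # An iterator of buffers; the chunks are concatenated on upload
    obs.put(store, "from-iter", iter([b"part1", b"part2"]))
    ```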

Other Parameters:

  • mode (PutMode | None) –

    Configure the PutMode for this operation. Refer to the PutMode docstring for more information.

    If this is provided and is not "overwrite", a non-multipart upload will be performed. Defaults to "overwrite".

  • attributes (Attributes | None) –

    Provide a set of Attributes. Defaults to None.

  • tags (Dict[str, str] | None) –

    Provide tags for this object. Defaults to None.

  • use_multipart (bool | None) –

    Whether to use a multipart upload under the hood. Defaults to using a multipart upload when the length of the file is greater than chunk_size. When use_multipart is False, the entire input will be materialized in memory as part of the upload.

  • chunk_size (int) –

    The size of chunks to use within each part of the multipart upload. Defaults to 5 MB.

  • max_concurrency (int) –

    The maximum number of chunks to upload concurrently. Defaults to 12.

obstore.put_async async

put_async(
    store: ObjectStore,
    path: str,
    file: IO[bytes]
    | Path
    | bytes
    | Buffer
    | AsyncIterator[Buffer]
    | AsyncIterable[Buffer]
    | Iterator[Buffer]
    | Iterable[Buffer],
    *,
    attributes: Attributes | None = None,
    tags: Dict[str, str] | None = None,
    mode: PutMode | None = None,
    use_multipart: bool | None = None,
    chunk_size: int = 5 * 1024 * 1024,
    max_concurrency: int = 12,
) -> PutResult

Call put asynchronously.

Refer to the documentation for put. In addition to what the synchronous put allows for the file parameter, this also supports an async iterator or iterable of objects implementing the Python buffer protocol.

This means, for example, you can pass the result of get_async directly to put_async, and the request will be streamed through Python during the put operation:

import obstore as obs

# This only constructs the stream, it doesn't materialize the data in memory
resp = await obs.get_async(store1, path1)
# A streaming upload is created to copy the file to path2
await obs.put_async(store2, path2, resp)

obstore.PutResult

Bases: TypedDict

Result for a put request.

e_tag instance-attribute

e_tag: str | None

The unique identifier for the newly created object

https://datatracker.ietf.org/doc/html/rfc9110#name-etag

version instance-attribute

version: str | None

A version indicator for the newly created object.

obstore.UpdateVersion

Bases: TypedDict

Uniquely identifies a version of an object to update.

Stores will use differing combinations of e_tag and version to provide conditional updates, and it is therefore recommended that applications preserve both.

e_tag instance-attribute

e_tag: str | None

The unique identifier for the object to update.

https://datatracker.ietf.org/doc/html/rfc9110#name-etag

version instance-attribute

version: str | None

A version indicator for the object to update.

obstore.PutMode module-attribute

PutMode = Literal['create', 'overwrite'] | UpdateVersion

Configure preconditions for the put operation

There are three modes:

  • Overwrite: Perform an atomic write operation, overwriting any object present at the provided path.
  • Create: Perform an atomic write operation, returning AlreadyExistsError if an object already exists at the provided path.
  • Update: Perform an atomic write operation if the current version of the object matches the provided UpdateVersion, returning PreconditionError otherwise.

If a string is provided, it must be one of:

  • "overwrite"
  • "create"

If a dict is provided, it must meet the criteria of UpdateVersion.