Microsoft Azure

obstore.store.AzureStore

Interface to a Microsoft Azure Blob Storage container.

All constructors will check for environment variables. Refer to AzureConfig for valid environment variables.
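As a rough sketch of the environment-variable path, the snippet below exports two of the variables documented under AzureConfig and then builds a store from a container name alone. The account name, key, and container name are placeholders, and `store_from_environment` is a hypothetical helper, not part of obstore.

```python
import os

# Placeholder values (assumptions for illustration only).
EXAMPLE_ENV = {
    "AZURE_STORAGE_ACCOUNT_NAME": "myaccount",
    "AZURE_STORAGE_ACCOUNT_KEY": "bXlrZXk=",  # not a real key
}


def store_from_environment(container_name: str):
    """Hypothetical helper: build an AzureStore configured via the environment."""
    # Imported lazily so this module loads even where obstore is not installed.
    from obstore.store import AzureStore

    os.environ.update(EXAMPLE_ENV)
    # No account settings are passed; the constructor checks the environment.
    return AzureStore(container_name)
```

In a real deployment the variables would normally be set outside the process (shell, container spec, CI secrets) rather than from Python.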

client_options property

client_options: ClientConfig | None

Get the store's client configuration.

config property

config: AzureConfig

Get the underlying Azure config parameters.

credential_provider property

credential_provider: AzureCredentialProvider | None

Get the store's credential provider.

prefix property

prefix: str | None

Get the prefix applied to all operations in this store, if any.

retry_config property

retry_config: RetryConfig | None

Get the store's retry configuration.

__init__

__init__(
    container_name: str | None = None,
    *,
    prefix: str | None = None,
    config: AzureConfig | None = None,
    client_options: ClientConfig | None = None,
    retry_config: RetryConfig | None = None,
    credential_provider: AzureCredentialProvider | None = None,
    **kwargs: Unpack[AzureConfig],
) -> None

Construct a new AzureStore.

Parameters:

  • container_name (str | None, default: None ) –

    the name of the container.

Other Parameters:

  • prefix (str | None) –

    A prefix within the container to use for all operations.

  • config (AzureConfig | None) –

    Azure Configuration. Values in this config will override values inferred from the url. Defaults to None.

  • client_options (ClientConfig | None) –

    HTTP Client options. Defaults to None.

  • retry_config (RetryConfig | None) –

    Retry configuration. Defaults to None.

  • credential_provider (AzureCredentialProvider | None) –

    A callback to provide custom Azure credentials.

  • kwargs (Unpack[AzureConfig]) –

    Azure configuration values. Supports the same values as config, but as named keyword args.

Returns:

  • None

    AzureStore
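A minimal sketch of the two ways to pass Azure settings described above: as a `config` dictionary or as keyword arguments. The account name, key, container name, and prefix are placeholders, and `build_stores` is a hypothetical helper used only to keep the example importable without obstore.

```python
# Placeholder settings (assumptions for illustration only).
ACCOUNT_SETTINGS = {
    "account_name": "myaccount",
    "account_key": "bXlrZXk=",  # not a real key
}


def build_stores():
    """Hypothetical helper: construct the same store via config= and via kwargs."""
    # Imported lazily so this module loads even where obstore is not installed.
    from obstore.store import AzureStore

    # Settings passed as a single AzureConfig-shaped dictionary.
    from_config = AzureStore("my-container", prefix="raw/2024", config=ACCOUNT_SETTINGS)
    # The same settings passed as named keyword arguments.
    from_kwargs = AzureStore("my-container", prefix="raw/2024", **ACCOUNT_SETTINGS)
    return from_config, from_kwargs
```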

copy

copy(from_: str, to: str, *, overwrite: bool = True) -> None

Copy an object from one path to another in the same object store.

Refer to the documentation for copy.

copy_async async

copy_async(from_: str, to: str, *, overwrite: bool = True) -> None

Call copy asynchronously.

Refer to the documentation for copy.

delete

delete(paths: str | Sequence[str]) -> None

Delete the object at the specified location(s).

Refer to the documentation for delete.

delete_async async

delete_async(paths: str | Sequence[str]) -> None

Call delete asynchronously.

Refer to the documentation for delete.

from_url classmethod

from_url(
    url: str,
    *,
    prefix: str | None = None,
    config: AzureConfig | None = None,
    client_options: ClientConfig | None = None,
    retry_config: RetryConfig | None = None,
    credential_provider: AzureCredentialProvider | None = None,
    **kwargs: Unpack[AzureConfig],
) -> Self

Construct a new AzureStore with values populated from a well-known storage URL.

The supported url schemes are:

  • abfs[s]://<container>/<path> (according to fsspec)
  • abfs[s]://<file_system>@<account_name>.dfs.core.windows.net/<path>
  • abfs[s]://<file_system>@<account_name>.dfs.fabric.microsoft.com/<path>
  • az://<container>/<path> (according to fsspec)
  • adl://<container>/<path> (according to fsspec)
  • azure://<container>/<path> (custom)
  • https://<account>.dfs.core.windows.net
  • https://<account>.blob.core.windows.net
  • https://<account>.blob.core.windows.net/<container>
  • https://<account>.dfs.fabric.microsoft.com
  • https://<account>.dfs.fabric.microsoft.com/<container>
  • https://<account>.blob.fabric.microsoft.com
  • https://<account>.blob.fabric.microsoft.com/<container>

Parameters:

  • url (str) –

    well-known storage URL.

Other Parameters:

  • prefix (str | None) –

    A prefix within the container to use for all operations.

  • config (AzureConfig | None) –

    Azure Configuration. Values in this config will override values inferred from the url. Defaults to None.

  • client_options (ClientConfig | None) –

    HTTP Client options. Defaults to None.

  • retry_config (RetryConfig | None) –

    Retry configuration. Defaults to None.

  • credential_provider (AzureCredentialProvider | None) –

    A callback to provide custom Azure credentials.

  • kwargs (Unpack[AzureConfig]) –

    Azure configuration values. Supports the same values as config, but as named keyword args.

Returns:

  • Self

    AzureStore
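To make the shape of the `abfs[s]://<file_system>@<account_name>.dfs.core.windows.net/<path>` scheme concrete, here is a small standard-library sketch that splits such a URL into its parts. It is not obstore's own parser; `describe_abfss_url` and the example URL are hypothetical and only illustrate which component plays which role.

```python
from urllib.parse import urlparse


def describe_abfss_url(url: str) -> dict:
    """Split an abfs[s]://<file_system>@<account>.dfs.core.windows.net/<path> URL."""
    parts = urlparse(url)
    host = parts.hostname or ""
    return {
        "scheme": parts.scheme,
        # The part before "@" is the file system (container).
        "file_system": parts.username or "",
        # The first label of the host is the storage account name.
        "account_name": host.split(".", 1)[0],
        # Everything after the host is the path within the container.
        "path": parts.path.lstrip("/"),
    }


info = describe_abfss_url(
    "abfss://data@myaccount.dfs.core.windows.net/raw/2024/file.parquet"
)
```

The same URL string would be what you hand to `AzureStore.from_url`, with credentials supplied through `config`, keyword arguments, or environment variables.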

get

get(path: str, *, options: GetOptions | None = None) -> GetResult

Return the bytes that are stored at the specified location.

Refer to the documentation for get.

get_async async

get_async(path: str, *, options: GetOptions | None = None) -> GetResult

Call get asynchronously.

Refer to the documentation for get.

get_range

get_range(
    path: str, *, start: int, end: int | None = None, length: int | None = None
) -> Bytes

Return the bytes stored at the specified location in the given byte range.

Refer to the documentation for get_range.
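The signature accepts either `end` or `length` alongside `start`. The sketch below shows how the two spellings can describe the same range. It assumes that `end` is exclusive and that exactly one of the two is supplied; both points are assumptions to check against the get_range documentation. `resolve_range` is a hypothetical helper, not an obstore function.

```python
from __future__ import annotations


def resolve_range(
    start: int, end: int | None = None, length: int | None = None
) -> tuple[int, int]:
    """Return a (start, exclusive_end) pair from start plus either end or length."""
    # This sketch insists on exactly one of end/length (an assumption).
    if (end is None) == (length is None):
        raise ValueError("pass exactly one of 'end' or 'length'")
    if end is not None:
        return start, end
    return start, start + length


# Both calls describe bytes 10 through 19 under the exclusive-end assumption.
by_end = resolve_range(10, end=20)
by_length = resolve_range(10, length=10)
```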

get_range_async async

get_range_async(
    path: str, *, start: int, end: int | None = None, length: int | None = None
) -> Bytes

Call get_range asynchronously.

Refer to the documentation for get_range.

get_ranges

get_ranges(
    path: str,
    *,
    starts: Sequence[int],
    ends: Sequence[int] | None = None,
    lengths: Sequence[int] | None = None,
) -> list[Bytes]

Return the bytes stored at the specified location in the given byte ranges.

Refer to the documentation for get_ranges.

get_ranges_async async

get_ranges_async(
    path: str,
    *,
    starts: Sequence[int],
    ends: Sequence[int] | None = None,
    lengths: Sequence[int] | None = None,
) -> list[Bytes]

Call get_ranges asynchronously.

Refer to the documentation for get_ranges.

head

head(path: str) -> ObjectMeta

Return the metadata for the specified location.

Refer to the documentation for head.

head_async async

head_async(path: str) -> ObjectMeta

Call head asynchronously.

Refer to the documentation for head.

list

list(
    prefix: str | None = None,
    *,
    offset: str | None = None,
    chunk_size: int = 50,
    return_arrow: Literal[True],
) -> ListStream[RecordBatch]
list(
    prefix: str | None = None,
    *,
    offset: str | None = None,
    chunk_size: int = 50,
    return_arrow: Literal[False] = False,
) -> ListStream[list[ObjectMeta]]
list(
    prefix: str | None = None,
    *,
    offset: str | None = None,
    chunk_size: int = 50,
    return_arrow: bool = False,
) -> ListStream[RecordBatch] | ListStream[list[ObjectMeta]]

List all the objects with the given prefix.

Refer to the documentation for list.

list_with_delimiter

list_with_delimiter(
    prefix: str | None = None, *, return_arrow: Literal[True]
) -> ListResult[Table]
list_with_delimiter(
    prefix: str | None = None, *, return_arrow: Literal[False] = False
) -> ListResult[list[ObjectMeta]]
list_with_delimiter(
    prefix: str | None = None, *, return_arrow: bool = False
) -> ListResult[Table] | ListResult[list[ObjectMeta]]

List objects with the given prefix and an implementation specific delimiter.

Refer to the documentation for list_with_delimiter.

list_with_delimiter_async async

list_with_delimiter_async(
    prefix: str | None = None, *, return_arrow: Literal[True]
) -> ListResult[Table]
list_with_delimiter_async(
    prefix: str | None = None, *, return_arrow: Literal[False] = False
) -> ListResult[list[ObjectMeta]]
list_with_delimiter_async(
    prefix: str | None = None, *, return_arrow: bool = False
) -> ListResult[Table] | ListResult[list[ObjectMeta]]

Call list_with_delimiter asynchronously.

Refer to the documentation for list_with_delimiter.

put

put(
    path: str,
    file: IO[bytes]
    | Path
    | bytes
    | Buffer
    | Iterator[Buffer]
    | Iterable[Buffer],
    *,
    attributes: Attributes | None = None,
    tags: dict[str, str] | None = None,
    mode: PutMode | None = None,
    use_multipart: bool | None = None,
    chunk_size: int = 5 * 1024 * 1024,
    max_concurrency: int = 12,
) -> PutResult

Save the provided bytes to the specified location.

Refer to the documentation for put.

put_async async

put_async(
    path: str,
    file: IO[bytes]
    | Path
    | bytes
    | Buffer
    | AsyncIterator[Buffer]
    | AsyncIterable[Buffer]
    | Iterator[Buffer]
    | Iterable[Buffer],
    *,
    attributes: Attributes | None = None,
    tags: dict[str, str] | None = None,
    mode: PutMode | None = None,
    use_multipart: bool | None = None,
    chunk_size: int = 5 * 1024 * 1024,
    max_concurrency: int = 12,
) -> PutResult

Call put asynchronously.

Refer to the documentation for put. In addition to what the synchronous put allows for the file parameter, this also supports an async iterator or iterable of objects implementing the Python buffer protocol.

This means, for example, you can pass the result of get_async directly to put_async, and the request will be streamed through Python during the put operation:

import obstore as obs

# This only constructs the stream, it doesn't materialize the data in memory
resp = await obs.get_async(store1, path1)
# A streaming upload is created to copy the file to path2
await obs.put_async(store2, path2, resp)

rename

rename(from_: str, to: str, *, overwrite: bool = True) -> None

Move an object from one path to another in the same object store.

Refer to the documentation for rename.

rename_async async

rename_async(from_: str, to: str, *, overwrite: bool = True) -> None

Call rename asynchronously.

Refer to the documentation for rename.

obstore.store.AzureConfig

Bases: TypedDict

Configuration parameters for AzureStore.

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore.store import AzureConfig
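As an illustration of the pattern above, the following sketch annotates a function with AzureConfig while keeping the import behind TYPE_CHECKING. `public_container_config` is a hypothetical helper and the account name is a placeholder; the dictionary keys are taken from the attributes documented on this page.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by type checkers, never at runtime.
    from obstore.store import AzureConfig


def public_container_config(account_name: str) -> AzureConfig:
    """Hypothetical helper: settings for reading a public container unsigned."""
    return {"account_name": account_name, "skip_signature": True}


config = public_container_config("myaccount")
```

Because of `from __future__ import annotations`, the return annotation is never evaluated at runtime, so the function works even though AzureConfig is not importable.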

account_key instance-attribute

account_key: str

Master key for accessing storage account.

Environment variables:

  • AZURE_STORAGE_ACCOUNT_KEY
  • AZURE_STORAGE_ACCESS_KEY
  • AZURE_STORAGE_MASTER_KEY

account_name instance-attribute

account_name: str

The name of the Azure storage account. (Required.)

Environment variable: AZURE_STORAGE_ACCOUNT_NAME.

authority_host instance-attribute

authority_host: str

Sets an alternative authority host for OAuth based authorization.

Defaults to https://login.microsoftonline.com.

Common hosts for Azure clouds are:

  • Azure China: "https://login.chinacloudapi.cn"
  • Azure Germany: "https://login.microsoftonline.de"
  • Azure Government: "https://login.microsoftonline.us"
  • Azure Public: "https://login.microsoftonline.com"

Environment variables:

  • AZURE_STORAGE_AUTHORITY_HOST
  • AZURE_AUTHORITY_HOST

client_id instance-attribute

client_id: str

The client id for use in client secret or k8s federated credential flow.

Environment variables:

  • AZURE_STORAGE_CLIENT_ID
  • AZURE_CLIENT_ID

client_secret instance-attribute

client_secret: str

The client secret for use in client secret flow.

Environment variables:

  • AZURE_STORAGE_CLIENT_SECRET
  • AZURE_CLIENT_SECRET

container_name instance-attribute

container_name: str

Container name.

Environment variable: AZURE_CONTAINER_NAME.

disable_tagging instance-attribute

disable_tagging: bool

If set to True will ignore any tags provided to uploads.

Environment variable: AZURE_DISABLE_TAGGING.

endpoint instance-attribute

endpoint: str

Override the endpoint used to communicate with blob storage.

Defaults to https://{account}.blob.core.windows.net.

By default, only HTTPS schemes are enabled. To connect to an HTTP endpoint, enable allow_http in the client options.

Environment variables:

  • AZURE_STORAGE_ENDPOINT
  • AZURE_ENDPOINT
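A hedged sketch of pointing the store at a plain-HTTP endpoint, as described above. The endpoint URL, account name, and container name are hypothetical placeholders, and `http_endpoint_store` is a helper invented for this example; `allow_http` is the client option named in the text.

```python
# Hypothetical local endpoint and account (assumptions for illustration only).
HTTP_ENDPOINT_CONFIG = {
    "account_name": "myaccount",
    "endpoint": "http://localhost:10000",
}

# Only HTTPS is enabled by default, so HTTP must be allowed explicitly.
HTTP_CLIENT_OPTIONS = {"allow_http": True}


def http_endpoint_store(container_name: str):
    """Hypothetical helper: build a store that talks to an HTTP endpoint."""
    # Imported lazily so this module loads even where obstore is not installed.
    from obstore.store import AzureStore

    return AzureStore(
        container_name,
        config=HTTP_ENDPOINT_CONFIG,
        client_options=HTTP_CLIENT_OPTIONS,
    )
```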

fabric_cluster_identifier instance-attribute

fabric_cluster_identifier: str

Cluster identifier for Fabric OAuth2 authentication.

Environment variable: AZURE_FABRIC_CLUSTER_IDENTIFIER.

fabric_session_token instance-attribute

fabric_session_token: str

Session token for Fabric OAuth2 authentication.

Environment variable: AZURE_FABRIC_SESSION_TOKEN.

fabric_token_service_url instance-attribute

fabric_token_service_url: str

Service URL for Fabric OAuth2 authentication.

Environment variable: AZURE_FABRIC_TOKEN_SERVICE_URL.

fabric_workload_host instance-attribute

fabric_workload_host: str

Workload host for Fabric OAuth2 authentication.

Environment variable: AZURE_FABRIC_WORKLOAD_HOST.

federated_token_file instance-attribute

federated_token_file: str

Sets the file path for acquiring an Azure federated identity token in Kubernetes.

Requires client_id and tenant_id to be set.

Environment variable: AZURE_FEDERATED_TOKEN_FILE.

msi_endpoint instance-attribute

msi_endpoint: str

Endpoint to request an IMDS managed identity token.

Environment variables:

  • AZURE_MSI_ENDPOINT
  • AZURE_IDENTITY_ENDPOINT

msi_resource_id instance-attribute

msi_resource_id: str

MSI resource ID for use with managed identity authentication.

Environment variable: AZURE_MSI_RESOURCE_ID.

object_id instance-attribute

object_id: str

Object id for use with managed identity authentication.

Environment variable: AZURE_OBJECT_ID.

sas_key instance-attribute

sas_key: str

Shared access signature.

The signature is expected to be percent-encoded, much like the signatures provided in Azure Storage Explorer or the Azure portal.

Environment variables:

  • AZURE_STORAGE_SAS_KEY
  • AZURE_STORAGE_SAS_TOKEN

skip_signature instance-attribute

skip_signature: bool

If enabled, AzureStore will not fetch credentials and will not sign requests.

This can be useful when interacting with public containers.

Environment variable: AZURE_SKIP_SIGNATURE.

tenant_id instance-attribute

tenant_id: str

The tenant id for use in client secret or k8s federated credential flow.

Environment variables:

  • AZURE_STORAGE_TENANT_ID
  • AZURE_STORAGE_AUTHORITY_ID
  • AZURE_TENANT_ID
  • AZURE_AUTHORITY_ID

token instance-attribute

token: str

A static bearer token to be used for authorizing requests.

Environment variable: AZURE_STORAGE_TOKEN.

use_azure_cli instance-attribute

use_azure_cli: bool

Set if the Azure CLI should be used for acquiring an access token.

learn.microsoft.com/en-us/cli/azure/account?view=azure-cli-latest#az-account-get-access-token.

Environment variable: AZURE_USE_AZURE_CLI.

use_emulator instance-attribute

use_emulator: bool

Set if the Azure emulator should be used (defaults to False).

Environment variable: AZURE_STORAGE_USE_EMULATOR.

use_fabric_endpoint instance-attribute

use_fabric_endpoint: bool

Set if Microsoft Fabric url scheme should be used (defaults to False).

When disabled, the URL scheme used is https://{account}.blob.core.windows.net. When enabled, the URL scheme used is https://{account}.dfs.fabric.microsoft.com.

Note

endpoint will take precedence over this option.
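The precedence described here can be sketched as a small pure-Python function. This is only a model of the documented behavior, not the library's implementation; `effective_endpoint` is a hypothetical name and the account and override URL are placeholders.

```python
from __future__ import annotations


def effective_endpoint(
    account: str,
    *,
    use_fabric_endpoint: bool = False,
    endpoint: str | None = None,
) -> str:
    """Model which base URL is used for a given combination of options."""
    # An explicit endpoint takes precedence over use_fabric_endpoint.
    if endpoint is not None:
        return endpoint
    if use_fabric_endpoint:
        return f"https://{account}.dfs.fabric.microsoft.com"
    return f"https://{account}.blob.core.windows.net"


default_url = effective_endpoint("myaccount")
fabric_url = effective_endpoint("myaccount", use_fabric_endpoint=True)
override_url = effective_endpoint(
    "myaccount", use_fabric_endpoint=True, endpoint="https://example.invalid"
)
```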

obstore.store.AzureAccessKey

Bases: TypedDict

A shared Azure Storage Account Key.

learn.microsoft.com/en-us/rest/api/storageservices/authorize-with-shared-key

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore.store import AzureAccessKey

access_key instance-attribute

access_key: str

Access key value.

expires_at instance-attribute

expires_at: datetime | None

Expiry datetime of credential. The datetime should have time zone set.

If None, the credential will never expire.

obstore.store.AzureSASToken

Bases: TypedDict

A shared access signature.

learn.microsoft.com/en-us/rest/api/storageservices/delegate-access-with-shared-access-signature

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore.store import AzureSASToken

expires_at instance-attribute

expires_at: datetime | None

Expiry datetime of credential. The datetime should have time zone set.

If None, the credential will never expire.

sas_token instance-attribute

sas_token: str | list[tuple[str, str]]

SAS token.
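Since `sas_token` may be either a string or a list of key/value tuples, the sketch below converts a SAS query string into the list form with the standard library. It assumes the list form carries percent-decoded pairs, which is worth verifying against your obstore version. `sas_query_to_credential` is a hypothetical helper and the token is made up.

```python
from urllib.parse import parse_qsl


def sas_query_to_credential(sas_query: str) -> dict:
    """Build an AzureSASToken-shaped dict from a SAS query string."""
    # Drop a leading "?" if the token was copied together with it.
    pairs = parse_qsl(sas_query.lstrip("?"))
    # parse_qsl percent-decodes values, e.g. "%2F" becomes "/".
    return {"sas_token": pairs, "expires_at": None}


credential = sas_query_to_credential("?sv=2022-11-02&sp=r&sig=abc%2Fdef%3D")
```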

obstore.store.AzureBearerToken

Bases: TypedDict

An authorization token.

learn.microsoft.com/en-us/rest/api/storageservices/authorize-with-azure-active-directory

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore.store import AzureBearerToken

expires_at instance-attribute

expires_at: datetime | None

Expiry datetime of credential. The datetime should have time zone set.

If None, the credential will never expire.

token instance-attribute

token: str

Bearer token.

obstore.store.AzureCredential module-attribute

AzureCredential: TypeAlias = AzureAccessKey | AzureSASToken | AzureBearerToken

A type alias for the supported Azure credentials that can be returned from AzureCredentialProvider.

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore.store import AzureCredential

obstore.store.AzureCredentialProvider

Bases: Protocol

A type hint for a synchronous or asynchronous callback to provide custom Azure credentials.

This should be passed into the credential_provider parameter of AzureStore.

Not importable at runtime

To use this type hint in your code, import it within a TYPE_CHECKING block:

from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from obstore.store import AzureCredentialProvider

__call__ staticmethod

Return an AzureCredential.
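A minimal sketch of a synchronous credential provider, assuming a bearer token obtained elsewhere. `StaticTokenProvider` and the token value are hypothetical; the returned dictionary uses the `token` and `expires_at` keys documented for AzureBearerToken, with a timezone-aware expiry.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type hints only; these names are not importable at runtime.
    from obstore.store import AzureBearerToken


class StaticTokenProvider:
    """Hypothetical provider that hands back a fixed bearer token."""

    def __init__(self, token: str, lifetime: timedelta = timedelta(hours=1)) -> None:
        self._token = token
        self._lifetime = lifetime

    def __call__(self) -> AzureBearerToken:
        return {
            "token": self._token,
            # The expiry carries an explicit UTC time zone.
            "expires_at": datetime.now(timezone.utc) + self._lifetime,
        }


provider = StaticTokenProvider("example-token")
credential = provider()
```

An instance like `provider` is what would be passed as the `credential_provider` argument of AzureStore; a real provider would fetch a fresh token inside `__call__` instead of returning a fixed one.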