Alternatives to Obstore

Obstore vs fsspec

Fsspec is a generic specification for pythonic filesystems. It includes implementations for several cloud storage providers, including s3fs for Amazon S3, gcsfs for Google Cloud Storage, and adlfs for Azure Storage.

API Differences

Like obstore, fsspec provides an abstraction layer that lets you write code once and run it against multiple cloud providers. However, the two libraries expose different abstractions: obstore tries to mirror native object store APIs, while fsspec tries to mirror a file-like API.

The upstream Rust library powering obstore, object_store, documents why it intentionally avoids a primary file-like API:

The ObjectStore interface is designed to mirror the APIs of object stores and not filesystems, and thus has stateless APIs instead of cursor based interfaces such as Read or Seek available in filesystems.

This design provides the following advantages:

  • All operations are atomic, and readers cannot observe partial and/or failed writes
  • Methods map directly to object store APIs, providing both efficiency and predictability
  • Abstracts away filesystem and operating system specific quirks, ensuring portability
  • Allows for functionality not native to filesystems, such as operation preconditions and atomic multipart uploads
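The properties listed in the quoted passage can be illustrated with a toy in-memory store. This is a minimal sketch, not obstore's or object_store's implementation: the class `ToyObjectStore`, its method names, and the `overwrite` flag are all hypothetical and exist only to show what "stateless" reads and a write precondition look like.

```python
class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object store (not obstore's API)."""

    def __init__(self):
        self._objects = {}

    def put(self, path, data, *, overwrite=True):
        # Precondition: with overwrite=False the write succeeds only if the
        # key is absent, similar in spirit to a "create if not exists" request.
        if not overwrite and path in self._objects:
            raise FileExistsError(path)
        # The whole value is swapped in at once, so a reader of this toy store
        # sees either the old object or the new one, never a partial write.
        self._objects[path] = bytes(data)

    def get(self, path):
        return self._objects[path]

    def get_range(self, path, start, end):
        # Stateless: the caller supplies the byte range on every call,
        # so there is no cursor to keep in sync between requests.
        return self._objects[path][start:end]


store = ToyObjectStore()
store.put("data/hello.txt", b"hello world")
first = store.get_range("data/hello.txt", 0, 5)
second = store.get_range("data/hello.txt", 6, 11)

try:
    store.put("data/hello.txt", b"other", overwrite=False)
    conflict = False
except FileExistsError:
    conflict = True
```

Each call carries everything needed to serve it, which is why such methods can map one-to-one onto individual object store requests.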

Obstore's primary APIs, such as get, put, and list, mirror these object store operations. If you still need a file-like API, obstore provides one through open_reader and open_writer.
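To show how a file-like API can sit on top of stateless range requests, here is a conceptual sketch using only the standard library. It is not how open_reader is implemented: `RangeReader`, `fetch`, and `payload` are hypothetical names, and a real reader would typically add buffering and error handling.

```python
import io


class RangeReader(io.RawIOBase):
    """Hypothetical file-like wrapper over a stateless range-fetch function."""

    def __init__(self, fetch_range, size):
        super().__init__()
        self._fetch_range = fetch_range  # callable(start, end) -> bytes
        self._size = size
        self._pos = 0  # the cursor lives in the wrapper, not in the store

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos, io.SEEK_END: self._size}[whence]
        self._pos = max(0, base + offset)
        return self._pos

    def readinto(self, buffer):
        # Translate "read from the cursor" into one explicit range request.
        end = min(self._pos + len(buffer), self._size)
        if end <= self._pos:
            return 0  # at or past end of object
        chunk = self._fetch_range(self._pos, end)
        buffer[: len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


payload = b"hello world"


def fetch(start, end):
    # Stand-in for a ranged GET against a remote object.
    return payload[start:end]


reader = RangeReader(fetch, len(payload))
head = reader.read(5)
reader.seek(6)
tail = reader.read()
```

The cursor state is kept entirely on the client side, while every request to the underlying store stays self-describing.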

Obstore also includes a best-effort fsspec compatibility layer, which allows you to use obstore in applications that expect an fsspec-compatible API.

Performance

Beyond API design, performance can also be a consideration. Initial benchmarks show that obstore's async API can provide 9x higher throughput than fsspec's async API.

Obstore vs boto3

boto3 is the official Python client for working with AWS services, including S3.

boto3 supports all features of S3, including some features that obstore doesn't provide, like creating or deleting buckets.

However, boto3 is synchronous and specific to AWS. To support multiple clouds you'd need to use boto3 and another library and abstract away those differences yourself. With obstore you can interface with data in multiple clouds, changing only configuration settings.
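The idea of writing application code once and switching providers through configuration can be sketched as follows. Everything here is hypothetical and unrelated to obstore's or boto3's actual APIs: `Store`, `InMemoryStore`, `store_from_url`, and the bucket URLs are illustrative, and the toy backend stands in for real provider clients.

```python
from typing import Protocol
from urllib.parse import urlparse


class Store(Protocol):
    """Hypothetical minimal interface shared by every backend."""

    def put(self, path: str, data: bytes) -> None: ...
    def get(self, path: str) -> bytes: ...


class InMemoryStore:
    """Toy backend standing in for any real provider client."""

    def __init__(self, bucket: str) -> None:
        self.bucket = bucket
        self._objects = {}

    def put(self, path: str, data: bytes) -> None:
        self._objects[path] = data

    def get(self, path: str) -> bytes:
        return self._objects[path]


def store_from_url(url: str) -> Store:
    # Hypothetical factory: a real one would return a provider-specific
    # client per scheme. Here every supported scheme maps to the toy backend.
    parsed = urlparse(url)
    if parsed.scheme not in {"s3", "gs", "az"}:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    return InMemoryStore(bucket=parsed.netloc)


def roundtrip(store: Store, path: str, data: bytes) -> bytes:
    # Application code depends only on the shared interface.
    store.put(path, data)
    return store.get(path)


results = {
    url: roundtrip(store_from_url(url), "a.txt", b"hi")
    for url in ("s3://bucket-a", "gs://bucket-b")
}
```

Only the URL passed to the factory changes between providers; `roundtrip` is untouched. Without a shared abstraction, this interface and each backend adapter are what you would have to write and maintain yourself.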

Obstore vs aioboto3

aioboto3 is an async Python client for S3, wrapping boto3 and aiobotocore.

aioboto3 presents largely the same API as boto3, but async. As with boto3, this means it may support more S3-specific features than obstore does.

However, it is still specific to AWS, and in early benchmarks we've measured obstore providing significantly higher throughput than aioboto3.

Obstore vs Google Cloud Storage Python Client

The official Google Cloud Storage Python client uses requests as its HTTP client. This means that the GCS Python client supports only synchronous requests.
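When a client only offers synchronous calls, a common workaround in async code is to push each blocking call onto a worker thread. The sketch below shows that pattern with the standard library only. `blocking_download` is a hypothetical stand-in for a synchronous client call, not part of any Google library.

```python
import asyncio
import time


def blocking_download(name: str) -> bytes:
    # Hypothetical stand-in for a synchronous HTTP download.
    time.sleep(0.01)
    return name.encode()


async def download_all(names):
    # Each blocking call runs in a worker thread so the event loop stays free;
    # gather returns results in the same order as the inputs.
    tasks = (asyncio.to_thread(blocking_download, n) for n in names)
    return await asyncio.gather(*tasks)


results = asyncio.run(download_all(["a", "b", "c"]))
```

This keeps the event loop responsive, but concurrency is bounded by the thread pool rather than by natively async I/O.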

It also presents a Google-specific API, so you'd need to re-implement your code if you want to use multiple cloud providers.