Performance

Last edited 2025-02-05.

Performance is a primary goal of Obstore. Benchmarking is still ongoing, so this document is a mix of what we've learned so far and our untested expectations.

tl;dr: Obstore can't magically make your networking hardware faster, but it can reduce overhead, and in cases where that overhead is the limiting factor it can better utilize your available hardware and improve performance.

Non-performance benefits

Before we get into the weeds of performance, keep in mind that performance is not the only feature of Obstore. There's a strong focus on developer experience as well:

  • Simple to install with no required Python dependencies.
  • Works the same across AWS S3, Google Cloud Storage, and Azure Storage.
  • Full type hinting, including all store configuration and operations.
  • Downloads that automatically act as iterators and uploads that automatically accept iterators (see the sketch after this list).
  • Automatic pagination of list calls behind the scenes.
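
For example, here is a minimal sketch of the streaming download and iterator upload patterns. It assumes an S3Store and the obs.get / obs.put entry points; the bucket, region, and paths are placeholders, and exact names may vary with your obstore version.

```python
import obstore as obs
from obstore.store import S3Store

# Hypothetical bucket and region; any supported store works the same way.
store = S3Store("my-bucket", region="us-east-1")

# Download: consume the response chunk by chunk instead of buffering
# the whole object in memory (iteration behavior assumed per the list above).
result = obs.get(store, "data/large-file.bin")
with open("large-file.bin", "wb") as f:
    for chunk in result:  # byte chunks as they arrive
        f.write(chunk)

# Upload: put() accepts an iterator of bytes, so data can be streamed
# without materializing it all up front.
def generate_chunks():
    for _ in range(10):
        yield b"x" * (1024 * 1024)

obs.put(store, "data/generated.bin", generate_chunks())
```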

So you might enjoy using Obstore even in a case where it only marginally improves your performance.

Defining performance

"Fast" can have several definitions in a networking context.

  • Download latency: the time until the first byte of a download is received.
  • Single-request throughput: the download or upload bytes per second of a single request.
  • Many-request throughput: the combined download or upload bytes per second of multiple concurrent requests.

Furthermore, performance can differ between obstore's synchronous and asynchronous APIs.

Let's consider the areas where we expect improved, possibly-improved, and equal performance.

Improved performance

Many-request throughput with the asynchronous API is the primary place where we expect significantly improved performance. Especially when making many concurrent requests for relatively small files, we find that obstore can provide substantially higher throughput.

For example, preliminary results indicate roughly 9x higher throughput than aioboto3 and fsspec. That specific benchmark considered fetching the first 16KB of a file many times.
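
As a rough sketch of that access pattern (not the benchmark code itself), the following assumes an S3Store and an obs.get_range_async function; the bucket, paths, and parameter names are assumptions to adapt to your obstore version.

```python
import asyncio

import obstore as obs
from obstore.store import S3Store

# Hypothetical bucket and object keys.
store = S3Store("my-bucket", region="us-east-1")
paths = [f"data/file-{i}.bin" for i in range(1000)]

async def fetch_first_16kb():
    # Issue all of the range requests concurrently on one event loop;
    # each request fetches only the first 16 KB of its object.
    return await asyncio.gather(
        *(obs.get_range_async(store, path, start=0, end=16 * 1024) for path in paths)
    )

chunks = asyncio.run(fetch_first_16kb())
```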

Possibly improved performance

Using the synchronous API: we haven't benchmarked it yet, but we do release the Python GIL for all synchronous operations, so it may perform better in a thread pool than other Python request libraries.
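
For example, a minimal sketch of that thread-pool pattern, assuming obs.get and an S3Store with placeholder bucket and paths:

```python
from concurrent.futures import ThreadPoolExecutor

import obstore as obs
from obstore.store import S3Store

# Hypothetical bucket and object keys.
store = S3Store("my-bucket", region="us-east-1")
paths = [f"data/file-{i}.bin" for i in range(100)]

def download(path):
    # The GIL is released during the network call, so threads overlap I/O.
    return obs.get(store, path).bytes()

with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(download, paths))
```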

Equal performance

  • Single-request throughput: if you're making one request, the limiting factor is likely network conditions, not Python overhead, so it's unlikely that obstore will be faster.

    Keep in mind, however, that what looks like a single request may actually be multiple requests under the hood. obstore.put will use multipart uploads by default, meaning that various parts of a file will be uploaded concurrently, and there may be efficiency gains here (see the sketch after this list).

  • Latency: this is primarily driven by hardware and network conditions, and we expect Obstore to have similar latency to other Python request libraries.
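
As an illustration of the kind of tuning that may be available for multipart uploads, here is a sketch that assumes obs.put exposes chunk_size and max_concurrency options; the parameter names, values, and paths are assumptions to check against your obstore version.

```python
import obstore as obs
from obstore.store import S3Store

# Hypothetical bucket and file paths.
store = S3Store("my-bucket", region="us-east-1")

with open("large-file.bin", "rb") as f:
    obs.put(
        store,
        "data/large-file.bin",
        f,
        chunk_size=8 * 1024 * 1024,  # size of each uploaded part (assumed option)
        max_concurrency=12,          # parts uploaded in parallel (assumed option)
    )
```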

Future research

In the future, we'd like to benchmark:

  • Alternate Python event loops, e.g. uvloop
  • The obstore synchronous API

If you have any interest in collaborating on this, open an issue.