Benchmarking tile generation¶
This notebook walks through benchmarking performance of TiTiler-CMR for a given Earthdata CMR dataset.
In this notebook, you'll learn:
- How to benchmark tile rendering performance across zoom levels
- What factors impact tile generation performance in TiTiler-CMR
import earthaccess
from matplotlib.lines import Line2D
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from datacube_benchmark import (
DatasetParams,
benchmark_viewport,
tiling_benchmark_summary,
)
TiTiler-CMR¶
For this walkthrough, we will use the TiTiler-CMR staging endpoint https://staging.openveda.cloud/api/titiler-cmr/.
TiTiler-CMR supports two different backends:
- xarray → for gridded/cloud-native datasets (e.g., NetCDF4/HDF5/GRIB), typically exposed as variables.
- rasterio → for COG/raster imagery-style datasets exposed as bands (optionally via a regex).
Tip: Explore data granules with earthaccess
You can use earthaccess to search and inspect the individual granules used in your query. This helps you validate which files were accessed, their sizes, and the temporal range.
concept_id = "C2723754864-GES_DISC"
time_range = ("2022-03-01T00:00:01Z", "2022-03-02T23:59:59Z")
# Authenticate if needed
earthaccess.login() # or use "interactive" if needed
results = earthaccess.search_data(concept_id=concept_id, temporal=time_range)
print(f"Found {len(results)} granules between {time_range[0]} and {time_range[1]}")
Found 2 granules between 2022-03-01T00:00:01Z and 2022-03-02T23:59:59Z
Tile Generation Benchmarking¶
We are going to measure the tile generation performance across different zoom levels using the datacube_benchmark.benchmark_viewport function.
This function simulates the load of a typical viewport render in a slippy map, where multiple adjacent tiles must be fetched in parallel to draw a single view.
Step 1: Specify API Parameters¶
First, we have to define the parameters for the CMR dataset we want to benchmark. The DatasetParams class encapsulates all the necessary information to interact with a specific dataset via TiTiler-CMR.
Note that this first example uses a dataset where each file has global coverage. This is important when evaluating the benchmarking results.
endpoint = "https://staging.openveda.cloud/api/titiler-cmr"
ds_xarray = DatasetParams(
concept_id="C2723754864-GES_DISC",
backend="xarray",
datetime_range="2022-04-01T00:00:01Z/2022-04-02T23:59:59Z",
variable="precipitation",
step="P1D",
temporal_mode="point",
)
Step 2: Specify Zoom Levels¶
Zoom levels determine the detail and extent of the area being rendered. At lower zoom levels, a single tile covers a large spatial area and may intersect many granules. This usually translates to more I/O, more resampling/mosaic work, higher latency, and a higher chance of timeout errors.
As you increase the zoom level, each tile covers a smaller area, reducing the number of intersecting granules and the amount of work per request.
We'll define a range of zoom levels to test to see how performance varies.
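To build intuition for why zoom level matters, here is a quick back-of-envelope sketch of how much ground a single XYZ tile covers at each zoom level. This is plain Web Mercator arithmetic, not part of datacube_benchmark, and the km-per-degree factor is an equatorial approximation:

```python
# Approximate east-west ground coverage of one XYZ tile at the equator.
# At zoom z there are 2**z tiles spanning 360 degrees of longitude.
for z in (3, 8, 13, 18):
    width_deg = 360 / 2**z
    width_km = width_deg * 111.32  # rough km per degree of longitude at the equator
    print(f"zoom {z:2d}: ~{width_deg:10.6f} deg (~{width_km:10.2f} km) per tile")
```

Each additional zoom level halves the tile width, so a tile at zoom 18 covers roughly 1/32,000th of the width of a tile at zoom 3.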
min_zoom = 3
max_zoom = 20
# Define the viewport parameters
viewport_width = 4
viewport_height = 4
lng = 25.0
lat = 29.0
Step 3: Run the Benchmark¶
Now, let's run the benchmark across the specified zoom levels and visualize the results.
Under the hood, benchmark_viewport computes the center tile for each zoom level, selects its neighboring tiles to approximate a viewport, and requests them concurrently from the TiTiler-CMR endpoint. This function returns a pandas.DataFrame containing the response times for each tile request.
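Conceptually, the viewport selection can be sketched as follows. This is a simplified stand-in for what benchmark_viewport does internally, using the standard Web Mercator tiling formula; the function names here are illustrative and not part of datacube_benchmark:

```python
import math

def center_tile(lng: float, lat: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) XYZ tile containing a lng/lat point (Web Mercator)."""
    n = 2**zoom
    x = int((lng + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def viewport_tiles(lng, lat, zoom, width, height):
    """All tiles in a width x height grid centered on the given point."""
    cx, cy = center_tile(lng, lat, zoom)
    n = 2**zoom
    return [
        ((cx + dx) % n, min(max(cy + dy, 0), n - 1))  # wrap x, clamp y at the poles
        for dy in range(-(height // 2), height - height // 2)
        for dx in range(-(width // 2), width - width // 2)
    ]

tiles = viewport_tiles(25.0, 29.0, 3, 4, 4)
print(len(tiles), tiles[:4])
```

The real function then issues one HTTP request per tile concurrently and records the timing of each.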
df_viewport = await benchmark_viewport(
endpoint=endpoint,
dataset=ds_xarray,
lng=lng,
lat=lat,
viewport_width=viewport_width,
viewport_height=viewport_height,
min_zoom=min_zoom,
max_zoom=max_zoom,
timeout_s=60.0,
)
=== TiTiler-CMR Tile Benchmark === Client: 2 physical / 4 logical cores | RAM: 15.62 GiB Dataset: C2723754864-GES_DISC (xarray) Query params: 8 parameters concept_id: C2723754864-GES_DISC backend: xarray datetime: 2022-04-01T00:00:01Z/2022-04-02T23:59:59Z variable: precipitation step: P1D temporal_mode: point tile_format: png tile_scale: 1
Total execution time: 23.072s
df_viewport.head()
| | zoom | x | y | status_code | ok | no_data | is_error | response_time_sec | content_type | response_size_bytes | url | error_text | total_run_elapsed_s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 2 | 1 | 200 | True | False | False | 1.461705 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 23.071837 |
| 1 | 3 | 3 | 1 | 200 | True | False | False | 1.635859 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 23.071837 |
| 2 | 3 | 4 | 1 | 200 | True | False | False | 1.667570 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 23.071837 |
| 3 | 3 | 5 | 1 | 200 | True | False | False | 1.344237 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 23.071837 |
| 4 | 3 | 6 | 1 | 200 | True | False | False | 1.879269 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 23.071837 |
The output includes the following columns:
- zoom, x, y — XYZ tile indices
- status_code — HTTP code (200 = success, 204 = no-data, 4xx/5xx = errors)
- response_time_sec — wall time in seconds
- response_size_bytes — payload size
- ok, no_data, is_error — convenience flags
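The per-zoom summary can be approximated with a plain pandas groupby over these columns. The snippet below uses a small synthetic DataFrame with the same column names; the real tiling_benchmark_summary computes additional statistics (e.g. no-data and error percentages), so treat this only as a sketch of the idea:

```python
import pandas as pd

# Synthetic results with the same column names as the benchmark output.
df = pd.DataFrame({
    "zoom": [3, 3, 3, 4, 4, 4],
    "ok": [True, True, False, True, True, True],
    "response_time_sec": [1.2, 1.5, 2.0, 0.8, 0.9, 1.1],
})

# Aggregate per zoom level: tile count, success rate, and latency percentiles.
summary = df.groupby("zoom").agg(
    n_tiles=("ok", "size"),
    ok_pct=("ok", lambda s: 100 * s.mean()),
    median_latency_s=("response_time_sec", "median"),
    p95_latency_s=("response_time_sec", lambda s: s.quantile(0.95)),
).reset_index()
print(summary)
```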
Now, let's use a convenience function to summarize the benchmark results.
df_summary = tiling_benchmark_summary(df_viewport)
df_summary.head()
| | zoom | n_tiles | ok_pct | no_data_pct | error_pct | median_latency_s | p95_latency_s |
|---|---|---|---|---|---|---|---|
| 0 | 3 | 25.0 | 100.0 | 0.0 | 0.0 | 1.590145 | 1.885190 |
| 1 | 4 | 25.0 | 100.0 | 0.0 | 0.0 | 1.596415 | 1.993003 |
| 2 | 5 | 25.0 | 100.0 | 0.0 | 0.0 | 1.580377 | 1.997009 |
| 3 | 6 | 25.0 | 100.0 | 0.0 | 0.0 | 1.565605 | 1.922572 |
| 4 | 7 | 25.0 | 100.0 | 0.0 | 0.0 | 1.560524 | 1.712366 |
Step 4: Plot the results¶
You may notice there is little variation in performance across zoom levels. This collection's files have global extent, so regardless of the zoom level of the tile request, the same file or files must be opened and read (multiple files when the datetime parameter matches multiple granules).
This situation is in contrast to the next example which uses files without global extent.
def summarize_and_plot_tiles_from_df(
df: pd.DataFrame,
*,
jitter=0.08,
alpha=0.35,
figsize=(9, 5),
title_lines=None,
):
"""Generate summary and plot from tile benchmark DataFrame."""
summary = tiling_benchmark_summary(df)
fig, ax = plt.subplots(figsize=figsize)
fig.subplots_adjust(right=0.72, top=0.80)
zoom_levels = sorted(
int(z) for z in pd.to_numeric(df["zoom"], errors="coerce").dropna().unique()
)
ax.set_xticks(zoom_levels)
if zoom_levels:
ax.set_xlim(min(zoom_levels) - 0.6, max(zoom_levels) + 0.6)
for z in zoom_levels:
sub = df[df["zoom"] == z]
if sub.empty:
continue
x = np.random.normal(loc=z, scale=jitter, size=len(sub))
ok_mask = sub["ok"].astype(bool).values
err_mask = sub["is_error"].astype(bool).values
ax.scatter(
x[ok_mask],
sub.loc[ok_mask, "response_time_sec"],
alpha=alpha,
edgecolor="none",
label=None,
)
ax.scatter(
x[err_mask],
sub.loc[err_mask, "response_time_sec"],
marker="x",
alpha=min(0.85, alpha + 0.25),
label=None,
)
med = pd.to_numeric(sub["response_time_sec"], errors="coerce").median()
if np.isfinite(med):
ax.hlines(med, z - 0.45, z + 0.45, linestyles="--")
ax.set_xlabel("Zoom level")
ax.set_ylabel("Tile response time (s)")
ok_proxy = Line2D([], [], linestyle="none", marker="o", label="200 OK")
err_proxy = Line2D(
[], [], linestyle="none", marker="x", label="error (≥400 or failure)"
)
ax.legend(
[ok_proxy, err_proxy],
["200 OK", "error"],
frameon=False,
loc="upper left",
bbox_to_anchor=(1.02, 1.00),
)
if title_lines:
ax.set_title("\n".join(title_lines), fontsize=9, loc="left", pad=12)
ax.grid(True, axis="y", alpha=0.2)
plt.tight_layout()
return summary, (fig, ax)
summary, (fig, ax) = summarize_and_plot_tiles_from_df(
df_viewport,
title_lines=[
"concept_id: C2723754864-GES_DISC",
"endpoint: https://staging.openveda.cloud/api/titiler-cmr",
],
)
plt.show()
HLS Example¶
In this example, we will benchmark a CMR dataset that is structured as Cloud-Optimized GeoTIFFs (COGs) with individual bands. We will use the rasterio backend for this dataset.
In contrast to the first example, HLS has much higher spatial resolution (30 meters vs. 0.1 degrees) and each granule has a small spatial footprint. In general, the lower the zoom level (more zoomed out), the more files must be opened to render a tile, which can lead to increased latency.
A more in-depth HLS benchmark report is provided in the Harmonized Landsat Sentinel 2 (HLS): tiling configuration and rendering performance documentation.
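A rough estimate makes the granule-count effect concrete. HLS granules follow the MGRS grid, whose tiles are on the order of 110 km across (that size, and the equatorial km-per-degree factor, are assumptions for this sketch, not values taken from the benchmark library):

```python
import math

HLS_TILE_KM = 110.0  # approximate footprint of one HLS (MGRS) granule; assumed for this sketch
KM_PER_DEG = 111.32  # rough km per degree of longitude at the equator

# Rough upper bound on how many HLS granules one XYZ tile can intersect.
for z in (3, 6, 9, 12):
    tile_km = 360 / 2**z * KM_PER_DEG
    n_granules = max(1, math.ceil(tile_km / HLS_TILE_KM)) ** 2
    print(f"zoom {z:2d}: tile ~{tile_km:8.1f} km wide -> up to ~{n_granules} granules")
```

At zoom 3 a single tile can overlap thousands of granule footprints, while by zoom 12 it typically falls within one, which is consistent with the latency dropping as zoom increases in the results below.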
ds_hls_day = DatasetParams(
concept_id="C2021957295-LPCLOUD",
backend="rasterio",
datetime_range="2023-10-01T00:00:01Z/2023-10-07T00:00:01Z",
bands=["B04", "B03", "B02"],
bands_regex="B[0-9][0-9]",
step="P1D",
temporal_mode="point",
)
ds_hls_week = DatasetParams(
concept_id="C2021957657-LPCLOUD",
backend="rasterio",
datetime_range="2023-10-01T00:00:01Z/2023-10-20T00:00:01Z",
bands=["B04", "B03", "B02"],
bands_regex="B[0-9][0-9]",
step="P1W",
temporal_mode="point",
)
min_zoom = 3
max_zoom = 20
viewport_width = 3
viewport_height = 3
timeout_s = 60.0
df_viewport_day = await benchmark_viewport(
endpoint=endpoint,
dataset=ds_hls_day,
lng=lng,
lat=lat,
viewport_width=viewport_width,
viewport_height=viewport_height,
min_zoom=min_zoom,
max_zoom=max_zoom,
timeout_s=timeout_s,
)
df_viewport_day_summary = tiling_benchmark_summary(df_viewport_day)
df_viewport_day_summary.head()
=== TiTiler-CMR Tile Benchmark === Client: 2 physical / 4 logical cores | RAM: 15.62 GiB Dataset: C2021957295-LPCLOUD (rasterio) Query params: 11 parameters concept_id: C2021957295-LPCLOUD backend: rasterio datetime: 2023-10-01T00:00:01Z/2023-10-07T00:00:01Z bands: B04 bands: B03 bands: B02 bands_regex: B[0-9][0-9] step: P1D temporal_mode: point tile_format: png tile_scale: 1
Total execution time: 50.336s
| | zoom | n_tiles | ok_pct | no_data_pct | error_pct | median_latency_s | p95_latency_s |
|---|---|---|---|---|---|---|---|
| 0 | 3 | 9.0 | 100.0 | 0.0 | 0.0 | 22.699576 | 40.584356 |
| 1 | 4 | 9.0 | 100.0 | 0.0 | 0.0 | 16.600212 | 19.499651 |
| 2 | 5 | 9.0 | 100.0 | 0.0 | 0.0 | 14.605976 | 17.154000 |
| 3 | 6 | 9.0 | 100.0 | 0.0 | 0.0 | 11.028084 | 15.306074 |
| 4 | 7 | 9.0 | 100.0 | 0.0 | 0.0 | 4.970010 | 6.005689 |
df_viewport_week = await benchmark_viewport(
endpoint=endpoint,
dataset=ds_hls_week,
lng=lng,
lat=lat,
viewport_width=viewport_width,
viewport_height=viewport_height,
min_zoom=min_zoom,
max_zoom=max_zoom,
timeout_s=timeout_s,
)
df_viewport_week_summary = tiling_benchmark_summary(df_viewport_week)
df_viewport_week_summary.head()
=== TiTiler-CMR Tile Benchmark === Client: 2 physical / 4 logical cores | RAM: 15.62 GiB Dataset: C2021957657-LPCLOUD (rasterio) Query params: 11 parameters concept_id: C2021957657-LPCLOUD backend: rasterio datetime: 2023-10-01T00:00:01Z/2023-10-20T00:00:01Z bands: B04 bands: B03 bands: B02 bands_regex: B[0-9][0-9] step: P1W temporal_mode: point tile_format: png tile_scale: 1
Total execution time: 68.305s
| | zoom | n_tiles | ok_pct | no_data_pct | error_pct | median_latency_s | p95_latency_s |
|---|---|---|---|---|---|---|---|
| 0 | 3 | 9.0 | 88.888889 | 0.0 | 11.111111 | 21.704861 | 23.464815 |
| 1 | 4 | 9.0 | 100.000000 | 0.0 | 0.000000 | 14.372156 | 15.745402 |
| 2 | 5 | 9.0 | 100.000000 | 0.0 | 0.000000 | 14.365140 | 16.641927 |
| 3 | 6 | 9.0 | 100.000000 | 0.0 | 0.000000 | 14.022378 | 17.720544 |
| 4 | 7 | 9.0 | 100.000000 | 0.0 | 0.000000 | 8.535669 | 9.593534 |
summary, (fig, ax) = summarize_and_plot_tiles_from_df(
df_viewport_day,
title_lines=[
"concept_id: C2021957295-LPCLOUD",
"Viewport: 3x3 tiles -- daily",
"endpoint: https://staging.openveda.cloud/api/titiler-cmr",
],
)
plt.show()
summary, (fig, ax) = summarize_and_plot_tiles_from_df(
df_viewport_week,
title_lines=[
"concept_id: C2021957657-LPCLOUD",
"Viewport: 3x3 tiles -- weekly",
"endpoint: https://staging.openveda.cloud/api/titiler-cmr",
],
)
plt.show()
Conclusion¶
In this notebook, we explored how to benchmark tile rendering performance in TiTiler-CMR using different datasets and backends. We observed how factors such as zoom level, temporal interval, and dataset structure impact the latency of tile requests.
In general, performance depends on:
- zoom level and spatial resolution of the dataset
- the width of the datetime interval and the temporal resolution of the dataset
- how many granules intersect the tile footprint
Takeaways:
- Consider specifying a minimum zoom level (minzoom) and a maximum datetime interval appropriate for a specific dataset's temporal and spatial resolution.