TiTiler-CMR: Tile Benchmarking¶
This notebook walks you through a workflow to benchmark performance of a TiTiler-CMR deployment for a given Earthdata CMR dataset.
What is TiTiler-CMR?
TiTiler is a lightweight dynamic tiling server for raster/COG data. TiTiler-CMR is a variant/deployment that integrates with NASA's Common Metadata Repository (CMR) so you can render tiles directly from CMR-managed datasets (e.g., HDF5/NetCDF4/GRIB hosted on Earthdata Cloud). It can resolve a CMR concept ID to a renderable item, and expose tile and statistics endpoints without you needing to manually construct source URLs.
In this notebook, you'll learn:
- How to benchmark tile rendering performance across zoom levels
- What factors impact tile generation performance in TiTiler-CMR for different backends (xarray vs rasterio)
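Before reaching for the benchmark helpers, it helps to see what a single tile request looks like at the HTTP level. The sketch below is illustrative only: the route (/tiles/WebMercatorQuad/{z}/{x}/{y}.png) and the use of httpx are assumptions, so confirm the exact path and query parameters against your instance's OpenAPI docs.
# A minimal sketch of one TiTiler-CMR tile request (assumed route; check the
# /docs page of your deployment for the exact path and parameters).
import httpx

endpoint = "https://staging.openveda.cloud/api/titiler-cmr"
params = {
    "concept_id": "C2723754864-GES_DISC",
    "backend": "xarray",
    "datetime": "2022-03-01T00:00:01Z/2022-03-01T23:59:59Z",
    "variable": "precipitation",
}
z, x, y = 3, 2, 1  # one of the tiles benchmarked later in this notebook
resp = httpx.get(f"{endpoint}/tiles/WebMercatorQuad/{z}/{x}/{y}.png", params=params, timeout=60.0)
print(resp.status_code, resp.headers.get("content-type"), len(resp.content))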
import asyncio
import pandas as pd
from datacube_benchmark.titiler import (
DatasetParams,
benchmark_viewport,
tiling_benchmark_summary,
)
TiTiler-CMR Setup¶
titiler-cmr is a NASA-focused application that accepts concept IDs and uses the Common Metadata Repository (CMR) to discover the associated granules and serve them as tiles. You can deploy your own instance of titiler-cmr using the official guide, or use a public instance that is already deployed.
For this walkthrough, we will use the public instance hosted by Open VEDA.
To get started with a dataset, you need to:
- Choose a TiTiler-CMR endpoint
- Pick a CMR dataset (by concept ID)
- Identify the assets/variables/bands you want to visualize
- Define a temporal interval (start/end ISO range) and, if needed, a time step (e.g., daily)
- Select a backend that matches your dataset's structure
Titiler-CMR supports two different backends:
- xarray → for gridded/cloud-native datasets (e.g., NetCDF4/HDF5/GRIB), typically exposed as variables.
- rasterio → for COG/raster imagery-style datasets exposed as bands (optionally via a regex).
Tip: Explore data granules with earthaccess
You can use earthaccess to search and inspect the individual granules used in your query. This helps you validate which files were accessed, their sizes, and the temporal range.
import earthaccess
concept_id = "C2723754864-GES_DISC"
time_range = ("2022-03-01T00:00:01Z", "2022-03-02T23:59:59Z")
# Authenticate if needed
earthaccess.login()  # or earthaccess.login(strategy="interactive") if needed
results = earthaccess.search_data(
concept_id=concept_id,
temporal=time_range
)
print(f"Found {len(results)} granules between {time_range[0]} and {time_range[1]}")
Tile Generation Benchmarking¶
In this part, we are going to measure tile generation performance across different zoom levels using the benchmark_viewport function (imported above from datacube_benchmark.titiler).
This function simulates the load of a typical viewport render in a slippy map, where multiple adjacent tiles must be fetched in parallel to draw a single view.
First, we have to define the parameters for the CMR dataset we want to benchmark. The DatasetParams
class encapsulates all the necessary information to interact with a specific dataset via TiTiler-CMR.
endpoint = "https://staging.openveda.cloud/api/titiler-cmr"
concept_id = "C2723754864-GES_DISC"
datetime_range = "2022-04-01T00:00:01Z/2022-04-02T23:59:59Z"
variable = "precipitation"
ds_xarray = DatasetParams(
concept_id="C2723754864-GES_DISC",
backend="xarray",
datetime_range="2022-03-01T00:00:01Z/2022-03-01T23:59:59Z",
variable="precipitation",
step="P1D",
temporal_mode="point",
)
Zoom Levels¶
Zoom levels determine the detail and extent of the area being rendered. At lower zoom levels, a single tile covers a large spatial area and may intersect many granules. This usually translates to more I/O, more resampling and mosaicking work, higher latency, and a higher chance of timeout errors.
As you increase zoom, each tile covers a smaller area, reducing the number of intersecting granules and the amount of work per request.
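To build intuition for how quickly tile footprints shrink, the short sketch below prints the approximate extent of the tile containing our viewport center at a few zoom levels. It uses the mercantile package, which is not used elsewhere in this notebook, so treat it as an optional extra.
import mercantile

# Illustrative only: the ground extent of the tile containing (lng, lat)
# halves in each dimension with every additional zoom level.
lng, lat = 25.0, 29.0
for z in (3, 8, 13, 18):
    t = mercantile.tile(lng, lat, z)
    b = mercantile.bounds(t)  # LngLatBbox(west, south, east, north)
    print(f"z={z:>2} tile=({t.x}, {t.y}) width={b.east - b.west:.5f} deg, height={b.north - b.south:.5f} deg")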
We'll define a range of zoom levels to test to see how performance varies.
min_zoom = 3
max_zoom = 20
# Define the viewport parameters
viewport_width = 4
viewport_height = 4
lng = 25.0
lat = 29.0
Now, let's run the benchmark across the specified zoom levels and visualize the results.
Under the hood, benchmark_viewport computes the center tile for each zoom level, selects its neighboring tiles to approximate a viewport, and requests them concurrently from the TiTiler-CMR endpoint. It returns a pandas DataFrame containing the response time for each tile request.
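Conceptually, a simplified version of that viewport pattern looks like the sketch below. It is not the library's implementation: the tile route and the use of httpx and mercantile are assumptions, and it assumes odd viewport dimensions for simplicity.
import asyncio
import httpx
import mercantile

async def fetch_viewport(endpoint, params, lng, lat, zoom, width=3, height=3, timeout=60.0):
    """Fetch a width x height block of tiles centered on (lng, lat) at one zoom level."""
    center = mercantile.tile(lng, lat, zoom)
    # Enumerate the neighboring tiles around the center tile (assumes odd width/height).
    tiles = [
        (center.x + dx, center.y + dy)
        for dx in range(-(width // 2), width // 2 + 1)
        for dy in range(-(height // 2), height // 2 + 1)
    ]
    async with httpx.AsyncClient(timeout=timeout) as client:
        tasks = [
            client.get(f"{endpoint}/tiles/WebMercatorQuad/{zoom}/{x}/{y}.png", params=params)
            for x, y in tiles
        ]
        responses = await asyncio.gather(*tasks, return_exceptions=True)
    return list(zip(tiles, responses))
For example, awaiting fetch_viewport(endpoint, params, 25.0, 29.0, zoom=8) would return nine (tile, response) pairs.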
df_viewport = await benchmark_viewport(
endpoint=endpoint,
dataset=ds_xarray,
lng=lng,
lat=lat,
viewport_width=viewport_width,
viewport_height=viewport_height,
min_zoom=min_zoom,
max_zoom=max_zoom,
timeout_s=60.0,
)
=== TiTiler-CMR Tile Benchmark ===
Client: 2 physical / 4 logical cores | RAM: 30.89 GiB
Dataset: C2723754864-GES_DISC (xarray)
Query params: 8 parameters
  concept_id: C2723754864-GES_DISC
  backend: xarray
  datetime: 2022-03-01T00:00:01Z/2022-03-01T23:59:59Z
  variable: precipitation
  step: P1D
  temporal_mode: point
  tile_format: png
  tile_scale: 1
Total execution time: 19.153s
df_viewport.head()
|   | zoom | x | y | status_code | ok | no_data | is_error | response_time_sec | content_type | response_size_bytes | url | error_text | total_run_elapsed_s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | 2 | 1 | 200 | True | False | False | 1.609666 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 19.153049 |
1 | 3 | 3 | 1 | 200 | True | False | False | 1.192089 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 19.153049 |
2 | 3 | 4 | 1 | 200 | True | False | False | 1.617420 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 19.153049 |
3 | 3 | 5 | 1 | 200 | True | False | False | 2.887875 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 19.153049 |
4 | 3 | 6 | 1 | 200 | True | False | False | 0.804066 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 19.153049 |
The output includes the following columns:
- zoom, x, y: XYZ tile indices
- status_code: HTTP code (200 = success, 204 = no data, 4xx/5xx = errors)
- response_time_sec: wall time in seconds
- response_size_bytes: payload size
- ok, no_data, is_error: convenience flags
Now, let's use a convenience function to summarize the benchmark results.
df_summary = tiling_benchmark_summary(df_viewport)
df_summary
|   | zoom | n_tiles | ok_pct | no_data_pct | error_pct | median_latency_s | p95_latency_s |
|---|---|---|---|---|---|---|---|
0 | 3 | 25.0 | 100.0 | 0.0 | 0.0 | 1.192089 | 2.756978 |
1 | 4 | 25.0 | 100.0 | 0.0 | 0.0 | 1.182342 | 2.790999 |
2 | 5 | 25.0 | 100.0 | 0.0 | 0.0 | 1.372662 | 2.405504 |
3 | 6 | 25.0 | 100.0 | 0.0 | 0.0 | 1.008795 | 2.078795 |
4 | 7 | 25.0 | 100.0 | 0.0 | 0.0 | 0.953299 | 1.776905 |
5 | 8 | 25.0 | 100.0 | 0.0 | 0.0 | 1.199947 | 1.937273 |
6 | 9 | 25.0 | 100.0 | 0.0 | 0.0 | 1.224152 | 1.949105 |
7 | 10 | 25.0 | 100.0 | 0.0 | 0.0 | 0.883679 | 1.680721 |
8 | 11 | 25.0 | 100.0 | 0.0 | 0.0 | 1.131980 | 1.972159 |
9 | 12 | 25.0 | 100.0 | 0.0 | 0.0 | 1.503204 | 2.715302 |
10 | 13 | 25.0 | 100.0 | 0.0 | 0.0 | 1.321729 | 2.163595 |
11 | 14 | 25.0 | 100.0 | 0.0 | 0.0 | 1.275581 | 2.366395 |
12 | 15 | 25.0 | 100.0 | 0.0 | 0.0 | 1.202473 | 1.965626 |
13 | 16 | 25.0 | 100.0 | 0.0 | 0.0 | 1.086001 | 2.874105 |
14 | 17 | 25.0 | 100.0 | 0.0 | 0.0 | 1.022910 | 2.578667 |
15 | 18 | 25.0 | 100.0 | 0.0 | 0.0 | 1.042412 | 2.091209 |
16 | 19 | 25.0 | 100.0 | 0.0 | 0.0 | 1.076227 | 1.677191 |
17 | 20 | 25.0 | 100.0 | 0.0 | 0.0 | 1.083148 | 2.155265 |
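If you need metrics beyond what tiling_benchmark_summary provides, the same statistics are straightforward to compute directly from the raw per-tile DataFrame; a minimal sketch:
# Per-zoom success rate and latency quantiles computed directly from df_viewport.
manual_summary = (
    df_viewport.groupby("zoom")
    .agg(
        n_tiles=("response_time_sec", "size"),
        ok_pct=("ok", lambda s: 100 * s.mean()),
        median_latency_s=("response_time_sec", "median"),
        p95_latency_s=("response_time_sec", lambda s: s.quantile(0.95)),
    )
    .reset_index()
)
manual_summary.head()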
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.lines import Line2D
def summarize_and_plot_tiles_from_df(
df: pd.DataFrame,
*,
jitter=0.08,
alpha=0.35,
figsize=(9, 5),
title_lines=None,
):
summary = tiling_benchmark_summary(df)
fig, ax = plt.subplots(figsize=figsize)
fig.subplots_adjust(right=0.72, top=0.80)
zoom_levels = sorted(
int(z) for z in pd.to_numeric(df["zoom"], errors="coerce").dropna().unique()
)
ax.set_xticks(zoom_levels)
if zoom_levels:
ax.set_xlim(min(zoom_levels) - 0.6, max(zoom_levels) + 0.6)
for z in zoom_levels:
sub = df[df["zoom"] == z]
if sub.empty:
continue
x = np.random.normal(loc=z, scale=jitter, size=len(sub))
ok_mask = sub["ok"].astype(bool).values
err_mask = sub["is_error"].astype(bool).values
ax.scatter(
x[ok_mask],
sub.loc[ok_mask, "response_time_sec"],
alpha=alpha,
edgecolor="none",
label=None,
)
ax.scatter(
x[err_mask],
sub.loc[err_mask, "response_time_sec"],
marker="x",
alpha=min(0.85, alpha + 0.25),
label=None,
)
med = pd.to_numeric(sub["response_time_sec"], errors="coerce").median()
if np.isfinite(med):
ax.hlines(med, z - 0.45, z + 0.45, linestyles="--")
ax.set_xlabel("Zoom level")
ax.set_ylabel("Tile response time (s)")
ok_proxy = Line2D([], [], linestyle="none", marker="o", label="200 OK")
err_proxy = Line2D(
[], [], linestyle="none", marker="x", label="error (≥400 or failure)"
)
ax.legend(
[ok_proxy, err_proxy],
["200 OK", "error"],
frameon=False,
loc="upper left",
bbox_to_anchor=(1.02, 1.00),
)
if title_lines:
ax.set_title("\n".join(title_lines), fontsize=9, loc="left", pad=12)
ax.grid(True, axis="y", alpha=0.2)
plt.tight_layout()
return summary, (fig, ax)
summary, (fig, ax) = summarize_and_plot_tiles_from_df(
df_viewport,
title_lines=[
"concept_id: C2723754864-GES_DISC",
"endpoint: https://staging.openveda.cloud/api/titiler-cmr",
],
)
plt.show()
Rasterio Backend (COG/Band-based datasets)¶
In this example, we will benchmark a CMR dataset that is structured as Cloud Optimized GeoTIFFs (COGs) with individual bands, using the rasterio backend.
In general, the lower the zoom level, the more files need to be opened to render a tile, which can lead to increased latency. Additionally, datasets with larger file sizes or more complex structures may also experience higher latency.
With the rasterio backend, each /tile request:
- finds all granules intersecting the tile footprint and the selected datetime interval,
- reads and mosaics them (across space/time), resamples, stacks the bands, and then encodes the image.
In contrast to the xarray backend, the rasterio backend’s tile latency depends strongly on the width of the datetime interval.
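One way to see why interval width matters is to count the granules each request may have to consider, reusing the earthaccess pattern from earlier. The bounding box below is an arbitrary small region around the viewport center used in this notebook; the concept IDs and date ranges match the two HLS configurations defined next.
import earthaccess

# Rough granule counts near the viewport center for the two datetime intervals.
bbox = (24.0, 28.0, 26.0, 30.0)  # (west, south, east, north) around lng=25, lat=29
for cid, temporal in [
    ("C2021957295-LPCLOUD", ("2023-10-01", "2023-10-07")),
    ("C2021957657-LPCLOUD", ("2023-10-01", "2023-10-20")),
]:
    granules = earthaccess.search_data(concept_id=cid, temporal=temporal, bounding_box=bbox)
    print(cid, temporal, "->", len(granules), "granules")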
ds_hls_day = DatasetParams(
concept_id="C2021957295-LPCLOUD",
backend="rasterio",
datetime_range="2023-10-01T00:00:01Z/2023-10-07T00:00:01Z",
bands=["B04", "B03", "B02"],
bands_regex="B[0-9][0-9]",
step="P1D",
temporal_mode="point",
)
ds_hls_week = DatasetParams(
concept_id="C2021957657-LPCLOUD",
backend="rasterio",
datetime_range="2023-10-01T00:00:01Z/2023-10-20T00:00:01Z",
bands=["B04", "B03", "B02"],
bands_regex="B[0-9][0-9]",
step="P1W",
temporal_mode="point",
)
min_zoom = 3
max_zoom = 20
viewport_width = 3
viewport_height = 3
timeout_s = 60.0
df_viewport_day = await benchmark_viewport(
endpoint=endpoint,
dataset=ds_hls_day,
lng=lng,
lat=lat,
viewport_width=viewport_width,
viewport_height=viewport_height,
min_zoom=min_zoom,
max_zoom=max_zoom,
timeout_s=timeout_s,
)
df_viewport_day_summary = tiling_benchmark_summary(df_viewport_day)
df_viewport_day_summary
=== TiTiler-CMR Tile Benchmark ===
Client: 2 physical / 4 logical cores | RAM: 30.89 GiB
Dataset: C2021957295-LPCLOUD (rasterio)
Query params: 11 parameters
  concept_id: C2021957295-LPCLOUD
  backend: rasterio
  datetime: 2023-10-01T00:00:01Z/2023-10-07T00:00:01Z
  bands: B04
  bands: B03
  bands: B02
  bands_regex: B[0-9][0-9]
  step: P1D
  temporal_mode: point
  tile_format: png
  tile_scale: 1
Total execution time: 60.123s
|   | zoom | n_tiles | ok_pct | no_data_pct | error_pct | median_latency_s | p95_latency_s |
|---|---|---|---|---|---|---|---|
0 | 3 | 9.0 | 88.888889 | 0.0 | 11.111111 | 24.844026 | 43.565352 |
1 | 4 | 9.0 | 100.000000 | 0.0 | 0.000000 | 20.791528 | 22.357497 |
2 | 5 | 9.0 | 100.000000 | 0.0 | 0.000000 | 21.854107 | 24.161906 |
3 | 6 | 9.0 | 100.000000 | 0.0 | 0.000000 | 15.051093 | 20.070628 |
4 | 7 | 9.0 | 100.000000 | 0.0 | 0.000000 | 7.408420 | 9.914932 |
5 | 8 | 9.0 | 100.000000 | 0.0 | 0.000000 | 4.157204 | 6.375812 |
6 | 9 | 9.0 | 100.000000 | 0.0 | 0.000000 | 1.889051 | 3.172772 |
7 | 10 | 9.0 | 100.000000 | 0.0 | 0.000000 | 1.819688 | 2.697235 |
8 | 11 | 9.0 | 100.000000 | 0.0 | 0.000000 | 1.624994 | 2.344379 |
9 | 12 | 9.0 | 100.000000 | 0.0 | 0.000000 | 1.015991 | 1.567489 |
10 | 13 | 9.0 | 100.000000 | 0.0 | 0.000000 | 0.945573 | 1.447109 |
11 | 14 | 9.0 | 100.000000 | 0.0 | 0.000000 | 0.787210 | 1.866676 |
12 | 15 | 9.0 | 100.000000 | 0.0 | 0.000000 | 1.025955 | 1.546263 |
13 | 16 | 9.0 | 100.000000 | 0.0 | 0.000000 | 1.033864 | 1.738543 |
14 | 17 | 9.0 | 100.000000 | 0.0 | 0.000000 | 1.113492 | 1.945435 |
15 | 18 | 9.0 | 100.000000 | 0.0 | 0.000000 | 0.926164 | 1.602873 |
16 | 19 | 9.0 | 100.000000 | 0.0 | 0.000000 | 0.897090 | 1.091547 |
17 | 20 | 9.0 | 100.000000 | 0.0 | 0.000000 | 0.818822 | 1.040758 |
df_viewport_week = await benchmark_viewport(
endpoint=endpoint,
dataset=ds_hls_week,
lng=lng,
lat=lat,
viewport_width=viewport_width,
viewport_height=viewport_height,
min_zoom=min_zoom,
max_zoom=max_zoom,
timeout_s=timeout_s,
)
df_viewport_week_summary = tiling_benchmark_summary(df_viewport_week)
df_viewport_week_summary
=== TiTiler-CMR Tile Benchmark ===
Client: 2 physical / 4 logical cores | RAM: 30.89 GiB
Dataset: C2021957657-LPCLOUD (rasterio)
Query params: 11 parameters
  concept_id: C2021957657-LPCLOUD
  backend: rasterio
  datetime: 2023-10-01T00:00:01Z/2023-10-20T00:00:01Z
  bands: B04
  bands: B03
  bands: B02
  bands_regex: B[0-9][0-9]
  step: P1W
  temporal_mode: point
  tile_format: png
  tile_scale: 1
summary, (fig, ax) = summarize_and_plot_tiles_from_df(
df_viewport_day,
title_lines=[
"concept_id: C2036881735-POCLOUD",
"Viewport: 3x3 tiles -- daily",
"endpoint: https://staging.openveda.cloud/api/titiler-cmr",
],
)
plt.show()
summary, (fig, ax) = summarize_and_plot_tiles_from_df(
df_viewport_week,
title_lines=[
"concept_id: C2036881735-POCLOUD",
"Viewport: 3x3 tiles -- weekly",
"endpoint: https://staging.openveda.cloud/api/titiler-cmr",
],
)
plt.show()
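To put a number on the interval-width effect, you can line the two summaries up by zoom and compare medians. This assumes the summary DataFrames keep zoom as a regular column, as shown in the tables above.
# Median tile latency per zoom: daily vs. weekly interval runs.
interval_compare = pd.DataFrame(
    {
        "median_day_s": df_viewport_day_summary.set_index("zoom")["median_latency_s"],
        "median_week_s": df_viewport_week_summary.set_index("zoom")["median_latency_s"],
    }
)
interval_compare["week_vs_day_ratio"] = interval_compare["median_week_s"] / interval_compare["median_day_s"]
interval_compare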
You can also run a similar check over a custom region, for example a bounding box defined with tiling.create_bbox_feature.
Tiling Benchmark over a Custom Bounds Region¶
In this part, we measure response latency across tiles at different zoom levels for a different viewport center, again using the benchmark_viewport function. As before, it computes the center tile for each zoom level, selects the neighboring tiles to approximate a viewport, requests them concurrently from the TiTiler-CMR endpoint, and returns a pandas DataFrame with the response time for each tile request.
df_viewport = await benchmark_viewport(
endpoint=endpoint,
dataset=ds_xarray,
lng=-95.0,
lat=29.0,
viewport_width=3,
viewport_height=3,
min_zoom=7,
max_zoom=8,
timeout_s=60.0,
)
df_viewport.head()
=== TiTiler-CMR Tile Benchmark (Global Pool) ===
Client: 2 physical / 4 logical cores | RAM: 30.89 GiB
Dataset: C2723754864-GES_DISC (xarray)
Query params: 8 parameters
  concept_id: C2723754864-GES_DISC
  backend: xarray
  datetime: 2022-03-01T00:00:01Z/2022-03-01T23:59:59Z
  variable: precipitation
  step: P1D
  temporal_mode: point
  tile_format: png
  tile_scale: 1
Total execution time: 7.163s
|   | zoom | x | y | status_code | ok | no_data | is_error | response_time_sec | content_type | response_size_bytes | url | error_text | total_run_elapsed_s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 7 | 30 | 52 | 200 | True | False | False | 1.343241 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 7.162834 |
1 | 7 | 31 | 52 | 200 | True | False | False | 2.334120 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 7.162834 |
2 | 7 | 29 | 53 | 200 | True | False | False | 1.638145 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 7.162834 |
3 | 7 | 30 | 53 | 200 | True | False | False | 1.863235 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 7.162834 |
4 | 7 | 31 | 53 | 200 | True | False | False | 1.424258 | image/png | 694 | https://staging.openveda.cloud/api/titiler-cmr... | None | 7.162834 |
Band Combinations¶
With the rasterio backend, you can specify multiple bands to be rendered in a single tile request. This is useful for visualizing different aspects of the data, such as true-color composites or vegetation indices.
More bands typically mean larger payloads and potentially higher latency, especially if the bands are stored in separate files.
# Configure zooms and interval
min_zoom = 5
max_zoom = 15
zoom_levels = list(range(min_zoom, max_zoom + 1))
start = "2023-01-01T00:00:00Z"
end = "2023-01-07T23:59:59Z"
# Band sets to compare
asset_sets = {
"1 band": ["B04"],
"2 bands": ["B04", "B03"],
"3 bands": ["B04", "B03", "B02"],
}
tasks = []
labels = []
for label, assets in asset_sets.items():
ds = DatasetParams(
concept_id=concept_id,
backend="rasterio",
datetime_range=f"{start}/{end}",
bands=assets,
bands_regex="B[0-9][0-9]",
)
tasks.append(
benchmark_viewport(
endpoint=endpoint,
dataset=ds,
lng=lng,
lat=lat,
min_zoom=min_zoom,
max_zoom=max_zoom,
viewport_width=7,
viewport_height=7,
timeout_s=timeout_s,
)
)
labels.append(label)
dfs = await asyncio.gather(*tasks)
=== TiTiler-CMR Tile Benchmark (Global Pool) ===
Client: 2 physical / 4 logical cores | RAM: 30.89 GiB
Dataset: C2723754864-GES_DISC (rasterio)
Query params: 7 parameters
  concept_id: C2723754864-GES_DISC
  backend: rasterio
  datetime: 2023-01-01T00:00:00Z/2023-01-07T23:59:59Z
  bands: B04
  bands_regex: B[0-9][0-9]
  tile_format: png
  tile_scale: 1

=== TiTiler-CMR Tile Benchmark (Global Pool) ===
Client: 2 physical / 4 logical cores | RAM: 30.89 GiB
Dataset: C2723754864-GES_DISC (rasterio)
Query params: 8 parameters
  concept_id: C2723754864-GES_DISC
  backend: rasterio
  datetime: 2023-01-01T00:00:00Z/2023-01-07T23:59:59Z
  bands: B04
  bands: B03
  bands_regex: B[0-9][0-9]
  tile_format: png
  tile_scale: 1

=== TiTiler-CMR Tile Benchmark (Global Pool) ===
Client: 2 physical / 4 logical cores | RAM: 30.89 GiB
Dataset: C2723754864-GES_DISC (rasterio)
Query params: 9 parameters
  concept_id: C2723754864-GES_DISC
  backend: rasterio
  datetime: 2023-01-01T00:00:00Z/2023-01-07T23:59:59Z
  bands: B04
  bands: B03
  bands: B02
  bands_regex: B[0-9][0-9]
  tile_format: png
  tile_scale: 1

Total execution time: 26.082s
Total execution time: 27.484s
Total execution time: 28.554s
median_by_zoom = []
for df in dfs:
# New schema: 'zoom' and 'response_time_sec'
s = df.groupby("zoom")["response_time_sec"].median().reindex(zoom_levels)
median_by_zoom.append(s)
panel_df = pd.concat(median_by_zoom, axis=1)
panel_df.columns = labels
panel_df
| zoom | 1 band | 2 bands | 3 bands |
|---|---|---|---|
5 | 1.392389 | 0.930824 | 2.119112 |
6 | 1.358718 | 0.919507 | 1.673362 |
7 | 1.663322 | 0.974480 | 1.580107 |
8 | 1.313399 | 1.446751 | 1.333293 |
9 | 1.059130 | 1.242408 | 1.002137 |
10 | 0.828156 | 1.026818 | 0.891187 |
11 | 1.018971 | 1.143670 | 0.966605 |
12 | 1.212874 | 1.284740 | 1.133225 |
13 | 1.313594 | 0.948578 | 1.512228 |
14 | 1.218939 | 1.200256 | 1.398682 |
15 | 1.175385 | 1.415249 | 1.177417 |
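Before plotting, a quick ratio makes the band-count effect easy to read at each zoom level:
# Ratio of median latency: 3-band composite vs. single band, per zoom.
band_ratio = (panel_df["3 bands"] / panel_df["1 band"]).round(2)
band_ratio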
# --- plot all three lines together ---
fig, ax = plt.subplots(figsize=(6, 5))
for col in panel_df.columns:
ax.plot(zoom_levels, panel_df[col].values, marker="o", linewidth=2, label=col)
ax.set_xticks(zoom_levels) # exact zoom values
ax.set_xlabel("zoom Level")
ax.set_ylabel("Response Time (s)")
ax.grid(True, alpha=0.25)
fig.subplots_adjust(right=0.78)
ax.legend(frameon=False, loc="best")
plt.tight_layout()
plt.show()
Conclusion¶
In this notebook, we explored how to benchmark tile rendering performance in TiTiler-CMR using different datasets and backends. We observed how factors such as zoom level, temporal interval width, and dataset structure affect the latency of tile requests.
In general, with the xarray backend:
- Performance depends strongly on the zoom level.
- Each /tile request reads a single timestep, so the width of the datetime interval generally does not change tile latency.
With the rasterio backend:
- Each tile covers all granules intersecting the tile footprint and the selected datetime interval.
- Performance depends on the zoom level, the width of the datetime interval, and the band selection.
- Higher zoom levels (e.g., z > 8) tend to have lower and more stable latency because fewer granules intersect each tile; beyond roughly z ≈ 9, latency largely plateaus for many datasets.
Takeaways:
- Prefer single-day (or narrow) intervals for responsive rendering
- The bigger the time range, the more data needs to be scanned and processed
- Avoid very low zooms for heavy composites; consider minzoom ≥ 7 (see the quick check below)
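As a quick sanity check on that minzoom suggestion, you can compare the daily-interval rasterio summary below and at or above zoom 7:
# Median of per-zoom median latencies, below vs. at/above the suggested minzoom.
low = df_viewport_day_summary[df_viewport_day_summary["zoom"] < 7]
high = df_viewport_day_summary[df_viewport_day_summary["zoom"] >= 7]
print("median latency, z < 7 :", round(low["median_latency_s"].median(), 2), "s")
print("median latency, z >= 7:", round(high["median_latency_s"].median(), 2), "s")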
Further Reading¶