TiTiler-CMR: Statistics Benchmarking
This notebook shows how to benchmark the /timeseries/statistics
endpoint of a TiTiler-CMR deployment and understand how performance varies under different parameters.
In TiTiler-CMR, the /timeseries/statistics endpoint computes statistics for every point or interval along a time series, over a specified geometry. Its performance depends on several factors, which this notebook explores.
In this notebook, you'll learn:
- How to benchmark the /timeseries/statistics endpoint across different parameters
- Which factors affect the performance of the /timeseries/statistics endpoint in TiTiler-CMR
- Tips for using the endpoint effectively to avoid timeouts and performance issues
import pandas as pd
import json
from datacube_benchmark.titiler import (
DatasetParams,
benchmark_statistics,
create_bbox_feature,
)
endpoint = "https://staging.openveda.cloud/api/titiler-cmr"
Introduction
The /timeseries/statistics endpoint produces summary statistics for an area of interest (AOI) at every point along a time series. This typically involves reading multiple granules, reprojecting, resampling, and mosaicking them, and then computing statistics over the AOI.
This endpoint returns a GeoJSON FeatureCollection with statistics for each time point in the timeseries.
The performance of this endpoint can vary based on several factors, including:
- The size and complexity of the geometry (e.g., a small polygon vs a large bounding box)
- The number of granules that need to be read and processed to cover the geometry
- The length of the time series (i.e., how many time points, and therefore granules, must be processed)
First, define the parameters for the CMR datasets to benchmark. The DatasetParams class encapsulates all the information needed to interact with a specific dataset via TiTiler-CMR.
concept_id = "C2036881735-POCLOUD"
backend = "xarray"
datetime_range = "2022-03-01T00:00:01Z/2022-03-01T23:59:59Z"
variable = "analysed_sst"
step = "P1D"
temporal_mode = "point"
ds_xarray = DatasetParams(
concept_id=concept_id,
backend=backend,
datetime_range=datetime_range,
variable=variable,
step=step,
temporal_mode=temporal_mode,
)
concept_id = "C2021957657-LPCLOUD"
backend = "rasterio"
datetime_range = "2022-03-01T00:00:01Z/2022-03-01T23:59:59Z"
# No trailing commas here: they would silently wrap each value in a one-element tuple.
bands_regex = "B[0-9][0-9]"
bands = ["B04", "B03", "B02"]
step = "P1D"
temporal_mode = "point"
ds_rasterio = DatasetParams(
concept_id=concept_id,
backend=backend,
datetime_range=datetime_range,
bands_regex=bands_regex,
bands=bands,
step=step,
temporal_mode=temporal_mode,
)
GeoJSON Feature
The /timeseries/statistics
endpoint requires a GeoJSON Feature
or FeatureCollection
to define the area over which statistics will be computed.
The create_bbox_feature
function can be used to create a bounding box feature.
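For reference, a bounding-box Feature is a GeoJSON Polygon whose exterior ring visits the four corners and closes on the first one. The sketch below builds one by hand. `make_bbox_feature` is a hypothetical stand-in written for illustration, and its output may differ in detail from what `create_bbox_feature` actually returns.

```python
import json


def make_bbox_feature(min_lon: float, min_lat: float, max_lon: float, max_lat: float) -> dict:
    """Hypothetical helper: build a GeoJSON Feature for a lon/lat bounding box."""
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("expected min_lon < max_lon and min_lat < max_lat")
    # Exterior ring in counter-clockwise order; first and last positions match.
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],
    ]
    return {
        "type": "Feature",
        "properties": {},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


feature = make_bbox_feature(-98.676, 18.857, -81.623, 31.097)
print(json.dumps(feature["geometry"], indent=2))
```

Printing the geometry is a quick way to confirm that the corner order and longitude/latitude order are what you intended before sending a request.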
benchmark_statistics is a wrapper function that runs the benchmark for the /timeseries/statistics endpoint and returns a dictionary with the results, including the computed statistics and the elapsed time of the request.
Here is an example of how to use it:
gulf_geometry = create_bbox_feature(-98.676, 18.857, -81.623, 31.097)
stats_result = await benchmark_statistics(
endpoint=endpoint,
dataset=ds_xarray,
geometry=gulf_geometry,
timeout_s=300.0,
)
print("Statistics result:")
print(f" Success: {stats_result['success']}")
print(f" Elapsed: {stats_result['elapsed_s']:.2f}s")
print(f" Timesteps: {stats_result['n_timesteps']}")
=== TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) Statistics result: Success: True Elapsed: 6.37s Timesteps: 8
You can also access the statistics output from the endpoint easily: stats_result['statistics']
print(" Statistics:")
print(json.dumps(stats_result["statistics"], indent=2))
Statistics: { "2022-03-01T00:00:01+00:00": { "2022-03-01T00:00:00.000000000": { "min": 0.0, "max": 42.82999801635742, "mean": 0.3393020033836365, "count": 20898.5, "sum": 7090.90283203125, "std": 1.9955874881629714, "median": 0.0, "majority": 0.0, "minority": 0.044999994337558746, "unique": 1347.0, "histogram": [ [ 20614, 232, 65, 46, 20, 19, 14, 16, 4, 3 ], [ 0.0, 4.2829999923706055, 8.565999984741211, 12.848999977111816, 17.131999969482422, 21.415000915527344, 25.697999954223633, 29.980998992919922, 34.263999938964844, 38.547000885009766, 42.82999801635742 ] ], "valid_percent": 100.0, "masked_pixels": 0.0, "valid_pixels": 21033.0, "percentile_2": 0.0, "percentile_98": 4.244999885559082 } }, "2022-03-08T00:00:01+00:00": { "2022-03-08T00:00:00.000000000": { "min": 0.0, "max": 42.36000061035156, "mean": 0.9508299827575684, "count": 20898.5, "sum": 19870.919921875, "std": 3.2417449281425865, "median": 0.0, "majority": 0.0, "minority": 0.03999999538064003, "unique": 2620.0, "histogram": [ [ 19621, 719, 300, 172, 89, 58, 36, 24, 10, 4 ], [ 0.0, 4.236000061035156, 8.472000122070312, 12.708000183105469, 16.944000244140625, 21.18000030517578, 25.416000366210938, 29.652000427246094, 33.88800048828125, 38.124000549316406, 42.36000061035156 ] ], "valid_percent": 100.0, "masked_pixels": 0.0, "valid_pixels": 21033.0, "percentile_2": 0.0, "percentile_98": 12.290000915527344 } }, "2022-03-15T00:00:01+00:00": { "2022-03-15T00:00:00.000000000": { "min": 0.0, "max": 174.25502014160156, "mean": 10.431671142578125, "count": 20898.5, "sum": 218006.28125, "std": 22.99134730243641, "median": 0.0, "majority": 0.0, "minority": 0.05000000447034836, "unique": 6826.0, "histogram": [ [ 17249, 1821, 640, 378, 346, 306, 171, 94, 24, 4 ], [ 0.0, 17.42550277709961, 34.85100555419922, 52.27650833129883, 69.70201110839844, 87.12751770019531, 104.55301666259766, 121.978515625, 139.40402221679688, 156.82952880859375, 174.25502014160156 ] ], "valid_percent": 100.0, "masked_pixels": 0.0, 
"valid_pixels": 21033.0, "percentile_2": 0.0, "percentile_98": 96.41999816894531 } }, "2022-03-22T00:00:01+00:00": { "2022-03-22T00:00:00.000000000": { "min": 0.0, "max": 113.67498779296875, "mean": 3.880465507507324, "count": 20898.5, "sum": 81095.90625, "std": 12.315773775585773, "median": 0.0, "majority": 0.0, "minority": 0.07999999076128006, "unique": 3630.0, "histogram": [ [ 19041, 713, 387, 299, 231, 178, 119, 44, 16, 5 ], [ 0.0, 11.367498397827148, 22.734996795654297, 34.10249328613281, 45.469993591308594, 56.837493896484375, 68.20498657226562, 79.5724868774414, 90.93998718261719, 102.30748748779297, 113.67498779296875 ] ], "valid_percent": 100.0, "masked_pixels": 0.0, "valid_pixels": 21033.0, "percentile_2": 0.0, "percentile_98": 53.35499954223633 } }, "2022-03-29T00:00:01+00:00": { "2022-03-29T00:00:00.000000000": { "min": 0.0, "max": 17.045001983642578, "mean": 0.1486642062664032, "count": 20898.5, "sum": 3106.85888671875, "std": 0.7623889029192605, "median": 0.0, "majority": 0.0, "minority": 0.04500000178813934, "unique": 999.0, "histogram": [ [ 20389, 377, 145, 70, 29, 10, 5, 2, 3, 3 ], [ 0.0, 1.7045001983642578, 3.4090003967285156, 5.113500595092773, 6.818000793457031, 8.522500991821289, 10.227001190185547, 11.931501388549805, 13.636001586914062, 15.34050178527832, 17.045001983642578 ] ], "valid_percent": 100.0, "masked_pixels": 0.0, "valid_pixels": 21033.0, "percentile_2": 0.0, "percentile_98": 2.5299999713897705 } }, "2022-04-05T00:00:01+00:00": { "2022-04-05T00:00:00.000000000": { "min": 0.0, "max": 40.5050048828125, "mean": 0.5256912112236023, "count": 20898.5, "sum": 10986.1572265625, "std": 2.586293636818602, "median": 0.0, "majority": 0.0, "minority": 0.03999999538064003, "unique": 1562.0, "histogram": [ [ 20283, 294, 182, 121, 61, 37, 26, 11, 11, 7 ], [ 0.0, 4.050500392913818, 8.101000785827637, 12.151500701904297, 16.202001571655273, 20.25250244140625, 24.303001403808594, 28.35350227355957, 32.40400314331055, 36.45450210571289, 
40.5050048828125 ] ], "valid_percent": 100.0, "masked_pixels": 0.0, "valid_pixels": 21033.0, "percentile_2": 0.0, "percentile_98": 8.694999694824219 } }, "2022-04-12T00:00:01+00:00": { "2022-04-12T00:00:00.000000000": { "min": 0.0, "max": 64.42500305175781, "mean": 1.5356361865997314, "count": 20898.5, "sum": 32092.4921875, "std": 5.440761947003824, "median": 0.0, "majority": 0.0, "minority": 0.09499998390674591, "unique": 2873.0, "histogram": [ [ 19750, 534, 300, 192, 69, 64, 41, 56, 21, 6 ], [ 0.0, 6.442500114440918, 12.885000228881836, 19.327499389648438, 25.770000457763672, 32.212501525878906, 38.654998779296875, 45.09749984741211, 51.540000915527344, 57.98250198364258, 64.42500305175781 ] ], "valid_percent": 100.0, "masked_pixels": 0.0, "valid_pixels": 21033.0, "percentile_2": 0.0, "percentile_98": 20.049999237060547 } }, "2022-04-19T00:00:01+00:00": { "2022-04-19T00:00:00.000000000": { "min": 0.0, "max": 11.125, "mean": 0.10002636164426804, "count": 20898.5, "sum": 2090.40087890625, "std": 0.5136673785690946, "median": 0.0, "majority": 0.0, "minority": 0.06000000238418579, "unique": 807.0, "histogram": [ [ 20543, 252, 89, 74, 37, 15, 8, 10, 3, 2 ], [ 0.0, 1.1124999523162842, 2.2249999046325684, 3.3374998569488525, 4.449999809265137, 5.5625, 6.674999713897705, 7.78749942779541, 8.899999618530273, 10.012499809265137, 11.125 ] ], "valid_percent": 100.0, "masked_pixels": 0.0, "valid_pixels": 21033.0, "percentile_2": 0.0, "percentile_98": 1.274999976158142 } } }
The statistics results typically include several useful metrics for all points/intervals along a timeseries:
- min, max, mean, count, sum
- valid pixels, masked pixels, valid percentage
- percentiles (e.g., 98th percentile), data distribution histogram, unique values, median, std
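Because the statistics come back as a nested dictionary, it can help to flatten them into one row per timestep before analysis. The sketch below is one way to do that, assuming the two-level layout seen in the output above. The column names `request_time` and `data_time` are labels chosen here for the outer and inner keys, not names defined by the API.

```python
def flatten_statistics(statistics: dict) -> list[dict]:
    """Turn {outer_time: {inner_time: {metric: value}}} into one flat row per timestep."""
    rows = []
    for request_time, per_time in statistics.items():
        for data_time, metrics in per_time.items():
            row = {"request_time": request_time, "data_time": data_time}
            # Keep scalar metrics only; the histogram is a nested list.
            row.update({k: v for k, v in metrics.items() if k != "histogram"})
            rows.append(row)
    return rows


# Trimmed sample copied from the output above (two timesteps, three metrics).
sample = {
    "2022-03-01T00:00:01+00:00": {
        "2022-03-01T00:00:00.000000000": {
            "min": 0.0,
            "max": 42.82999801635742,
            "mean": 0.3393020033836365,
        }
    },
    "2022-03-08T00:00:01+00:00": {
        "2022-03-08T00:00:00.000000000": {
            "min": 0.0,
            "max": 42.36000061035156,
            "mean": 0.9508299827575684,
        }
    },
}
rows = flatten_statistics(sample)
for row in rows:
    print(row["request_time"], row["mean"])
```

The resulting list of rows can be passed straight to `pd.DataFrame(rows)` if you prefer to work with a table.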
The rasterio backend returns a similar set of statistics.
gulf_geometry = create_bbox_feature(-91.816, 47.491, -91.359, 47.716)
stats_result = await benchmark_statistics(
endpoint=endpoint,
dataset=ds_xarray,
geometry=gulf_geometry,
timeout_s=300.0,
)
print("Statistics result:")
print(f" Success: {stats_result['success']}")
print(f" Elapsed: {stats_result['elapsed_s']:.2f}s")
print(f" Timesteps: {stats_result['n_timesteps']}")
=== TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) Statistics result: Success: True Elapsed: 3.54s Timesteps: 8
Now, we want to test how the size of the geometry affects performance. We’ll use square bounding boxes centered on a chosen point and increase the edge length (degrees) to see how it impacts the response time.
def bbox_square_feature(center_lon: float, center_lat: float, edge_deg: float):
    """
    Build a square bbox Feature of size edge_deg × edge_deg centered at (lon, lat).
    """
    half = edge_deg / 2.0
    min_lon, min_lat = center_lon - half, center_lat - half
    max_lon, max_lat = center_lon + half, center_lat + half
    return create_bbox_feature(min_lon, min_lat, max_lon, max_lat)
center_lon, center_lat = -91.58, 47.60
edge_sizes_deg = [
20,
10,
5,
1,
0.5,
0.1,
] # caution: large areas may time out for high-res products
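Edge lengths in degrees can hide how quickly the requested area grows. The sketch below estimates the ground area of each box using a rough spherical-Earth conversion; the constant and the use of the center latitude for the whole box are simplifying assumptions, so treat the numbers as order-of-magnitude guides only.

```python
import math

KM_PER_DEG_LAT = 111.32  # rough spherical-Earth value; an approximation


def approx_bbox_area_km2(center_lat: float, edge_deg: float) -> float:
    """Approximate ground area of an edge_deg x edge_deg box centered at center_lat."""
    height_km = edge_deg * KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of latitude.
    width_km = edge_deg * KM_PER_DEG_LAT * math.cos(math.radians(center_lat))
    return height_km * width_km


for edge in [20, 10, 5, 1, 0.5, 0.1]:
    print(f"{edge:>5} deg -> ~{approx_bbox_area_km2(47.60, edge):,.0f} km^2")
```

Area scales with the square of the edge length, so the 20-degree box covers roughly 40,000 times the area of the 0.1-degree box. For a fixed-resolution dataset, the number of pixels to read grows by a similar factor.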
geom = bbox_square_feature(center_lon, center_lat, 0.1)
stats_result = await benchmark_statistics(
endpoint=endpoint,
dataset=ds_xarray,
geometry=geom,
timeout_s=300.0,
)
print("Statistics result:")
print(f" Success: {stats_result['success']}")
print(f" Elapsed: {stats_result['elapsed_s']:.2f}s")
print(f" Timesteps: {stats_result['n_timesteps']}")
print(f" Statistics: {stats_result['statistics']}")
=== TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) Statistics result: Success: True Elapsed: 3.05s Timesteps: 8 Statistics: {'2022-03-01T00:00:01+00:00': {'2022-03-01T00:00:00.000000000': {'min': 0.0, 'max': 0.0, 'mean': 0.0, 'count': 1.1200000047683716, 'sum': 0.0, 'std': 0.0, 'median': 0.0, 'majority': 0.0, 'minority': 0.0, 'unique': 1.0, 'histogram': [[0, 0, 0, 0, 0, 4, 0, 0, 0, 0], [-0.5, -0.4000000059604645, -0.30000001192092896, -0.19999998807907104, -0.09999999403953552, 0.0, 0.10000002384185791, 0.19999998807907104, 0.30000001192092896, 0.40000003576278687, 0.5]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 4.0, 'percentile_2': 0.0, 'percentile_98': 0.0}}, '2022-03-08T00:00:01+00:00': {'2022-03-08T00:00:00.000000000': {'min': 0.0, 'max': 0.3349999785423279, 'mean': 0.11964284628629684, 'count': 1.1200000047683716, 'sum': 0.1339999884366989, 'std': 0.16051772077980764, 'median': 0.0, 'majority': 0.0, 'minority': 0.3349999785423279, 'unique': 2.0, 'histogram': [[3, 0, 0, 0, 0, 0, 0, 0, 0, 1], [0.0, 0.03349999710917473, 0.06699999421834946, 0.10049998760223389, 0.1339999884366989, 0.16749998927116394, 0.20099997520446777, 0.2344999760389328, 0.2679999768733978, 0.30149996280670166, 0.3349999785423279]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 4.0, 'percentile_2': 0.0, 'percentile_98': 0.3349999785423279}}, '2022-03-15T00:00:01+00:00': {'2022-03-15T00:00:00.000000000': {'min': 0.0, 'max': 0.03999999910593033, 'mean': 0.02049107290804386, 'count': 1.1200000047683716, 'sum': 0.022950001060962677, 'std': 0.015410452994419422, 'median': 0.014999999664723873, 'majority': 0.0, 'minority': 0.0, 'unique': 4.0, 'histogram': [[1, 1, 0, 1, 0, 0, 0, 0, 0, 1], [0.0, 0.003999999724328518, 0.007999999448657036, 0.011999999172985554, 0.01599999889731407, 0.019999999552965164, 0.023999998345971107, 0.02799999713897705, 0.03199999779462814, 
0.035999998450279236, 0.03999999910593033]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 4.0, 'percentile_2': 0.0, 'percentile_98': 0.03999999910593033}}, '2022-03-22T00:00:01+00:00': {'2022-03-22T00:00:00.000000000': {'min': 12.900001525878906, 'max': 15.210000991821289, 'mean': 13.628705978393555, 'count': 1.1200000047683716, 'sum': 15.264150619506836, 'std': 0.6724586834677247, 'median': 13.289999008178711, 'majority': 12.900001525878906, 'minority': 12.900001525878906, 'unique': 4.0, 'histogram': [[1, 1, 0, 1, 0, 0, 0, 0, 0, 1], [12.900001525878906, 13.131001472473145, 13.362001419067383, 13.593001365661621, 13.82400131225586, 14.055001258850098, 14.286001205444336, 14.517001152038574, 14.748001098632812, 14.97900104522705, 15.210000991821289]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 4.0, 'percentile_2': 12.900001525878906, 'percentile_98': 15.210000991821289}}, '2022-03-29T00:00:01+00:00': {'2022-03-29T00:00:00.000000000': {'min': 0.0, 'max': 0.019999999552965164, 'mean': 0.008705356158316135, 'count': 1.1200000047683716, 'sum': 0.009749999269843102, 'std': 0.008807097580713323, 'median': 0.004999999888241291, 'majority': 0.004999999888241291, 'minority': 0.0, 'unique': 3.0, 'histogram': [[1, 0, 2, 0, 0, 0, 0, 0, 0, 1], [0.0, 0.001999999862164259, 0.003999999724328518, 0.005999999586492777, 0.007999999448657036, 0.009999999776482582, 0.011999999172985554, 0.013999998569488525, 0.01599999889731407, 0.017999999225139618, 0.019999999552965164]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 4.0, 'percentile_2': 0.0, 'percentile_98': 0.019999999552965164}}, '2022-04-05T00:00:01+00:00': {'2022-04-05T00:00:00.000000000': {'min': 0.0, 'max': 0.059999994933605194, 'mean': 0.01160714216530323, 'count': 1.1200000047683716, 'sum': 0.012999999336898327, 'std': 0.020897434586007825, 'median': 0.0, 'majority': 0.0, 'minority': 0.02499999850988388, 'unique': 3.0, 'histogram': [[2, 0, 0, 0, 1, 0, 0, 0, 0, 1], [0.0, 
0.005999999586492777, 0.011999999172985554, 0.017999999225139618, 0.023999998345971107, 0.029999997466802597, 0.035999998450279236, 0.041999995708465576, 0.047999996691942215, 0.053999997675418854, 0.059999994933605194]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 4.0, 'percentile_2': 0.0, 'percentile_98': 0.059999994933605194}}, '2022-04-12T00:00:01+00:00': {'2022-04-12T00:00:00.000000000': {'min': 8.375, 'max': 11.144999504089355, 'mean': 9.746606826782227, 'count': 1.1200000047683716, 'sum': 10.916199684143066, 'std': 1.26733015130025, 'median': 8.894999504089355, 'majority': 8.375, 'minority': 8.375, 'unique': 4.0, 'histogram': [[1, 1, 0, 0, 0, 0, 0, 0, 1, 1], [8.375, 8.652000427246094, 8.928999900817871, 9.205999374389648, 9.482999801635742, 9.760000228881836, 10.036999702453613, 10.31399917602539, 10.590999603271484, 10.868000030517578, 11.144999504089355]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 4.0, 'percentile_2': 8.375, 'percentile_98': 11.144999504089355}}, '2022-04-19T00:00:01+00:00': {'2022-04-19T00:00:00.000000000': {'min': 0.0, 'max': 0.0, 'mean': 0.0, 'count': 1.1200000047683716, 'sum': 0.0, 'std': 0.0, 'median': 0.0, 'majority': 0.0, 'minority': 0.0, 'unique': 1.0, 'histogram': [[0, 0, 0, 0, 0, 4, 0, 0, 0, 0], [-0.5, -0.4000000059604645, -0.30000001192092896, -0.19999998807907104, -0.09999999403953552, 0.0, 0.10000002384185791, 0.19999998807907104, 0.30000001192092896, 0.40000003576278687, 0.5]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 4.0, 'percentile_2': 0.0, 'percentile_98': 0.0}}}
Now, let's run the benchmark for each geometry size and collect the results.
async def run_stats_benchmark_for_sizes(
    ds: DatasetParams,
    edge_sizes: list[float],
    *,
    endpoint: str,
    center_lon: float,
    center_lat: float,
    timeout_s: float = 180.0,
) -> pd.DataFrame:
    # Run one benchmark per edge size and collect a summary row for each.
    rows = []
    for edge in edge_sizes:
        geom = bbox_square_feature(center_lon, center_lat, edge)
        out = await benchmark_statistics(
            endpoint=endpoint,
            dataset=ds,
            geometry=geom,
            timeout_s=timeout_s,
        )
        rows.append(
            {
                "backend": ds.backend,
                "concept_id": ds.concept_id,
                "edge_deg": edge,
                "success": out.get("success", False),
                "status_code": out.get("status_code", None),
                "elapsed_s": out.get("elapsed_s", None),
                "n_timesteps": out.get("n_timesteps", 0),
            }
        )
    return pd.DataFrame(rows)
df_xr = await run_stats_benchmark_for_sizes(
ds_xarray,
edge_sizes_deg,
endpoint=endpoint,
center_lon=center_lon,
center_lat=center_lat,
timeout_s=180.0,
)
df_xr
=== TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) === TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) === TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) === TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) === TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) === TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray)
| | backend | concept_id | edge_deg | success | status_code | elapsed_s | n_timesteps |
|---|---|---|---|---|---|---|---|
| 0 | xarray | C2723754864-GES_DISC | 20.0 | True | 200 | 3.333515 | 8 |
| 1 | xarray | C2723754864-GES_DISC | 10.0 | True | 200 | 5.793644 | 8 |
| 2 | xarray | C2723754864-GES_DISC | 5.0 | True | 200 | 5.391589 | 8 |
| 3 | xarray | C2723754864-GES_DISC | 1.0 | True | 200 | 2.780133 | 8 |
| 4 | xarray | C2723754864-GES_DISC | 0.5 | True | 200 | 1.837056 | 8 |
| 5 | xarray | C2723754864-GES_DISC | 0.1 | True | 200 | 1.923999 | 8 |
Tip: If you see timeouts or failures at larger sizes, try smaller AOIs first and/or choose a coarser step.
import matplotlib.pyplot as plt
def plot_elapsed_vs_size(
    df: pd.DataFrame, *, title: str = "Elapsed time vs geometry size (xarray)"
):
    sdf = df.copy()
    sdf = sdf[pd.notnull(sdf["elapsed_s"])].sort_values("edge_deg")
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.plot(sdf["edge_deg"], sdf["elapsed_s"], marker="o")
    ax.set_xlabel("Square edge size (degrees)")
    ax.set_ylabel("Elapsed time (s)")
    ax.set_title(title)
    ax.grid(True, axis="y", alpha=0.2)
    plt.tight_layout()
    return fig, ax
_ = plot_elapsed_vs_size(df_xr)
plt.show()
Because this xarray dataset is relatively low resolution, even the largest bounding boxes complete without timing out. Let's try the rasterio dataset with a 5-degree bounding box.
geom = bbox_square_feature(center_lon, center_lat, 5)
stats_result = await benchmark_statistics(
endpoint=endpoint,
dataset=ds_rasterio,
geometry=geom,
timeout_s=300.0,
)
print("Statistics result:")
print(f" Success: {stats_result['success']}")
print(f" Elapsed: {stats_result['elapsed_s']:.2f}s")
print(f" Timesteps: {stats_result['n_timesteps']}")
print(f" Statistics: {stats_result['statistics']}")
=== TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2021957657-LPCLOUD (rasterio) ~~~~~~~~~~~~~~~~ ERROR JSON REQUEST ~~~~~~~~~~~~~~~~ URL: https://staging.openveda.cloud/api/titiler-cmr/timeseries/statistics?concept_id=C2021957657-LPCLOUD&backend=rasterio&datetime=2022-03-01T00%3A00%3A01Z%2F2022-03-01T23%3A59%3A59Z&bands=%5B%27B04%27%2C+%27B03%27%2C+%27B02%27%5D&bands_regex=B%5B0-9%5D%5B0-9%5D&step=P1D&temporal_mode=point Error: 400 Bad Request Body: {"detail":"The AOI for this request is too large for the /statistics endpoint for this dataset. Try again with either a smaller AOI"} Statistics result: Success: False
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) Cell In[55], line 10 8 print("Statistics result:") 9 print(f" Success: {stats_result['success']}") ---> 10 print(f" Elapsed: {stats_result['elapsed_s']:.2f}s") 11 print(f" Timesteps: {stats_result['n_timesteps']}") 12 print(f" Statistics: {stats_result['statistics']}") TypeError: unsupported format string passed to NoneType.__format__
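The TypeError above comes from applying the `:.2f` format to `elapsed_s`, which is None when the request fails. A minimal sketch of a more defensive summary is shown below. `format_result` is a hypothetical helper, and it only assumes the result keys already used in this notebook (`success`, `status_code`, `elapsed_s`, `n_timesteps`).

```python
def format_result(result: dict) -> str:
    """Summarize a benchmark result without assuming timing fields are populated."""
    lines = [f"  Success: {result.get('success', False)}"]
    elapsed = result.get("elapsed_s")
    # Only apply the float format spec when a number is present.
    lines.append(f"  Elapsed: {elapsed:.2f}s" if elapsed is not None else "  Elapsed: n/a")
    lines.append(f"  Timesteps: {result.get('n_timesteps', 0)}")
    if not result.get("success", False):
        lines.append(f"  Status code: {result.get('status_code')}")
    return "\n".join(lines)


# Hypothetical results shaped like the ones in this notebook.
failed = {"success": False, "status_code": 400, "elapsed_s": None, "n_timesteps": 0}
succeeded = {"success": True, "status_code": 200, "elapsed_s": 3.3649, "n_timesteps": 8}
print(format_result(failed))
print(format_result(succeeded))
```

Using `print(format_result(stats_result))` in place of the individual print calls lets a failed request, such as the oversized AOI here, be reported without interrupting the notebook.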
Time Range
For statistics benchmarking, the number of timesteps matters too. Longer time series (more timesteps) will generally take longer to process. This sweep varies the time window length (number of timesteps) while keeping the geometry size constant to see how that affects performance.
The time series API supports the following parameters:
- datetime (str): Either a date-time, an interval, or a comma-separated list of date-times or intervals. Date and time expressions adhere to the RFC 3339 format (e.g., '2020-06-01T09:00:00Z').
- step (str): Width of individual time steps, expressed as an ISO 8601 duration (e.g., 'P1D' for one day, 'P1W' for one week).
- temporal_mode (str): if "point", queries will be made for the individual timestamps along the timeseries. If "interval", queries will be made for the periods between each timestamp along the timeseries.
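To estimate how many timesteps a request will produce before sending it, the range can be stepped through on the client side. The sketch below is an approximation written for this notebook, not the server's implementation: it assumes "point" mode starts at the beginning of the range and advances by `step`, and `parse_simple_duration` handles only the `PnD` and `PnW` forms.

```python
import re
from datetime import datetime, timedelta


def parse_simple_duration(step: str) -> timedelta:
    """Parse a small subset of ISO 8601 durations: PnD and PnW only."""
    match = re.fullmatch(r"P(\d+)([DW])", step)
    if match is None:
        raise ValueError(f"unsupported duration for this sketch: {step!r}")
    count, unit = int(match.group(1)), match.group(2)
    return timedelta(days=count) if unit == "D" else timedelta(weeks=count)


def expected_timestamps(datetime_range: str, step: str) -> list[datetime]:
    """List the timestamps start, start+step, ... that fall within the range."""
    start_s, end_s = datetime_range.split("/")
    # Replace the trailing Z so fromisoformat also works on Python < 3.11.
    start = datetime.fromisoformat(start_s.replace("Z", "+00:00"))
    end = datetime.fromisoformat(end_s.replace("Z", "+00:00"))
    delta = parse_simple_duration(step)
    stamps = []
    current = start
    while current <= end:
        stamps.append(current)
        current += delta
    return stamps


weekly = expected_timestamps("2022-03-01T00:00:01Z/2022-04-20T23:59:59Z", "P1W")
print(len(weekly), weekly[0].isoformat(), weekly[-1].isoformat())
```

For this weekly range the sketch yields eight timestamps, from 2022-03-01 to 2022-04-19, which is consistent with the eight timesteps reported by the weekly benchmark in this notebook.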
# Daily step over a two-day window
ds_xarray = DatasetParams(
concept_id="C2723754864-GES_DISC",
backend="xarray",
datetime_range="2022-03-01T00:00:01Z/2022-03-02T23:59:59Z",
variable="precipitation",
step="P1D",
temporal_mode="point",
)
gulf_geometry = create_bbox_feature(-98.676, 18.857, -81.623, 31.097)
stats_result = await benchmark_statistics(
endpoint=endpoint,
dataset=ds_xarray,
geometry=gulf_geometry,
timeout_s=300.0,
)
print("Statistics result:")
print(f" Success: {stats_result['success']}")
print(f" Elapsed: {stats_result['elapsed_s']:.2f}s")
print(f" Timesteps: {stats_result['n_timesteps']}")
print(f" Statistics: {stats_result['statistics']}")
=== TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) Statistics result: Success: True Elapsed: 3.36s Timesteps: 2 Statistics: {'2022-03-01T00:00:01+00:00': {'2022-03-01T00:00:00.000000000': {'min': 0.0, 'max': 42.82999801635742, 'mean': 0.3393020033836365, 'count': 20898.5, 'sum': 7090.90283203125, 'std': 1.9955874881629714, 'median': 0.0, 'majority': 0.0, 'minority': 0.044999994337558746, 'unique': 1347.0, 'histogram': [[20614, 232, 65, 46, 20, 19, 14, 16, 4, 3], [0.0, 4.2829999923706055, 8.565999984741211, 12.848999977111816, 17.131999969482422, 21.415000915527344, 25.697999954223633, 29.980998992919922, 34.263999938964844, 38.547000885009766, 42.82999801635742]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 4.244999885559082}}, '2022-03-02T00:00:01+00:00': {'2022-03-02T00:00:00.000000000': {'min': 0.0, 'max': 18.755001068115234, 'mean': 0.04489568620920181, 'count': 20898.5, 'sum': 938.2525024414062, 'std': 0.42384278584099, 'median': 0.0, 'majority': 0.0, 'minority': 0.08000000566244125, 'unique': 486.0, 'histogram': [[20900, 77, 27, 13, 10, 2, 1, 1, 1, 1], [0.0, 1.8755000829696655, 3.751000165939331, 5.626500129699707, 7.502000331878662, 9.377500534057617, 11.253000259399414, 13.128500938415527, 15.004000663757324, 16.879501342773438, 18.755001068115234]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 0.42500001192092896}}}
# Weekly step over a roughly 50-day window
ds_xarray = DatasetParams(
concept_id="C2723754864-GES_DISC",
backend="xarray",
datetime_range="2022-03-01T00:00:01Z/2022-04-20T23:59:59Z",
variable="precipitation",
step="P1W",
temporal_mode="point",
)
gulf_geometry = create_bbox_feature(-98.676, 18.857, -81.623, 31.097)
stats_result = await benchmark_statistics(
endpoint=endpoint,
dataset=ds_xarray,
geometry=gulf_geometry,
timeout_s=300.0,
)
print("Statistics result:")
print(f" Success: {stats_result['success']}")
print(f" Elapsed: {stats_result['elapsed_s']:.2f}s")
print(f" Timesteps: {stats_result['n_timesteps']}")
print(f" Statistics: {stats_result['statistics']}")
=== TiTiler-CMR Statistics Benchmark === Client: 2 physical / 4 logical cores | RAM: 30.89 GiB Dataset: C2723754864-GES_DISC (xarray) Statistics result: Success: True Elapsed: 7.28s Timesteps: 8 Statistics: {'2022-03-01T00:00:01+00:00': {'2022-03-01T00:00:00.000000000': {'min': 0.0, 'max': 42.82999801635742, 'mean': 0.3393020033836365, 'count': 20898.5, 'sum': 7090.90283203125, 'std': 1.9955874881629714, 'median': 0.0, 'majority': 0.0, 'minority': 0.044999994337558746, 'unique': 1347.0, 'histogram': [[20614, 232, 65, 46, 20, 19, 14, 16, 4, 3], [0.0, 4.2829999923706055, 8.565999984741211, 12.848999977111816, 17.131999969482422, 21.415000915527344, 25.697999954223633, 29.980998992919922, 34.263999938964844, 38.547000885009766, 42.82999801635742]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 4.244999885559082}}, '2022-03-08T00:00:01+00:00': {'2022-03-08T00:00:00.000000000': {'min': 0.0, 'max': 42.36000061035156, 'mean': 0.9508299827575684, 'count': 20898.5, 'sum': 19870.919921875, 'std': 3.2417449281425865, 'median': 0.0, 'majority': 0.0, 'minority': 0.03999999538064003, 'unique': 2620.0, 'histogram': [[19621, 719, 300, 172, 89, 58, 36, 24, 10, 4], [0.0, 4.236000061035156, 8.472000122070312, 12.708000183105469, 16.944000244140625, 21.18000030517578, 25.416000366210938, 29.652000427246094, 33.88800048828125, 38.124000549316406, 42.36000061035156]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 12.290000915527344}}, '2022-03-15T00:00:01+00:00': {'2022-03-15T00:00:00.000000000': {'min': 0.0, 'max': 174.25502014160156, 'mean': 10.431671142578125, 'count': 20898.5, 'sum': 218006.28125, 'std': 22.99134730243641, 'median': 0.0, 'majority': 0.0, 'minority': 0.05000000447034836, 'unique': 6826.0, 'histogram': [[17249, 1821, 640, 378, 346, 306, 171, 94, 24, 4], [0.0, 17.42550277709961, 34.85100555419922, 52.27650833129883, 69.70201110839844, 
87.12751770019531, 104.55301666259766, 121.978515625, 139.40402221679688, 156.82952880859375, 174.25502014160156]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 96.41999816894531}}, '2022-03-22T00:00:01+00:00': {'2022-03-22T00:00:00.000000000': {'min': 0.0, 'max': 113.67498779296875, 'mean': 3.880465507507324, 'count': 20898.5, 'sum': 81095.90625, 'std': 12.315773775585773, 'median': 0.0, 'majority': 0.0, 'minority': 0.07999999076128006, 'unique': 3630.0, 'histogram': [[19041, 713, 387, 299, 231, 178, 119, 44, 16, 5], [0.0, 11.367498397827148, 22.734996795654297, 34.10249328613281, 45.469993591308594, 56.837493896484375, 68.20498657226562, 79.5724868774414, 90.93998718261719, 102.30748748779297, 113.67498779296875]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 53.35499954223633}}, '2022-03-29T00:00:01+00:00': {'2022-03-29T00:00:00.000000000': {'min': 0.0, 'max': 17.045001983642578, 'mean': 0.1486642062664032, 'count': 20898.5, 'sum': 3106.85888671875, 'std': 0.7623889029192605, 'median': 0.0, 'majority': 0.0, 'minority': 0.04500000178813934, 'unique': 999.0, 'histogram': [[20389, 377, 145, 70, 29, 10, 5, 2, 3, 3], [0.0, 1.7045001983642578, 3.4090003967285156, 5.113500595092773, 6.818000793457031, 8.522500991821289, 10.227001190185547, 11.931501388549805, 13.636001586914062, 15.34050178527832, 17.045001983642578]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 2.5299999713897705}}, '2022-04-05T00:00:01+00:00': {'2022-04-05T00:00:00.000000000': {'min': 0.0, 'max': 40.5050048828125, 'mean': 0.5256912112236023, 'count': 20898.5, 'sum': 10986.1572265625, 'std': 2.586293636818602, 'median': 0.0, 'majority': 0.0, 'minority': 0.03999999538064003, 'unique': 1562.0, 'histogram': [[20283, 294, 182, 121, 61, 37, 26, 11, 11, 7], [0.0, 4.050500392913818, 8.101000785827637, 
12.151500701904297, 16.202001571655273, 20.25250244140625, 24.303001403808594, 28.35350227355957, 32.40400314331055, 36.45450210571289, 40.5050048828125]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 8.694999694824219}}, '2022-04-12T00:00:01+00:00': {'2022-04-12T00:00:00.000000000': {'min': 0.0, 'max': 64.42500305175781, 'mean': 1.5356361865997314, 'count': 20898.5, 'sum': 32092.4921875, 'std': 5.440761947003824, 'median': 0.0, 'majority': 0.0, 'minority': 0.09499998390674591, 'unique': 2873.0, 'histogram': [[19750, 534, 300, 192, 69, 64, 41, 56, 21, 6], [0.0, 6.442500114440918, 12.885000228881836, 19.327499389648438, 25.770000457763672, 32.212501525878906, 38.654998779296875, 45.09749984741211, 51.540000915527344, 57.98250198364258, 64.42500305175781]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 20.049999237060547}}, '2022-04-19T00:00:01+00:00': {'2022-04-19T00:00:00.000000000': {'min': 0.0, 'max': 11.125, 'mean': 0.10002636164426804, 'count': 20898.5, 'sum': 2090.40087890625, 'std': 0.5136673785690946, 'median': 0.0, 'majority': 0.0, 'minority': 0.06000000238418579, 'unique': 807.0, 'histogram': [[20543, 252, 89, 74, 37, 15, 8, 10, 3, 2], [0.0, 1.1124999523162842, 2.2249999046325684, 3.3374998569488525, 4.449999809265137, 5.5625, 6.674999713897705, 7.78749942779541, 8.899999618530273, 10.012499809265137, 11.125]], 'valid_percent': 100.0, 'masked_pixels': 0.0, 'valid_pixels': 21033.0, 'percentile_2': 0.0, 'percentile_98': 1.274999976158142}}}
Here is an example plot that can be created easily using the /timeseries/statistics
endpoint:
import numpy as np
from datetime import datetime
data = stats_result["statistics"]
dates = []
means = []
stds = []
for date_str, values in data.items():
    dates.append(datetime.fromisoformat(date_str))
    inner_data = list(values.values())[0]
    means.append(inner_data["mean"])
    stds.append(inner_data["std"])
plt.figure(figsize=(12, 6))
plt.plot(dates, means, linestyle="-", marker="o", linewidth=2, label="Mean")
plt.fill_between(
dates,
np.array(means) - np.array(stds),
np.array(means) + np.array(stds),
alpha=0.2,
color="b",
label="±1 Standard Deviation",
)
plt.xlabel("Date")
plt.ylabel("Precipitation (mm)")  # assumes the variable is in mm; adjust for your dataset
plt.title("Precipitation Statistics Over Time")
plt.legend()
plt.xticks(rotation=45)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Print some summary statistics
print(f"Date range: {min(dates)} to {max(dates)}")
print(f"Mean precipitation range: {min(means):.2f} to {max(means):.2f}")
print(f"Average standard deviation: {np.mean(stds):.2f}")
Date range: 2022-03-01 00:00:01+00:00 to 2022-04-19 00:00:01+00:00 Mean precipitation range: 0.10 to 10.43 Average standard deviation: 6.23
Conclusion
In this notebook, we explored how to benchmark the /timeseries/statistics
endpoint of a TiTiler-CMR deployment. We examined how different parameters, such as geometry size and time range, impact the performance of this endpoint.
In general, for high-resolution datasets with many granules, it's advisable to use smaller AOIs and shorter time ranges to ensure timely responses and avoid timeouts.
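One way to apply the "shorter time ranges" advice is to split a long range into consecutive windows and send one request per window. The sketch below shows only the splitting step. `split_datetime_range` is a hypothetical helper that assumes UTC timestamps in the `start/end` form used throughout this notebook.

```python
from datetime import datetime, timedelta


def split_datetime_range(datetime_range: str, max_days: int) -> list[str]:
    """Split 'start/end' into consecutive 'start/end' windows of at most max_days each."""
    start_s, end_s = datetime_range.split("/")
    start = datetime.fromisoformat(start_s.replace("Z", "+00:00"))
    end = datetime.fromisoformat(end_s.replace("Z", "+00:00"))
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # inputs are assumed to be UTC
    windows = []
    current = start
    while current <= end:
        # Each window stops one second before the next begins, so none overlap.
        window_end = min(current + timedelta(days=max_days) - timedelta(seconds=1), end)
        windows.append(f"{current.strftime(fmt)}/{window_end.strftime(fmt)}")
        current = window_end + timedelta(seconds=1)
    return windows


windows = split_datetime_range("2022-03-01T00:00:01Z/2022-04-20T23:59:59Z", max_days=14)
for window in windows:
    print(window)
```

Each window could then be used as the `datetime_range` of its own DatasetParams. Choosing a window length that is a multiple of the step (14 days for a weekly step, for example) should keep the timestamps aligned with those of a single long request, although that depends on how the server anchors the steps.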