Architecture
Principles
- A deployable component, not a CI job. The benchmark runs on real infrastructure against real data. CI lints, unit-tests the harness, builds the image, and proves the stack deploys — it never runs a benchmark.
- Datasets and runs are configuration, not code. Adding a dataset or a target format is a new config file plus a registration — never a change to CI or the deployment manifests.
- The harness is importable and service-free. All live dependencies (TiTiler, object storage) belong to the deployment. The metric logic is unit tested in isolation; the geo/S3 libraries are optional extras.
The two layers
flowchart TB
subgraph Harness["Harness (cng_benchmark, container image)"]
CLI["cli — run / seed / version"]
Runner["runner — run_conversion_benchmark"]
subgraph Metrics["metrics/"]
OBJ["objects — size + tier fit"]
WR["write — conversion throughput"]
RD["read — /vsis3 windowed reads"]
DI["display — TiTiler tiles"]
end
Formats["formats/ — FormatAdapter (cog, …)"]
Storage["storage — per-role S3 + gdal_env"]
Report["report — result.json + summary.md"]
Config["config — pydantic schema"]
Registry["registry — FORMATS / DATASETS"]
end
subgraph Deploy["Deployment (compose / Helm)"]
Store[("S3 / MinIO")]
TiTiler["TiTiler service"]
end
Config --> Runner
Registry --> Runner
Runner --> Formats
Runner --> Metrics
Metrics --> Storage
Metrics --> Report
Storage <--> Store
DI <--> TiTiler
TiTiler <--> Store
- Harness: the Python package, shipped as the runner image (docker/Dockerfile.runner). The CLI is a thin shell over the runner; FormatAdapter is the plug-in seam for formats; config + registry are the data-not-code seam.
- Deployment: the runner plus its service dependencies, wired by docker-compose (local) and the Helm chart (Kubernetes). See Deployment.
A benchmark run (COG end-to-end)
run_conversion_benchmark orchestrates one run. Which metrics execute is driven
by config.metrics; the conversion always happens (the other metrics need the
produced object).
sequenceDiagram
participant R as Runner
participant Src as Source store
participant Conv as Format adapter
participant Sink as Sink store
participant T as TiTiler
R->>Src: read source in place (GDAL /vsis3)
R->>Conv: convert → COG (write metric times this)
R->>Sink: publish produced COG
R->>R: object-size profile + tier fit
R->>Sink: windowed range reads (read metric)
R->>T: request tiles (display metric)
T->>Sink: read COG (GDAL /vsis3)
R->>Sink: write result.json + summary.md
The source is read in place over the network rather than pre-downloaded, so the cost of reading out of the (often archived) source is part of the measured conversion — not laundered away by staging it to fast local disk first.
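The ordering described above (convert unconditionally, then run only the metrics named in config.metrics) can be sketched as follows. This is a minimal illustration, not the harness's actual code: run_benchmark_sketch, the metric callables, and the URIs are hypothetical stand-ins, and the real run_conversion_benchmark also times the conversion and writes the report.

```python
from typing import Callable, Dict, List

# Hypothetical: a metric is a callable that takes the produced object's URI
# and returns one scalar. The real metrics/ modules are richer than this.
MetricFn = Callable[[str], float]


def run_benchmark_sketch(
    source_uri: str,
    enabled_metrics: List[str],
    convert: Callable[[str], str],
    metric_fns: Dict[str, MetricFn],
) -> Dict[str, float]:
    """Always convert first, then run only the metrics the config enables."""
    # Unconditional: every later metric needs the produced object.
    produced_uri = convert(source_uri)
    results: Dict[str, float] = {}
    for name in enabled_metrics:
        if name not in metric_fns:
            raise KeyError(f"unknown metric: {name}")
        results[name] = metric_fns[name](produced_uri)
    return results


# Stand-in converter and metrics so the sketch runs without GDAL or S3.
calls: List[str] = []


def fake_convert(uri: str) -> str:
    calls.append("convert")
    return uri + ".cog.tif"


fake_metrics: Dict[str, MetricFn] = {
    "read": lambda uri: 1.0,
    "display": lambda uri: 2.0,
}

out = run_benchmark_sketch("s3://bucket/scene", ["read"], fake_convert, fake_metrics)
```

With only "read" enabled, the conversion still runs once and the display metric is skipped.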
Result schema
A run produces a BenchmarkRun (models.py): the run context (timestamp, tool
versions, dataset/format/params), the first-class ObjectSizeProfile
(distribution percentiles, histogram, and tier fitness from tiers.py), and a
list of MetricResult scalars. It is serialised to result.json and rendered
to a human summary.md by report.py.
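As a rough picture of that shape, the sketch below models the three pieces with standard-library dataclasses. The class names come from the text above; every field name, the use of dataclasses rather than whatever models.py actually uses, and all values are assumptions made for illustration only.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class MetricResult:
    name: str    # assumed field names, not taken from models.py
    value: float
    unit: str


@dataclass
class ObjectSizeProfile:
    percentiles: Dict[str, float]   # e.g. "p50" -> size in bytes
    histogram: Dict[str, int]       # bucket label -> object count
    tier_fitness: Dict[str, float]  # tier name -> fraction of objects that fit


@dataclass
class BenchmarkRun:
    timestamp: str
    tool_versions: Dict[str, str]
    dataset: str
    format: str
    params: Dict[str, str]
    object_sizes: ObjectSizeProfile
    metrics: List[MetricResult] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists of dataclasses.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values only; these are not measured results.
run = BenchmarkRun(
    timestamp="1970-01-01T00:00:00Z",
    tool_versions={"gdal": "placeholder"},
    dataset="example-dataset",
    format="cog",
    params={},
    object_sizes=ObjectSizeProfile(
        percentiles={"p50": 0.0},
        histogram={"bucket-0": 0},
        tier_fitness={"tier-0": 0.0},
    ),
    metrics=[MetricResult(name="write", value=0.0, unit="s")],
)
payload = json.loads(run.to_json())
```

Keeping the object-size profile as its own typed field, rather than folding it into the metric list, mirrors its "first-class" status in the description.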
Storage: one provider, or two (source ≠ sink)
S3 settings are resolved per role (storage.s3_profile):
- sink: reads the bare AWS_* environment (results, and the produced COG that TiTiler serves).
- source: reads SOURCE_AWS_* and falls back to the bare AWS_*.
So a single-provider run (the synthetic path, where source and sink are the same
store) needs no SOURCE_* variables and resolves both roles from the bare AWS_*
environment, while a real run can read its source from one provider and write
its sink to another, each with its own endpoint, credentials, and CA bundle, in
the same process.
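The per-role resolution can be sketched as a pure function over an environment mapping. This is an illustration under assumptions, not the real storage.s3_profile: the function name s3_profile_sketch, the chosen subset of settings, and the per-variable (rather than all-or-nothing) fallback are guesses made for the example.

```python
from typing import Dict, Mapping

# Illustrative subset of settings; the real profile may carry more or others.
_KEYS = ("ACCESS_KEY_ID", "SECRET_ACCESS_KEY", "S3_ENDPOINT", "CA_BUNDLE")


def s3_profile_sketch(role: str, env: Mapping[str, str]) -> Dict[str, str]:
    """Resolve S3 settings for 'source' or 'sink' from an environment mapping."""
    if role not in ("source", "sink"):
        raise ValueError(f"unknown role: {role}")
    profile: Dict[str, str] = {}
    for key in _KEYS:
        bare = env.get(f"AWS_{key}")
        if role == "source":
            # Assumed per-variable fallback: SOURCE_AWS_<KEY>, else AWS_<KEY>.
            value = env.get(f"SOURCE_AWS_{key}", bare)
        else:
            value = bare
        if value is not None:
            profile[key] = value
    return profile


# Made-up endpoints and keys for demonstration.
env = {
    "AWS_ACCESS_KEY_ID": "sink-key",
    "AWS_S3_ENDPOINT": "https://sink.example",
    "SOURCE_AWS_S3_ENDPOINT": "https://source.example",
}
source_profile = s3_profile_sketch("source", env)
sink_profile = s3_profile_sketch("sink", env)
```

Passing the environment in as a mapping, instead of reading os.environ directly, keeps the function testable without touching process state.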
flowchart LR
subgraph Run["one run"]
Runner["runner"]
end
Datalake[("source<br/>private-CA, read-only")]
Scaleway[("sink<br/>read-write")]
TiTiler["TiTiler"]
Runner -- "SOURCE_AWS_* · gdal_session(source)" --> Datalake
Runner -- "AWS_* · gdal_session(sink)" --> Scaleway
TiTiler --> Scaleway
GDAL's /vsis3 configuration is process-global, so per-role config is scoped
with a rasterio.Env context manager (gdal_env.gdal_session) rather than one
static environment. Only the runner spans two providers — TiTiler only ever
reads the sink.
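The scoping pattern can be illustrated without GDAL. The sketch below deliberately swaps rasterio.Env for a plain os.environ override, so it shows only the idea (apply one role's settings for the duration of a block, then restore the previous state) and not how gdal_env.gdal_session is actually written. The helper name scoped_env and the DEMO_S3_ENDPOINT variable are hypothetical.

```python
import os
from contextlib import contextmanager
from typing import Dict, Iterator, Optional


@contextmanager
def scoped_env(overrides: Dict[str, str]) -> Iterator[None]:
    """Apply overrides to os.environ inside the block, then restore them."""
    saved: Dict[str, Optional[str]] = {
        key: os.environ.get(key) for key in overrides
    }
    os.environ.update(overrides)
    try:
        yield
    finally:
        # Restore even if the block raised, so one role's settings never leak.
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old


# Hypothetical variable name, used so the demo does not touch real settings.
os.environ["DEMO_S3_ENDPOINT"] = "https://sink.example"
with scoped_env({"DEMO_S3_ENDPOINT": "https://source.example"}):
    inside = os.environ["DEMO_S3_ENDPOINT"]
after = os.environ["DEMO_S3_ENDPOINT"]
```

Inside the block the source endpoint is visible; afterwards the sink value is back, which is the property a two-provider run depends on.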
Plug-in seams
- Formats: a FormatAdapter (formats/base.py) converts a baseline to a target and enumerates the produced objects; it is registered by name in FORMATS (registry.py). The runner resolves the adapter named in the config. Adding a format is a new registered subclass.
- Datasets / benchmarks: described in configs/ and validated by the config.py pydantic schema. Adding one is a new YAML file, picked up by the deployment's ConfigMap (Helm) or a mounted file (compose). See Configuration.
Status & roadmap
| Milestone | State |
|---|---|
| M0–M1 — harness skeleton, config schema, object-size metric, tiers | done |
| M2 — deployable stack (compose + Helm), deployability CI, COG end-to-end (write/read/display), two-provider storage | done |
| Real mission — S1/S2 L2A → COG on the lab cluster | gated on external access (CA bundle + egress), then a deploy-time activity |
| M3 — second dataset/format through config only (e.g. SWOT Lake → GeoParquet) + "add a dataset / new target" docs | next |