Titiler makes use of several great underlying libraries, including GDAL and Python bindings to GDAL. An effective deployment of titiler generally requires tweaking GDAL configuration settings. This document provides an overview of relevant settings. Full documentation is available in GDAL's configuration options reference.
Setting a config variable
GDAL configuration is modified using environment variables, so to change a setting you'll need to set environment variables through your deployment mechanism. For example, to test locally you'd set the variable in bash with export before starting the server.
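A minimal sketch of that local test, assuming a bash-compatible shell. The two settings are taken from the options described in this document; any other GDAL option is exported the same way.

```shell
# Export GDAL options so that any process started from this shell inherits them.
export GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
export GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR

# List the GDAL-related variables that are currently set.
env | grep '^GDAL_'
```

Variables exported this way only last for the current shell session, so start the tile server from the same shell.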
Available configuration settings
GDAL_HTTP_MERGE_CONSECUTIVE_RANGES
When set to YES, this tells GDAL to merge adjacent range requests. Instead of making two requests for byte ranges 1-5 and 6-10, it would make a single request for 1-10. This should always be set to YES.
GDAL_DISABLE_READDIR_ON_OPEN
This is a very important setting to control the number of requests GDAL makes.
This setting has two options of interest: FALSE and EMPTY_DIR. FALSE (the default) causes GDAL to try to establish a list of all the available files in the directory. EMPTY_DIR tells GDAL to imagine that the directory is empty except for the requested file.
When reading datasets with necessary external sidecar files, it's imperative to set FALSE. For example, the landsat-pds bucket on AWS S3 contains GeoTIFF images where overviews are in external .ovr files. If set to EMPTY_DIR, GDAL won't find the .ovr files.
However, in all other cases, it's much better to set EMPTY_DIR because this prevents GDAL from making a LIST request.
This setting also has cost implications for reading data from requester-pays buckets. When set to FALSE, GDAL makes a LIST request every time it opens a file. Since LIST requests are much more expensive than GET requests, this can bring unexpected costs.
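One way to combine both values is to keep EMPTY_DIR as the process-wide default and override it only for jobs that need sidecar files. The sketch below shows the shell mechanics only; the sh -c command is a stand-in for a real GDAL or titiler invocation, and sidecar_value is a hypothetical variable name used for illustration.

```shell
# Default for datasets without sidecar files: skip the directory listing.
export GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR

# Override for a single command that needs external .ovr files.
# The sh -c call is a placeholder for the real command.
sidecar_value=$(GDAL_DISABLE_READDIR_ON_OPEN=FALSE sh -c 'printf "%s" "$GDAL_DISABLE_READDIR_ON_OPEN"')

echo "default: $GDAL_DISABLE_READDIR_ON_OPEN"
echo "sidecar job: $sidecar_value"
```

Because the override is scoped to one command, the cheaper EMPTY_DIR behaviour stays in place for everything else started from the same shell.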
CPL_VSIL_CURL_ALLOWED_EXTENSIONS
A comma-separated list of file extensions that GDAL is allowed to open. For example, if set to .tif, GDAL would only open files with a .tif extension. It would fail on JPEG2000 files with a .jp2 extension, and it also wouldn't open GeoTIFFs exposed through an API endpoint whose URLs don't end in .tif.
Note that you also need to include extensions of external overview files. For example, the landsat-pds bucket on AWS S3 has external overviews in .ovr files, so if you wished to read this data, you'd want to list .ovr alongside the GeoTIFF extension.
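As a sketch, an extension list for such a bucket could look like the following. The exact extensions are an assumption about how the files are named; both lowercase and uppercase GeoTIFF suffixes are listed so that the example does not depend on how GDAL treats case.

```shell
# Allow GeoTIFFs (both spellings) plus external overview files.
export CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".tif,.TIF,.ovr"
echo "$CPL_VSIL_CURL_ALLOWED_EXTENSIONS"
```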
GDAL_INGESTED_BYTES_AT_OPEN
Gives the number of initial bytes GDAL should read when opening a file and inspecting its metadata.
Titiler works best with Cloud-Optimized GeoTIFFs (COGs) because they have a tiled internal structure that supports efficient random reads. These files have an initial metadata section that describes the location (byte range) within the file of each internal tile. The more internal tiles the COG has, the more data the header needs to contain.
GDAL needs to read the entire header before it can read any other portion of the file. By default GDAL reads the first 16KB of the file, then if that doesn't contain the entire metadata, it makes one more request for the rest of the metadata.
In environments where latency is relatively high (at least compared to bandwidth), such as AWS S3, it may be beneficial to increase this value depending on the data you expect to read.
There isn't currently a way to get the number of header bytes using GDAL, but alternative GeoTIFF readers such as aiocogeo can report it. Using its CLI, you can find the image's header size:
    export AWS_REQUEST_PAYER="requester"
    aiocogeo info s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/072/076/LC08_L2SR_072076_20201203_20210313_02_T2/LC08_L2SR_072076_20201203_20210313_02_T2_SR_B1.TIF
    PROFILE
    ...
    Header size: 32770
It's wise to inspect the header sizes of your data sources, and set
GDAL_INGESTED_BYTES_AT_OPEN appropriately. Beware, however, that the given
number of bytes will be read for every image, so you don't want to make the
value too large.
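For the header size reported above (32770 bytes), the default 16 KB read would not cover the whole header. A rough sketch of deriving a value follows; rounding up to a whole number of KiB is an arbitrary choice made here for illustration, not a GDAL requirement, and header_size, block, and ingest_bytes are hypothetical variable names.

```shell
# Header size in bytes, as reported by aiocogeo for the example image.
header_size=32770

# Round up to the next multiple of 1024 bytes (illustrative choice).
block=1024
ingest_bytes=$(( (header_size + block - 1) / block * block ))

export GDAL_INGESTED_BYTES_AT_OPEN="$ingest_bytes"
echo "GDAL_INGESTED_BYTES_AT_OPEN=$GDAL_INGESTED_BYTES_AT_OPEN"
```

If your sources have widely varying header sizes, a value that covers most of them is usually a better trade-off than one that covers the largest.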
GDAL_CACHEMAX
Size of the default GDAL block cache. The value can be given in megabytes, in bytes, or as a percentage of the physical RAM.
Recommended: 200 (200 MB)
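The three forms can be sketched as follows. This relies on GDAL's documented rule that small values (below 100000) are read as megabytes and larger values as bytes; the alternatives are left commented out so that only one assignment is active.

```shell
# Megabytes: small values are interpreted as MB.
export GDAL_CACHEMAX=200

# Alternatives (use one of these instead of the line above):
# export GDAL_CACHEMAX=200000000   # roughly the same size, given in bytes
# export GDAL_CACHEMAX=10%         # percentage of the physical RAM

echo "GDAL_CACHEMAX=$GDAL_CACHEMAX"
```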
CPL_VSIL_CURL_CACHE_SIZE
Size in bytes of a global least-recently-used cache that is shared among all downloaded content and may be reused after a file handle has been closed and reopened.
Recommended: 200000000 (200 MB)
VSI_CACHE
Setting this to TRUE enables GDAL to use an internal caching mechanism.
Recommended (Strongly): TRUE.
VSI_CACHE_SIZE
The size of the above VSI cache in bytes, per file handle. If you open a VRT with 10 files and your VSI_CACHE_SIZE is 10 bytes, the total cache memory usage would be 100 bytes. The cache is RAM-based, and its content is discarded when the file handle is closed.
Recommended: 5000000 (5 MB per file handle)
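The same arithmetic applies to the recommended value. A small sketch, assuming a VRT that references 10 files (the file count and the open_files and total_bytes names are made up for illustration):

```shell
export VSI_CACHE=TRUE
export VSI_CACHE_SIZE=5000000

# Hypothetical number of files opened at once, e.g. through one VRT.
open_files=10

# Each file handle gets its own cache, so usage scales with open handles.
total_bytes=$(( open_files * VSI_CACHE_SIZE ))
echo "Worst-case VSI cache usage: $total_bytes bytes"
```

Multiply this again by the number of worker processes when sizing memory for a deployment, since each process holds its own file handles.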
GDAL_BAND_BLOCK_CACHE
GDAL block cache type: ARRAY or HASHSET. See gdal.org/development/rfc/rfc26_blockcache.html
PROJ_NETWORK
Available with GDAL 3 and PROJ 7 or later. When set to ON, the PROJ library can fetch more precise transformation grids hosted on the cloud.
GDAL_HTTP_MULTIPLEX
When set to YES, this attempts to download multiple range requests in parallel, reusing the same TCP connection. Note this is only possible when the server supports HTTP/2, which many servers don't yet support. There's no downside to setting YES here.
GDAL_DATA
The GDAL_DATA variable tells rasterio/GDAL where GDAL's supporting data files are installed. When using rasterio wheels, GDAL_DATA must be unset.
PROJ_LIB
The PROJ_LIB variable tells rasterio/GDAL where PROJ's data files are installed. When using rasterio wheels, PROJ_LIB must be unset.
Recommended Configuration for dynamic tiling
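CPL_VSIL_CURL_ALLOWED_EXTENSIONS=.tif
In addition to GDAL_DISABLE_READDIR_ON_OPEN, we set the allowed extensions to .tif to only enable tif files. (OPTIONAL)
GDAL_CACHEMAX=200
200 MB cache.
CPL_VSIL_CURL_CACHE_SIZE=200000000
200 MB VSI cache.
GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
Maybe the most important variable. Setting it to EMPTY_DIR reduces the number of GET/LIST requests.
GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
Tells GDAL to merge consecutive range GET requests.
GDAL_HTTP_MULTIPLEX=YES and GDAL_HTTP_VERSION=2
Both GDAL_HTTP_MULTIPLEX and GDAL_HTTP_VERSION only have an impact if the files are stored in an environment that supports HTTP/2 (e.g. CloudFront).
VSI_CACHE=TRUE and VSI_CACHE_SIZE=5000000
5 MB cache per file handle.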
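Put together, the recommendations above could be exported as follows. This is a sketch rather than an official configuration: it only includes settings that this document gives a value for, and GDAL_HTTP_VERSION=2 is inferred from the HTTP/2 note above.

```shell
# Optional: restrict remote reads to .tif files.
export CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".tif"

# Caches: 200 MB block cache, 200 MB global curl cache, 5 MB per file handle.
export GDAL_CACHEMAX=200
export CPL_VSIL_CURL_CACHE_SIZE=200000000
export VSI_CACHE=TRUE
export VSI_CACHE_SIZE=5000000

# Request behaviour: no directory listing, merged ranges, HTTP/2 multiplexing.
export GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
export GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
export GDAL_HTTP_MULTIPLEX=YES
export GDAL_HTTP_VERSION=2
```

Drop the extension filter, or extend it, if your data uses other suffixes or external overview files.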