xarray backend: MUR SST¶
The MUR SST dataset provides daily records of sea surface temperature and ice cover fraction, with one NetCDF file per record.
To run the titiler-cmr service locally, you can start the Docker network with this command:
docker compose up
Requirements¶
To run some of the cells in this notebook you will need to install a few packages: earthaccess, folium, httpx, and xarray.
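If any of them are missing, a minimal install from within the notebook could look like this (assuming a pip-based environment):
%pip install earthaccess folium httpx xarray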
import json
from datetime import datetime, timezone
import earthaccess
import httpx
from folium import Map, TileLayer
# titiler_endpoint = "http://localhost:8081" # docker network endpoint
# titiler_endpoint = (
# "https://staging.openveda.cloud/api/titiler-cmr" # VEDA staging endpoint
# )
titiler_endpoint = (
"https://v4jec6i5c0.execute-api.us-west-2.amazonaws.com" # dev endpoint
)
Identify the dataset¶
You can find the MUR SST dataset using the earthaccess.search_datasets function.
datasets = earthaccess.search_datasets(doi="10.5067/GHGMR-4FJ04")
ds = datasets[0]
concept_id = ds["meta"]["concept-id"]
print("Concept-Id: ", concept_id)
print("Abstract: ", ds["umm"]["Abstract"])
Concept-Id: C1996881146-POCLOUD Abstract: A Group for High Resolution Sea Surface Temperature (GHRSST) Level 4 sea surface temperature analysis produced as a retrospective dataset (four day latency) and near-real-time dataset (one day latency) at the JPL Physical Oceanography DAAC using wavelets as basis functions in an optimal interpolation approach on a global 0.01 degree grid. The version 4 Multiscale Ultrahigh Resolution (MUR) L4 analysis is based upon nighttime GHRSST L2P skin and subskin SST observations from several instruments including the NASA Advanced Microwave Scanning Radiometer-EOS (AMSR-E), the JAXA Advanced Microwave Scanning Radiometer 2 on GCOM-W1, the Moderate Resolution Imaging Spectroradiometers (MODIS) on the NASA Aqua and Terra platforms, the US Navy microwave WindSat radiometer, the Advanced Very High Resolution Radiometer (AVHRR) on several NOAA satellites, and in situ SST observations from the NOAA iQuam project. The ice concentration data are from the archives at the EUMETSAT Ocean and Sea Ice Satellite Application Facility (OSI SAF) High Latitude Processing Center and are also used for an improved SST parameterization for the high-latitudes. The dataset also contains additional variables for some granules including a SST anomaly derived from a MUR climatology and the temporal distance to the nearest IR measurement for each pixel.This dataset is funded by the NASA MEaSUREs program ( http://earthdata.nasa.gov/our-community/community-data-system-programs/measures-projects ), and created by a team led by Dr. Toshio M. Chin from JPL. It adheres to the GHRSST Data Processing Specification (GDS) version 2 format specifications. Use the file global metadata "history:" attribute to determine if a granule is near-realtime or retrospective.
Explore the collection using the /compatibility endpoint¶
The /compatibility endpoint will display information about the collection and return some details about a sample granule. The output is helpful for understanding the structure of the collection and the granules so that you can craft the right set of parameters for visualization or statistics requests.
compatibility_response = httpx.get(
f"{titiler_endpoint}/compatibility",
params={"concept_id": concept_id},
timeout=None,
).json()
print(json.dumps(compatibility_response, indent=2))
{
"concept_id": "C1996881146-POCLOUD",
"backend": "xarray",
"datetime": [
{
"RangeDateTimes": [
{
"BeginningDateTime": "2002-05-31T21:00:00.000Z"
}
]
}
],
"variables": {
"analysed_sst": {
"shape": [
1,
17999,
36000
],
"dtype": "float64",
"min": 271.34999999999997,
"max": 305.929,
"mean": 287.46900962686567,
"p01": 271.34999999999997,
"p05": 271.34999999999997,
"p95": 302.62,
"p99": 303.52200999999997
},
"analysis_error": {
"shape": [
1,
17999,
36000
],
"dtype": "float64",
"min": 0.34,
"max": 0.42,
"mean": 0.37725259948401224,
"p01": 0.34,
"p05": 0.35000000000000003,
"p95": 0.4,
"p99": 0.41000000000000003
},
"mask": {
"shape": [
1,
17999,
36000
],
"dtype": "float32",
"min": 1.0,
"max": 13.0,
"mean": 2.695591688156128,
"p01": 1.0,
"p05": 1.0,
"p95": 9.0,
"p99": 9.0
},
"sea_ice_fraction": {
"shape": [
1,
17999,
36000
],
"dtype": "float64",
"min": 0.0,
"max": 1.0,
"mean": 0.4045281873683459,
"p01": 0.0,
"p05": 0.0,
"p95": 1.0,
"p99": 1.0
}
},
"dimensions": {
"time": 1,
"lat": 17999,
"lon": 36000
},
"coordinates": {
"time": {
"size": 1,
"dtype": "datetime64[ns]"
},
"lat": {
"size": 17999,
"dtype": "float32",
"min": -89.98999786376953,
"max": 89.98999786376953
},
"lon": {
"size": 36000,
"dtype": "float32",
"min": -179.99000549316406,
"max": 180.0
}
},
"example_assets": "s3://podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/20020601090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc"
}
The details from the sample granule show that it is a NetCDF file with four variables (analysed_sst, analysis_error, mask, and sea_ice_fraction), each containing an array with a single time coordinate. The datetime key shows the temporal range reported by CMR, which indicates that the dataset has granules from 2002-05-31 to the present. For each variable, several summary statistics are available to help you choose min/max values for the rescale parameter.
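For example, you can derive a rescale range for a variable directly from the reported percentiles. The snippet below is a small sketch (p05/p95 is just one reasonable choice) that reuses the compatibility_response dictionary from the request above:
# Build a "min,max" rescale string from the reported 5th/95th percentiles
sst_stats = compatibility_response["variables"]["analysed_sst"]
rescale = f"{sst_stats['p05']},{sst_stats['p95']}"
print(rescale)  # roughly "271.35,302.62" for this sample granule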
Define a query for titiler-cmr¶
To use titiler-cmr's endpoints for a NetCDF dataset like this, we need to define a datetime (or date range) for the CMR query and a variable to analyze.
variable = "sea_ice_fraction"
datetime_ = datetime(2024, 10, 10, tzinfo=timezone.utc).isoformat()
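The datetime parameter also accepts an interval in the form start_date/end_date (see the request comments below). A small sketch of building such a range with the same ISO 8601 formatting:
# Optional: a start_date/end_date interval for the `datetime` parameter
start = datetime(2024, 10, 10, tzinfo=timezone.utc).isoformat()
end = datetime(2024, 10, 11, tzinfo=timezone.utc).isoformat()
datetime_range = f"{start}/{end}"
print(datetime_range)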
Display tiles in an interactive map¶
The /tilejson.json endpoint will provide a parameterized xyz tile URL that can be added to an interactive map.
r = httpx.get(
    f"{titiler_endpoint}/WebMercatorQuad/tilejson.json",
    params=(
        ("concept_id", concept_id),
        # Datetime in form of `start_date/end_date`
        ("datetime", datetime_),
        # titiler-cmr can work with both Zarr and COG datasets,
        # but we need to tell the endpoint in advance which backend to use
        ("backend", "xarray"),
        ("variable", variable),
        # Set min/max zoom because low zoom levels (e.g. 0) would result in
        # very large, wasteful queries
        ("minzoom", 2),
        ("maxzoom", 13),
        ("rescale", "0,1"),
        ("colormap_name", "blues_r"),
    ),
    timeout=None,
).json()
print(r)
{'tilejson': '2.2.0', 'version': '1.0.0', 'scheme': 'xyz', 'tiles': ['https://v4jec6i5c0.execute-api.us-west-2.amazonaws.com/tiles/WebMercatorQuad/{z}/{x}/{y}@1x?concept_id=C1996881146-POCLOUD&datetime=2024-10-10T00%3A00%3A00%2B00%3A00&backend=xarray&variable=sea_ice_fraction&rescale=0%2C1&colormap_name=blues_r'], 'minzoom': 2, 'maxzoom': 13, 'bounds': [-180.0, -90.0, 180.0, 90.0], 'center': [0.0, 0.0, 2]}
bounds = r["bounds"]
m = Map(location=(70, -40), zoom_start=3)
TileLayer(
tiles=r["tiles"][0],
opacity=1,
attr="NASA",
).add_to(m)
m
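The tilejson response also includes the layer bounds, so instead of hard-coding a view you could fit the map to them. A small sketch using folium's fit_bounds, which expects [[south, west], [north, east]]:
# Fit the map view to the bounds reported by the tilejson endpoint
west, south, east, north = bounds
m.fit_bounds([[south, west], [north, east]])
m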
GeoJSON Statistics¶
The /statistics endpoint can be used to get summary statistics for a geojson Feature or FeatureCollection.
geojson_dict = {
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {},
"geometry": {
"coordinates": [
[
[-20.79973248834736, 83.55979308678764],
[-20.79973248834736, 75.0115425216471],
[14.483337068956956, 75.0115425216471],
[14.483337068956956, 83.55979308678764],
[-20.79973248834736, 83.55979308678764],
]
],
"type": "Polygon",
},
}
],
}
r = httpx.post(
    f"{titiler_endpoint}/statistics",
    params=(
        ("concept_id", concept_id),
        # Datetime in form of `start_date/end_date`
        ("datetime", datetime_),
        # titiler-cmr can work with both Zarr and COG datasets,
        # but we need to tell the endpoint in advance which backend to use
        ("backend", "xarray"),
        ("variable", variable),
    ),
    json=geojson_dict,
    timeout=None,
).json()
print(json.dumps(r, indent=2))
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-20.79973248834736,
83.55979308678764
],
[
-20.79973248834736,
75.0115425216471
],
[
14.483337068956956,
75.0115425216471
],
[
14.483337068956956,
83.55979308678764
],
[
-20.79973248834736,
83.55979308678764
]
]
]
},
"properties": {
"statistics": {
"2024-10-10T09:00:00.000000000": {
"min": 0.3,
"max": 0.99,
"mean": 0.845157064600111,
"count": 1725290.875,
"sum": 1458141.771496357,
"std": 0.1559272507275522,
"median": 0.9,
"majority": 0.9500000000000001,
"minority": 0.36,
"unique": 70.0,
"histogram": [
[
34892,
39574,
38696,
37867,
44348,
72817,
110580,
200188,
472678,
675707
],
[
0.3,
0.369,
0.43799999999999994,
0.5069999999999999,
0.576,
0.645,
0.714,
0.7829999999999999,
0.8519999999999999,
0.9209999999999998,
0.99
]
],
"valid_percent": 57.18,
"masked_pixels": 1293477.0,
"valid_pixels": 1727347.0,
"percentile_2": 0.36,
"percentile_98": 0.99
}
}
}
}
]
}
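The statistics are keyed by the granule's timestamp inside each feature's properties. A small sketch of pulling a summary value out of the response above:
# Extract the per-timestamp statistics from the first feature
stats = r["features"][0]["properties"]["statistics"]
for timestamp, values in stats.items():
    print(timestamp, "mean sea ice fraction:", round(values["mean"], 3))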
Datetime string interpolation with the sel parameter¶
Datasets with extra dimensions beyond x and y (e.g. multiple time steps or vertical levels within a single granule) require the sel parameter to pick a particular slice along each of those dimensions. Here is an example that shows how to get statistics for a single time slice of a granule from the TROPESS O3 dataset. This dataset has annual granules, each with dimensions for time (monthly) and lev.
For this dataset, any query datetime within a given year will return the same granule. If {datetime} is present in a sel query parameter value, titiler-cmr interpolates the datetime query parameter into that string, so "time={datetime}" selects the time slice matching the request's datetime.
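Conceptually, those query parameters behave like an xarray .sel call on the opened granule. The following is only a rough, self-contained illustration with made-up data (the ds dataset and its dimensions are hypothetical), not titiler-cmr's actual implementation:
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical miniature dataset with time and lev dimensions, mimicking a
# TROPESS-style granule with monthly time steps; values are random placeholders
ds = xr.Dataset(
    {"o3": (("time", "lev", "lat", "lon"), np.random.rand(12, 3, 4, 4))},
    coords={
        "time": pd.date_range("2021-01-01", periods=12, freq="MS"),
        "lev": [1000.0, 500.0, 100.0],
        "lat": np.linspace(-45.0, 45.0, 4),
        "lon": np.linspace(-90.0, 90.0, 4),
    },
)

# sel=time={datetime}, sel=lev=1000, sel_method=nearest roughly corresponds to:
subset = ds["o3"].sel(time=pd.Timestamp("2021-10-10"), lev=1000, method="nearest")
print(subset.time.values)  # nearest monthly slice, i.e. 2021-10-01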
geojson_dict = {
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {},
"geometry": {
"coordinates": [
[
[-20.79973248834736, 83.55979308678764],
[-20.79973248834736, 75.0115425216471],
[14.483337068956956, 75.0115425216471],
[14.483337068956956, 83.55979308678764],
[-20.79973248834736, 83.55979308678764],
]
],
"type": "Polygon",
},
}
],
}
r = httpx.post(
    f"{titiler_endpoint}/statistics",
    params=(
        ("concept_id", "C2837626477-GES_DISC"),
        # Datetime for the CMR granule query
        ("datetime", datetime(2021, 10, 10, tzinfo=timezone.utc).isoformat()),
        # xarray backend query parameters
        ("backend", "xarray"),
        ("variable", "o3"),
        ("sel", "time={datetime}"),  # {datetime} is interpolated from the datetime parameter
        ("sel", "lev=1000"),
        ("sel_method", "nearest"),
    ),
    json=geojson_dict,
    timeout=None,
).json()
print(json.dumps(r, indent=2))
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-20.79973248834736,
83.55979308678764
],
[
-20.79973248834736,
75.0115425216471
],
[
14.483337068956956,
75.0115425216471
],
[
14.483337068956956,
83.55979308678764
],
[
-20.79973248834736,
83.55979308678764
]
]
]
},
"properties": {
"statistics": {
"2021-10-01T00:00:00.000000000": {
"min": 12.448402404785156,
"max": 30.805774688720703,
"mean": 25.221195220947266,
"count": 232.1199951171875,
"sum": 5854.34375,
"std": 3.496363112659205,
"median": 25.481826782226562,
"majority": 12.448402404785156,
"minority": 12.448402404785156,
"unique": 273.0,
"histogram": [
[
2,
1,
0,
7,
36,
38,
45,
46,
40,
58
],
[
12.448402404785156,
14.284139633178711,
16.119876861572266,
17.95561408996582,
19.791351318359375,
21.62708854675293,
23.462825775146484,
25.29856300354004,
27.134300231933594,
28.97003746032715,
30.805774688720703
]
],
"valid_percent": 94.79,
"masked_pixels": 15.0,
"valid_pixels": 273.0,
"percentile_2": 18.29606056213379,
"percentile_98": 30.397979736328125
}
}
}
}
]
}
You can choose a different time slice from the same granule simply by updating the datetime query parameter.
r = httpx.post(
    f"{titiler_endpoint}/statistics",
    params=(
        ("concept_id", "C2837626477-GES_DISC"),
        # Datetime for the CMR granule query
        ("datetime", datetime(2021, 12, 10, tzinfo=timezone.utc).isoformat()),
        # xarray backend query parameters
        ("backend", "xarray"),
        ("variable", "o3"),
        ("sel", "time={datetime}"),  # {datetime} is interpolated from the datetime parameter
        ("sel", "lev=1000"),
        ("sel_method", "nearest"),
    ),
    json=geojson_dict,
    timeout=None,
).json()
print(json.dumps(r, indent=2))
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-20.79973248834736,
83.55979308678764
],
[
-20.79973248834736,
75.0115425216471
],
[
14.483337068956956,
75.0115425216471
],
[
14.483337068956956,
83.55979308678764
],
[
-20.79973248834736,
83.55979308678764
]
]
]
},
"properties": {
"statistics": {
"2021-12-01T00:00:00.000000000": {
"min": 18.230709075927734,
"max": 37.22050476074219,
"mean": 27.73312759399414,
"count": 188.1199951171875,
"sum": 5217.15576171875,
"std": 4.665600495935212,
"median": 28.000490188598633,
"majority": 18.230709075927734,
"minority": 18.230709075927734,
"unique": 225.0,
"histogram": [
[
1,
23,
40,
21,
17,
20,
44,
35,
15,
9
],
[
18.230709075927734,
20.129688262939453,
22.028667449951172,
23.92764663696289,
25.826627731323242,
27.72560691833496,
29.62458610534668,
31.52356719970703,
33.42254638671875,
35.32152557373047,
37.22050476074219
]
],
"valid_percent": 78.12,
"masked_pixels": 63.0,
"valid_pixels": 225.0,
"percentile_2": 20.764251708984375,
"percentile_98": 36.73855972290039
}
}
}
}
]
}