Geospatial reprojection in Python
Work-in-progress guidebook and profiling results (Sept. 2024).
Authors + Credits: Max Jones, Optimized Data Delivery team (especially Aimee), Pangeo Community (especially Justus and Michael)
A bit of background about me and this work
Caveats
- Work in progress!
- Recording will quickly become out of date
- Verify/fix code before use
Definitions
Reprojection - changing the projection of a dataset from one coordinate reference system (CRS) to another
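To make the definition concrete, here is a minimal sketch of reprojecting a single point from longitude/latitude (EPSG:4326) to Web Mercator (EPSG:3857) using the standard spherical Mercator formulas. The function name `lonlat_to_web_mercator` and the sample point are made up for illustration; for real work a library such as pyproj handles arbitrary CRS pairs.

```python
import math

# Radius of the sphere used by Web Mercator (EPSG:3857), in meters.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Reproject one lon/lat point (degrees) to Web Mercator x/y (meters)."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = EARTH_RADIUS_M * lon
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# Hypothetical sample point; the same location, expressed in a different CRS.
x, y = lonlat_to_web_mercator(-105.0, 40.0)
print(f"x = {x:.1f} m, y = {y:.1f} m")
```

Only the coordinates change here. The grid structure of a raster is a separate question, which is where resampling comes in.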
Resampling/regridding - changing the grid structure (often resolution)
Warp resampling - changing the resolution and projection of a dataset
Grid structures
- Rectilinear - described by one-dimensional latitude and longitude coordinates
- Regular - described by a single origin (x, y) coordinate and the resolution
- Curvilinear - described by two-dimensional latitude and longitude coordinates
- Unstructured - described by an explicit list of nodes
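A rough sketch of how the four grid structures differ in the coordinates they need, using NumPy arrays. All values and variable names are invented for illustration, and the `triangles` connectivity array is an assumption that goes slightly beyond the bullet above (many unstructured meshes store it, but not all).

```python
import numpy as np

# Regular: one origin coordinate plus the resolution defines every cell.
x0, y0, dx, dy, nx, ny = 0.0, 0.0, 0.5, 0.5, 4, 3
regular_x = x0 + dx * np.arange(nx)
regular_y = y0 + dy * np.arange(ny)

# Rectilinear: 1-D coordinates per axis; spacing may vary along each axis.
lon_1d = np.array([0.0, 1.0, 3.0, 6.0])
lat_1d = np.array([-10.0, 0.0, 5.0])

# Curvilinear: every cell carries its own position, so coordinates are 2-D.
# Here the regular grid is rotated by 30 degrees to produce such a grid.
xx, yy = np.meshgrid(regular_x, regular_y)
theta = np.deg2rad(30.0)
lon_2d = xx * np.cos(theta) - yy * np.sin(theta)
lat_2d = xx * np.sin(theta) + yy * np.cos(theta)

# Unstructured: an explicit list of nodes (one x, y pair per row).
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0], [1.5, 1.2]])
# Assumed extra: triangles given as indices into `nodes`.
triangles = np.array([[0, 1, 2], [1, 3, 2]])

print(regular_x.shape, lon_1d.shape, lon_2d.shape, nodes.shape)
```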
Resampling algorithms
- Nearest neighbor
- Bilinear
- Cubic
- Spline
- Inverse distance
- Bucket / binning (average, min, max, mode, median, quartile, sum, rms)
- Spectral
- Triangulation
- Conservative
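To show how much the choice of algorithm matters, here is a small sketch that coarsens a 4x4 array by a factor of 2 in three ways: picking one source value per target cell (nearest-neighbor style), bucket average, and bucket max. The array and variable names are made up, and the grids are assumed to be aligned so that each target cell covers exactly one 2x2 block of source cells.

```python
import numpy as np

src = np.arange(16, dtype=float).reshape(4, 4)

# Nearest neighbor: keep a single source value per target cell.
# With aligned 2x coarsening, all four source cells are equally near the
# target cell center, so this sketch simply takes the upper-left one.
nearest = src[::2, ::2]

# Bucket / binning: group source cells into 2x2 blocks, then reduce each block.
blocks = src.reshape(2, 2, 2, 2)  # (target row, row in block, target col, col in block)
bucket_mean = blocks.mean(axis=(1, 3))
bucket_max = blocks.max(axis=(1, 3))

print(nearest)
print(bucket_mean)
print(bucket_max)
```

The three results differ even on this tiny input, which is one reason to record which algorithm was used when co-registering datasets.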
Some of the many reasons to warp resample
Co-registering datasets
- Mosaicing
- Statistical analyses
- Machine learning
Visualization
- Rendering (minimize distortion)
- Building overviews
Observations and opinions
- Lots of kernels were killed in the making of this presentation
  - We need a demo using a bounded-memory approach (Cubed!)
- There are some awesome data cube libraries in Python
  - Let’s work with the developers to make them even better…and not build another one
- Xarray’s data model is intuitive for a lot of people
  - Use accessors to extend its functionality rather than creating a new data class
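As a sketch of the accessor idea, the snippet below registers a small helper on `xarray.DataArray` with `xr.register_dataarray_accessor` instead of defining a new data class. The accessor name `demo_grid`, the class `DemoGridAccessor`, and its `resolution` method are hypothetical and exist only for this example.

```python
import numpy as np
import xarray as xr


@xr.register_dataarray_accessor("demo_grid")  # hypothetical accessor name
class DemoGridAccessor:
    """Adds grid helpers to any DataArray without subclassing it."""

    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def resolution(self, dim):
        """Return the median spacing along a 1-D coordinate."""
        coord = self._obj[dim].values
        return float(np.median(np.diff(coord)))


da = xr.DataArray(
    np.zeros((3, 4)),
    dims=("y", "x"),
    coords={"y": [0.0, 0.5, 1.0], "x": [10.0, 11.0, 12.0, 13.0]},
)

# The helper is available on every DataArray under the registered name.
print(da.demo_grid.resolution("x"), da.demo_grid.resolution("y"))
```

The data stays an ordinary `DataArray`, so it keeps working with the rest of the Xarray ecosystem.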
What’s next for the guide
- Try caching weights
- Small tile from a large dataset
- Add information about grid structures supported
- Add information about resampling methods supported
- Test with virtualized data
- Test with cloud optimized data
- Test with other resampling algorithms
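On the "caching weights" item: the interpolation weights depend only on the source and target grids, so they can be computed once and reused for every time step or variable that shares those grids. The sketch below shows the idea for 1-D linear interpolation with a dense weight matrix. The helper `linear_weights` and all values are made up for illustration; real regridding libraries typically use sparse matrices and 2-D grids.

```python
import numpy as np


def linear_weights(src_x, dst_x):
    """Build an (n_dst, n_src) matrix of 1-D linear interpolation weights."""
    weights = np.zeros((dst_x.size, src_x.size))
    # Index of the source point to the left of each target point.
    left = np.searchsorted(src_x, dst_x, side="right") - 1
    left = np.clip(left, 0, src_x.size - 2)
    frac = (dst_x - src_x[left]) / (src_x[left + 1] - src_x[left])
    rows = np.arange(dst_x.size)
    weights[rows, left] = 1.0 - frac
    weights[rows, left + 1] = frac
    return weights


src_x = np.array([0.0, 1.0, 2.0, 3.0])
dst_x = np.array([0.5, 1.5, 2.5])

# Compute the weights once (this is the part worth caching)...
weights = linear_weights(src_x, dst_x)

# ...then apply them to every time step with a single matrix product.
time_steps = np.array([[0.0, 10.0, 20.0, 30.0], [1.0, 3.0, 5.0, 7.0]])
resampled = time_steps @ weights.T
print(resampled)
```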
Thanks
- Development Seed
- Pangeo Community
  - Special thanks to Justus, Michael, and Deepak
- NASA IMPACT
What’s next for resampling in Python
Let’s discuss!