Geospatial Python
An overview of Python methods for geospatial data relevant to doing machine learning with satellite data.
The following material covers the basics of using spatial data in Python. The main goal is to become familiar with the libraries used and to try a few examples of operations with vector and raster data, including some basic visualizations.
Vector Data
Note: A GeoDataFrame is a pandas DataFrame with a geometry column (a GeoSeries).
- How to load and save spatial data with GeoPandas
- General Data Manipulation (GeoPandas)
- Subsetting by Attributes: to select records based on attributes, use the techniques from pandas
- Projections
- Intersects
- Spatial Join
- Spatial Aggregation
- Derive Centroids
- Bounding Box
  - For each row in a GeoDataFrame, use GeoSeries.bounds if you want to extract the coordinates.
  - For each row in a GeoDataFrame, use GeoSeries.envelope if you want bounding-box geometries (a GeoSeries) that you can do spatial operations with.
  - For a whole GeoDataFrame, use GeoSeries.total_bounds.
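import geopandas

# Natural Earth 1:110m country polygons (GeoJSON)
url = "https://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_admin_0_countries.geojson"
countries_gdf = geopandas.read_file(url)
print(countries_gdf.head())

# Bounding box of the whole GeoDataFrame: array of (minx, miny, maxx, maxy)
print(countries_gdf.total_bounds)
# Per-row bounds: a DataFrame with columns minx, miny, maxx, maxy
print(countries_gdf.head().bounds)
# Per-row bounding boxes as geometries: a GeoSeries of polygons
print(countries_gdf.head().envelope)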
Making Maps
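As a minimal sketch of map making, a GeoDataFrame can be drawn with its `plot` method, which relies on matplotlib. The toy `regions` data, the `population` column, and the output file name are hypothetical, and the non-interactive `Agg` backend is chosen only so the example runs without a display.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs

import geopandas
from shapely.geometry import box

# Hypothetical toy data: two square regions with a numeric attribute.
regions = geopandas.GeoDataFrame(
    {"region": ["west", "east"], "population": [100, 250]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# Choropleth: colour each polygon by the "population" column.
ax = regions.plot(column="population", legend=True, edgecolor="black", figsize=(6, 3))
ax.set_title("Population by region (toy data)")

# Save the figure to a temporary PNG file.
map_path = os.path.join(tempfile.mkdtemp(), "regions_map.png")
ax.figure.savefig(map_path, dpi=100)
```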
Optional Bonus Material
- Advanced Vector Input/Output (I/O) with Fiona
- Using Spatial Indexes for faster spatial operations
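The idea behind a spatial index can be sketched as a two-step filter: a cheap bounding-box lookup narrows the candidates, then the exact geometric test runs only on those candidates. The example below is one possible illustration using the `sindex` attribute of a GeoSeries. The grid of points and the query rectangle are hypothetical, and `sindex.query` assumes a GeoPandas version that provides it.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical data: a 10 x 10 grid of points and a rectangular query area.
points = geopandas.GeoSeries([Point(x, y) for x in range(10) for y in range(10)])
query_area = box(1.5, 1.5, 4.5, 3.5)

# Step 1: bounding-box lookup in the spatial index (built lazily on first use).
# The result is an array of positional indexes into `points`.
candidate_idx = points.sindex.query(query_area)
candidates = points.iloc[candidate_idx]

# Step 2: exact geometric test, run only on the candidates.
matches = candidates[candidates.intersects(query_area)]

# Brute-force equivalent, for comparison: test every point.
brute_force = points[points.intersects(query_area)]
```

On a dataset this small the two approaches take about the same time. The index pays off when the layer is large and most features fall outside the query area.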
Raster
- How to load and save data
- Rasterio (Reading)
- Rasterio (Writing). The recommended default output format is a Cloud Optimized GeoTIFF (COG), as shown with the rio-cogeo library.
- NumPy Arrays (Rasters)
- Clipping raster by AOI
- Band Math (aka Map Algebra)
- Sampling (extracting) data from a raster with a vector