How it works
Lonboard is built on four foundational technologies: deck.gl, GeoArrow, GeoParquet, and anywidget.
- deck.gl is a JavaScript geospatial data visualization library. Because deck.gl uses the GPU in your computer to render data, it's capable of performantly rendering very large quantities of data.
- GeoArrow is a memory format for efficiently representing geospatial vector data. As a memory format, GeoArrow is not compressed and can be used directly.
- GeoParquet is a file format for efficiently encoding and decoding geospatial vector data. As a file format, GeoParquet supports very efficient compression, but it must be parsed before it can be used. [1]
- anywidget is a framework for building custom Jupyter widgets that makes the process much easier than before.
How is it so fast?
Lonboard is so fast because it moves data from Python to JavaScript (in your browser) and then from JavaScript to your Graphics Processing Unit (GPU) more efficiently than ever before.
Other Python libraries for interactive maps exist (such as ipyleaflet), and even existing bindings to deck.gl exist (such as pydeck). But those libraries encode data as GeoJSON to copy from Python to the browser. GeoJSON is extremely slow to read and write and results in a very large data file that has to be copied to the browser.
With Lonboard, the entire pipeline is binary. In Python, GeoPandas to GeoArrow to GeoParquet avoids a text encoding like GeoJSON and results in a compressed binary buffer that can be efficiently copied to the browser. In JavaScript, GeoParquet to GeoArrow offers efficient decoding (in WebAssembly). Then deck.gl is able to interpret the GeoArrow table directly without any parsing (thanks to @geoarrow/deck.gl-layers).
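The size gap between a text encoding and a binary buffer can be illustrated with a small, standard-library-only sketch. This is not Lonboard's actual code path (which goes through GeoArrow and GeoParquet); the random sample points and the variable names are assumptions made purely for illustration.

```python
import json
import random
import struct

random.seed(0)  # make the illustration reproducible

# Hypothetical sample data: 10,000 random lon/lat points.
points = [
    (random.uniform(-180, 180), random.uniform(-90, 90))
    for _ in range(10_000)
]

# Text encoding: a GeoJSON FeatureCollection of Point features.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        }
        for lon, lat in points
    ],
}
text_payload = json.dumps(geojson).encode("utf-8")

# Binary encoding: every coordinate packed as a little-endian float64,
# so each point costs exactly 16 bytes.
flat = [value for point in points for value in point]
binary_payload = struct.pack(f"<{len(flat)}d", *flat)

# Reading the binary buffer back is a fixed-width unpack, not a text parse.
decoded = struct.unpack(f"<{len(flat)}d", binary_payload)

print(f"GeoJSON text: {len(text_payload):,} bytes")
print(f"Raw float64:  {len(binary_payload):,} bytes")
```

The binary buffer here is uncompressed; a columnar file format such as GeoParquet can additionally compress its buffers before they are sent to the browser.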
GeoPandas is the primary interface for users to add data, allowing Lonboard to internally manage the conversion to GeoArrow and its transport to the browser for rendering.
Lonboard's goal is to abstract the technical bits of representing and moving data so it can attain its dual goals of performance and ease of use for a vast audience.
[1] For subtle technical reasons, Lonboard's internal data transfer doesn't match the exact GeoParquet specification. Lonboard uses the highly-efficient GeoArrow encoding inside of GeoParquet instead of storing geometries as Well-Known Binary (WKB). While the GeoParquet 1.1 spec does support a "native" GeoArrow-like encoding, note that GeoArrow defines two coordinate layouts: "separated" and "interleaved". Only "separated" is allowed in the GeoParquet spec because only the "separated" layout generates useful column statistics to be used for cloud-native spatial queries. However, deck.gl expects the "interleaved" layout. So Lonboard prepares Arrow data in that exact format to avoid an extra memory copy on the JavaScript side before uploading to the GPU.
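The two coordinate layouts described in this note can be sketched with plain Python arrays. This is only a conceptual illustration under assumed sample data; the helper `interleave` is hypothetical and is not part of Lonboard, GeoArrow, or deck.gl.

```python
from array import array

# Hypothetical sample: three points as (x, y) pairs.
points = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]

# "Separated" layout: one contiguous buffer per dimension.
xs = array("d", (x for x, _ in points))
ys = array("d", (y for _, y in points))

# Per-column min/max statistics directly describe a bounding box,
# which is what makes this layout useful for spatial filtering.
bbox = (min(xs), min(ys), max(xs), max(ys))

# "Interleaved" layout: x0, y0, x1, y1, ... in a single buffer.
interleaved = array("d", (value for point in points for value in point))


def interleave(xs: array, ys: array) -> array:
    """Build an interleaved buffer from separated ones.

    This allocates and fills a new buffer, which is the kind of extra
    copy that preparing the interleaved layout up front avoids.
    """
    out = array("d", [0.0]) * (2 * len(xs))
    out[0::2] = xs  # even slots hold x values
    out[1::2] = ys  # odd slots hold y values
    return out


print(bbox)
print(list(interleave(xs, ys)))
```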