We're working on adding fast interactivity to data-heavy maps. You can see this working on two historical mapping sites we launched with the National Democratic Institute for the 2004 and 2005 elections in Afghanistan. In short, it's now possible to hover over map tiles and see the specific data they contain.
This kind of interaction - which provides hovering and clicking behaviors to otherwise flat raster map tiles - is now possible without using Flash while still supporting older browsers like Internet Explorer. This is the result of a great deal of work involving raw performance, data interoperability, and compatibility across systems - taking advantage of the OpenLayers mapping library and the Mapnik map rendering toolkit.
How this works
Drupal and other content management systems can manage numeric data within their own structures. In the case of the Afghanistan election maps, the Data module contains the tabular data that eventually colors the map. When each map is initially viewed, data from Drupal is pulled onto the TileLive map server hosted by MapBox.
TileLive draws tiles that are not yet cached and directly serves tiles that it has rendered before - making cached map performance extremely high. TileLive also generates "tiles of data" that indicate what features exist where in each tile, and saves them as JSON. Then using a simple collision-detection algorithm, the site can determine which, if any, feature the mouse is hovering over and deliver underlying data for popups and navigation.
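The render-once, serve-from-cache behavior described above can be sketched roughly as follows. This is a minimal illustration, not TileLive's actual code: the directory layout, the `tile_path` and `get_tile` names, and the `render` callback are all assumptions made for the example.

```python
from pathlib import Path
from typing import Callable


def tile_path(cache_dir: Path, layer: str, z: int, x: int, y: int, ext: str = "png") -> Path:
    """Hypothetical on-disk layout: <cache>/<layer>/<z>/<x>/<y>.<ext>."""
    return cache_dir / layer / str(z) / str(x) / f"{y}.{ext}"


def get_tile(cache_dir: Path, layer: str, z: int, x: int, y: int,
             render: Callable[[int, int, int], bytes]) -> bytes:
    """Return a tile, rendering it only if it is not already cached."""
    path = tile_path(cache_dir, layer, z, x, y)
    if path.exists():
        # Cache hit: the renderer is never invoked.
        return path.read_bytes()
    # Cache miss: render once, then store the result for later requests.
    data = render(z, x, y)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data
```

Because the cached file sits at a path that mirrors the request, a plain web server can answer repeat requests from disk without calling the application at all.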
There are a few important notes to the implementation.
Many other interactive maps use Flash, and for good reason - it provides a fast vector-drawing system that behaves fairly consistently across browsers and platforms. Flash also makes polygon interaction relatively inexpensive.
Low Requirements, High Speed
The server component of this interactivity, called TileLive, is designed to scale to thousands of concurrent users, thanks to a few simple aspects of its design:
- Cacheability: After a request for an image or interaction data is filled, the file is stored to a cache. After that request, that file can be served directly from the web server without invoking the application. Currently cached files are served by the minimal and fast server nginx. One of the important elements of this approach is that the JSONP callbacks are made to be predictable on both sides due to perfect hashing. In the future, this means that we can use a proper CDN for interaction data.
- Lazy loading: Interaction data is only loaded on a mouse hover. For cached data, which is rarely larger than 1K, this loading will seem instantaneous. And for users who don't scroll to a map below the fold or who don't use the map, having interactivity available doesn't slow down page loading.
- Data joining: TileLive and the StyleWriter module (more on that below) enable data to be joined together. We are now working to add this into data.worldbank.org, which has thousands of indicators from an external API that can be symbolized in different ways. The traditional model of the map server owning all data in a single place falls flat when large datasets are highly linked. Permitting data to be joined on the fly and map styles to be created outside of the map server fits these difficult use cases much more closely.
As for data, this system is designed for applications that don't necessarily own their own geodata. The assumption that map servers have full access to map data and configurations from startup is dropped in favor of a real-world separation of data, geodata, and configuration. This means that our map servers are not necessarily the same as servers for Drupal - and that a single Drupal site can use multiple map servers for performance. We see MapBox potentially offering this as a service down the road.
Consistent Hashing with JSONP
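The caching notes above mention that JSONP callbacks are predictable on both the client and the server. One way to get that property is to derive the callback name from the tile address alone, so the cached response never depends on a per-request parameter. The sketch below illustrates the idea only; the `grid_` prefix, the SHA-1 digest, and both function names are assumptions, not the scheme TileLive uses.

```python
import hashlib
import json


def callback_name(layer: str, z: int, x: int, y: int) -> str:
    """Derive a callback name from the tile address only (hypothetical scheme)."""
    key = f"{layer}/{z}/{x}/{y}"
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]
    return f"grid_{digest}"


def wrap_jsonp(layer: str, z: int, x: int, y: int, payload: dict) -> str:
    """Wrap interaction data so the cached file is identical for every client."""
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True)
    return f"{callback_name(layer, z, x, y)}({body});"
```

Since the client can compute the same name before requesting the file, the response is a static file that any cache or CDN can serve unchanged.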
Integration with Drupal
Drupal integration happens through the StyleWriter module. Any data source that integrates with the Views module, like the Data module, can be used to power choropleth, point, and scaled point maps via StyleWriter. TileLive and the interaction stack are completely modular, so the Drupal StyleWriter module is only one possible provider of Cascadenik styles to TileLive.
Grids and MetaWriters
Two different kinds of data are returned from this stack: grid data for polygon interaction and GeoJSON for point interaction. These two are generated by different mechanisms and for different reasons.
MetaWriters are a new feature of Mapnik2 developed by Hermann Kraus that enable the output of GeoJSON data along with image data - so that everything Mapnik renders is accompanied by a record of the actual features rendered and where they were placed within a map tile.
In this implementation, the GeoJSON written by Mapnik is immediately wrapped as JSONP. Then GeoJSON is loaded tile-by-tile and stored with each tile object. Thus thousands of points can be scattered across a map, but only those of interest to the user are loaded, and only those within the currently-hovered tile are iterated through at any given time.
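The per-tile point lookup described above can be sketched as follows. This is an illustration under stated assumptions: feature coordinates are taken to be pixel positions within their tile, tiles are 256 pixels square, and the `store_tile` and `feature_at` names and the 8-pixel hit radius are invented for the example.

```python
import math

TILE_SIZE = 256  # assumed tile width and height in pixels

# (z, x, y) -> list of GeoJSON features loaded for that tile
tile_features = {}


def store_tile(z, x, y, feature_collection):
    """Keep the features that arrived with one tile alongside that tile."""
    tile_features[(z, x, y)] = feature_collection.get("features", [])


def feature_at(z, map_px, map_py, radius=8.0):
    """Return the closest point feature within `radius` pixels of the cursor.

    Only the features of the hovered tile are scanned, however many
    points exist on the map as a whole.
    """
    tx, ty = int(map_px // TILE_SIZE), int(map_py // TILE_SIZE)
    local_x = map_px - tx * TILE_SIZE
    local_y = map_py - ty * TILE_SIZE
    best, best_dist = None, radius
    for feature in tile_features.get((z, tx, ty), []):
        fx, fy = feature["geometry"]["coordinates"]
        dist = math.hypot(fx - local_x, fy - local_y)
        if dist <= best_dist:
            best, best_dist = feature, dist
    return best
```

A sketch this small ignores points that sit just across a tile edge from the cursor; a fuller version would also check neighboring tiles.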
Polygon interaction is powered by grids: essentially pixel grids that indicate which polygon is contained in which four-pixel square of an image. These are compressed with run-length encoding and with gzip on the nginx side, making the download size small enough to be generally unnoticeable.
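To show why grids compress so well, here is a minimal run-length encoding sketch followed by gzip. The `[count, value]` run format and the `pack` and `unpack` helpers are assumptions for illustration; the actual wire format is not described here, and in the stack above gzip is applied by the web server rather than by application code.

```python
import gzip
import json


def rle_encode(cells):
    """Collapse consecutive equal cells into [count, value] runs."""
    runs = []
    for value in cells:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1
        else:
            runs.append([1, value])
    return runs


def rle_decode(runs):
    """Expand [count, value] runs back into the flat list of cells."""
    cells = []
    for count, value in runs:
        cells.extend([value] * count)
    return cells


def pack(cells) -> bytes:
    """Run-length encode, serialize as JSON, then gzip."""
    text = json.dumps(rle_encode(cells), separators=(",", ":"))
    return gzip.compress(text.encode("utf-8"))


def unpack(blob: bytes):
    """Reverse of pack()."""
    return rle_decode(json.loads(gzip.decompress(blob).decode("utf-8")))
```

Polygons cover long horizontal stretches of a tile, so most rows reduce to a handful of runs before gzip is even applied.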
The other benefit of grids for polygons is that interaction can be extremely fast: once a grid is associated with a tile, determining the specific district or country the user is hovering over can be done in a single calculation with no loops - making it seem instant.
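The loop-free lookup can be sketched as a single index calculation. The sketch assumes a decoded grid stored as a flat, row-major list, 256-pixel tiles, and cells four pixels on a side (so 64 cells per row); the `keys` list, which maps grid values to feature identifiers, is a hypothetical structure for the example.

```python
CELL = 4                  # each grid cell covers a 4 x 4 pixel square (assumed)
TILE_SIZE = 256           # assumed tile size in pixels
COLS = TILE_SIZE // CELL  # 64 cells per grid row


def feature_id_at(grid, keys, px, py):
    """Look up the feature under a tile-local pixel with no iteration.

    grid: flat row-major list of COLS * COLS integers
    keys: list mapping each integer to a feature id (None for empty cells)
    """
    index = grid[(py // CELL) * COLS + (px // CELL)]
    return keys[index]
```

For example, the pixel (100, 40) falls in column 25 of row 10, so the lookup reads position 10 * 64 + 25 = 665 of the grid.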
This is an early technical look at our most recent MapBox investment. While we are deploying this for production-ready sites right now, it is still quite technical to set up. We are working on making this easier.