Speed Optimizations in TileMill: Shapefile Indexes



This is the first in a series of blog posts sharing ways to speed up Mapnik rendering in TileMill.

We recently worked with very large shapefiles while creating the map tiles for the U.S. Department of Education’s new broadband mapping initiative. When a shapefile layer contains many objects, you can speed up rendering in TileMill by making sure an index file is present in the shapefile collection. An index file acts as a guide to the elements within a shapefile, letting the renderer quickly find the parts of the file it needs instead of searching through the whole file.

On the Department of Education’s map of about 100,000 school locations, rendering a 12-zoom-level MBTiles file took over a day when we forgot to index the shapefile. The same render with that shapefile indexed took just over an hour. While results will not always be this dramatic, they should usually be noticeable.

The index format that Mapnik uses has the extension .index, and it’s created using a command-line tool that comes with Mapnik called shapeindex. Its use is very simple:

`shapeindex your-shapefile.shp`

That’s it. It will create a new file called your-shapefile.index that will need to be kept with the rest of your shapefile collection (*.shp, *.shx, *.dbf, etc.). (Reminder: this collection should be zipped if you want to load it from a remote URL or from a Library in TileMill.)
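Before zipping, it can be worth confirming that the index actually landed next to the rest of the collection. The following is a minimal sketch of such a check; the function name `missing_parts` and the list of extensions it looks for are assumptions made for illustration, not part of Mapnik or TileMill.

```shell
# Hypothetical helper: print the parts of a shapefile collection that are missing.
# Argument: path to the shapefile without its extension, e.g. data/your-shapefile
missing_parts() {
  base=$1
  for ext in shp shx dbf index; do
    # -f is true only for an existing regular file; print the path otherwise
    [ -f "$base.$ext" ] || printf '%s\n' "$base.$ext"
  done
}
```

Running `missing_parts your-shapefile` prints nothing when all four files are present, and otherwise prints one line per missing file.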

We currently recommend that you only index the largest three or four shapefiles in your project, because having too many shapefiles indexed can trigger a bug in Mapnik that causes "Too many open files" errors. If you run into this, you can also try raising the open file limit for your current shell session by running the command ulimit -n 2000. This should only be required temporarily as the bug will be fixed in the near future.
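To pick out which shapefiles are worth indexing, one option is to sort them by size. This is a rough sketch rather than a recommended workflow: the function name `largest_shapefiles` is hypothetical, the default of four simply mirrors the suggestion above, and it assumes file names contain no newlines.

```shell
# Hypothetical helper: list the N largest .shp files in a directory, biggest first.
# Arguments: directory to search, and how many names to print (defaults to 4).
largest_shapefiles() {
  dir=$1
  count=${2:-4}
  # ls -S sorts by file size, largest first; head keeps only the first N names.
  # The subshell keeps the cd from changing the caller's working directory.
  ( cd "$dir" && ls -S -- *.shp 2>/dev/null | head -n "$count" )
}
```

The names it prints could then be passed to shapeindex one at a time, for example `largest_shapefiles . 3 | while IFS= read -r f; do shapeindex "$f"; done`. Running `ulimit -n` with no value shows the limit currently in effect.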

Another thing to keep in mind is that the *.index file is specific to Mapnik and will most likely not be recognized by any other renderers or geographic information systems. If you update your shapefile, you will also need to update the index by re-running the shapeindex command. Indexes are one of the simplest ways to speed things up in TileMill.
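Because a stale index is easy to overlook, a small timestamp check can flag when the index needs rebuilding. This is only a sketch: the function name `index_is_stale` is hypothetical, and it compares file modification times, which is an assumption about how you would want to detect changes rather than anything Mapnik does itself.

```shell
# Hypothetical helper: succeed (exit status 0) when the .index file is missing
# or older than the shapefile it belongs to.
# Argument: path to the .shp file
index_is_stale() {
  shp=$1
  idx="${shp%.shp}.index"   # your-shapefile.shp -> your-shapefile.index
  # -nt means "newer than"; it is supported by bash, dash, and most sh implementations
  [ ! -f "$idx" ] || [ "$shp" -nt "$idx" ]
}
```

A typical use would be `index_is_stale your-shapefile.shp && shapeindex your-shapefile.shp`, which re-runs the indexer only when it is needed.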

This spring we will be stepping up our efforts to create documentation for TileMill and other MapBox tools. We are still working on a home for this documentation, but in the meantime watch the Development Seed blog for additions, or follow @MapBox on Twitter.
