Development Seed

Data.WorldBank.org Launched Today on Drupal

This is just the start of the World Bank's open data initiative

Signaling the most aggressive open data push in the international development community to date, World Bank President Robert Zoellick launched Data.WorldBank.org this morning ahead of the organization's spring meeting. All of the World Bank's 2,000+ data indicators are now open and freely available to the public. This new website is the browser for some of the most commonly used data, built to let researchers and policy makers filter through the vast data sets and quickly jump to indicators. The site currently includes 339 indicators from 209 countries, and this is just the start.

The open source data browser

Over the next several weeks we will work with the World Bank's team to add 700+ more indicators to the site. There is also a serious second-phase roadmap that focuses on improving access to the API and building capacity among a community of developers to better use the data.

We built the data browser entirely with open source tools. The site is powered by Drupal, the graphs use the Flot JavaScript library for jQuery, and the map tiles are powered by MapBox, baked using TileMill, and served with OpenLayers. Here are some more details from the key sections of the site.

You can quickly sort indicators and map them:

Indicator page for "HIV Prevalence"

Every country has a dashboard page:

Country page for Nicaragua

All indicators are organized into topics:

Several environmental indicators mapped

For much of this data, this is the first time that it is available in languages other than English. The site is currently in Arabic, Spanish, French, and English.

//data.albankaldawli.org/

This site shows what can be done with off-the-shelf Drupal modules like Features, Context, and Views 3, and open source visualization tools like OpenLayers and Flot (there's no Flash anywhere). It's running Pressflow for performance, and is deployed on a high-availability cloud hosting infrastructure. We will post more technical notes about how Data.WorldBank.org was built in a few days. If you are at DrupalCon, stop by our table for a demo of how we built the site.

Excel vs. CSV

It’s too bad the data is not in CSV format. CSV is an open standard while Excel is not.

Considering CSV

Hi Haisam,

In fact, we are considering additional CSV downloads for phase 2 of this project. In the meantime, though, Excel files can be opened with open source tools like OpenOffice or NeoOffice.

Alex

distributed memory caching?

Was it necessary to deploy a distributed memory caching system for the data?

Several caching layers

A distributed memory cache is not used. However, several caching layers exist. Varnish front-ends are used. The Apache back-ends are all running an Opcode cache, and the requests the Drupal site makes to the API server are all cached in Drupal. The API server where the data itself is stored runs Varnish in front of it. Since the data does not change, a very high Varnish cache lifetime is possible.
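To illustrate the idea of caching API responses with a long lifetime, here is a minimal sketch in Python. It is not the site's actual code (the site runs on Drupal and Varnish); the `TTLCache` class, the `fetch` callable, and the lifetime value are all hypothetical names chosen for the example.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Cache response bodies keyed by request URL for a fixed lifetime.

    Hypothetical illustration only: `fetch` stands in for whatever
    performs the real API request.
    """

    def __init__(self, fetch: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl          # lifetime in seconds
        self._clock = clock      # injectable so the cache can be tested
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        hit = self._store.get(url)
        # Serve from the cache while the entry is younger than the lifetime.
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        # Otherwise go to the backend and remember the answer.
        body = self._fetch(url)
        self._store[url] = (now, body)
        return body
```

Because the underlying data rarely changes, the lifetime can be set very high, so most requests never reach the backend.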

As a full-time economist and Drupal hobbyist, I have to say this is an excellent site.

http://drupal.org/node/778932

For your future posts, I’m wondering how you store and access all the data. Surely each GDP observation is not a node. However, much of your site feels views-ish.

You’re correct that we’re using Views a lot. The site is powered by a custom Views 3 query object, which queries a private copy of the World Bank’s public API. So instead of querying MySQL, all of the data displays run REST queries against the API. This is very, very similar to the extender proof of concept I blogged about last year.
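As a rough illustration of a query object that produces a REST request instead of SQL, here is a minimal sketch in Python. It is not the Views 3 plugin itself, which is PHP. The `RestQuery` class, the `API_BASE` address, and the URL path layout are assumptions made for the example; `SP.POP.TOTL` is used only as a sample indicator code.

```python
from dataclasses import dataclass, field
from typing import Dict
from urllib.parse import urlencode

# Hypothetical base URL: the address of the private API copy is not public.
API_BASE = "http://api.example.org"


@dataclass
class RestQuery:
    """Collects filters and turns them into a REST URL rather than SQL."""

    country: str = "all"
    indicator: str = ""
    params: Dict[str, str] = field(default_factory=dict)

    def add_param(self, key: str, value: object) -> "RestQuery":
        # Roughly the counterpart of adding a WHERE clause to a SQL query.
        self.params[key] = str(value)
        return self

    def build_url(self) -> str:
        # The path layout below is an assumed example, not the documented API.
        path = f"/countries/{self.country}/indicators/{self.indicator}"
        query = urlencode(sorted(self.params.items()))
        return f"{API_BASE}{path}" + (f"?{query}" if query else "")
```

A display would then build its request along the lines of `RestQuery(country="NI", indicator="SP.POP.TOTL").add_param("format", "json").build_url()` and hand the resulting URL to an HTTP client.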

It’s good to know that the power of Views 3 can be used on real sites and that it’s not just a theoretically awesome concept.

Excellent data visualization ...

... and Drupal at its very best. Kudos.

Very cool! Thanks for sharing. Now I’m following you.

Kudos!

Fabulous work! Very informative and useful to data nerds (myself included) and the layperson alike. I’ve been showing this site to everyone. Very well done!

Openness & clarity

Excellent work, indeed. Finally both sides of the table on one page (if this makes any sense): Open data and sophisticated representation of it. Congratulations for your outstanding work!

–Benjamin

Good God. Can’t wait for some more technical details.

How is the data handled? Through Feeds somehow I presume? Is it updated continuously, or are there scheduled import times when you get the new data?

As I just posted above, we’re not actually aggregating this data, but using a custom Views query object to access it directly. The data display part of the Drupal site is little more than a front end to the API system, which actually holds the data.

Holy open data batman!

Incredible detail

The amount of data that was organized into a clear and cohesive picture is incredible, not to mention fully standards compliant (sans Flash). Really looking forward to the follow-up technical notes.