An open platform for air quality data

Client: With OpenAQ

OpenAQ is an open-source platform that strives to end air inequality around the world. Working with the OpenAQ team, we built a platform to aggregate, standardize and publish the world’s air quality data. This allows developers, researchers, journalists, and educators to more easily visualize the data and build impactful tools.

Poor air quality is responsible for 1 out of 8 deaths in the world, more than malaria and HIV/AIDS combined. Access to air quality (AQ) data can dramatically reduce that number by helping communities mitigate exposure and by allowing the low-cost sensor, citizen science, and public health communities to advance and galvanize people into action. Over 4,000 AQ stations around the world publish their data publicly, yet most people will never discover it. This data is rendered inaccessible because it is housed on obscure websites, is difficult to access programmatically, or is saved in formats that are displayed inconsistently, causing errors. Today, even analyzing air quality for a megacity like New Delhi or Beijing requires scrounging and scraping data from multiple sources, which is not an easy or straightforward task.

OpenAQ addresses this challenge by aggregating and displaying air quality data over time from data sources around the world. Data is stored in a standardized format and the full archive is freely accessible through a public API.
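To make the idea of a standardized format concrete, here is a minimal sketch of how readings from two differently shaped sources might be mapped onto one common record. The two source layouts, the function names, and the field names are all hypothetical and chosen for illustration; they are not OpenAQ's actual schema or adapters.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Measurement:
    """One reading in a common shape (field names are illustrative only)."""
    location: str
    parameter: str  # pollutant name, e.g. "pm25"
    value: float
    unit: str
    utc: datetime


def from_source_a(row: dict) -> Measurement:
    # Hypothetical source A: ISO 8601 local timestamps, values already in µg/m³.
    return Measurement(
        location=row["station"],
        parameter=row["pollutant"].lower().replace(".", ""),
        value=float(row["value"]),
        unit="µg/m³",
        utc=datetime.fromisoformat(row["time"]).astimezone(timezone.utc),
    )


def from_source_b(row: dict) -> Measurement:
    # Hypothetical source B: Unix epoch seconds, values reported in mg/m³.
    return Measurement(
        location=row["site_name"],
        parameter=row["param"].lower(),
        value=float(row["reading_mg_m3"]) * 1000.0,  # mg/m³ -> µg/m³
        unit="µg/m³",
        utc=datetime.fromtimestamp(row["epoch"], tz=timezone.utc),
    )
```

Once every source is reduced to the same record, with one unit and one time zone, downstream tools can compare cities without knowing where each reading came from.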

Robust infrastructure

With thousands of air quality measurements available, we knew that a successful platform would have to be designed for scale from the beginning. Less than a year into the project, OpenAQ collected close to 100,000 measurements per day and served roughly 500,000 API requests a month, a figure that keeps growing as new sources are added. While an impressive start, those 100,000 measurements a day make up only about 10% of what is available from official sources worldwide, which implies roughly a million official measurements per day, not including data from individuals' low-cost sensors.

To address this challenge, OpenAQ is built as a set of lightweight components. The system leverages different AWS services: Lambda and ECS to fetch new data, and CloudFront to serve the static files and data with optimal performance around the world. Since OpenAQ produces aggregated data over the whole dataset, which would be a costly and time-consuming operation to run on each request, we built components that update internal caches only when new data has been added to the system. We also made sure that all requests run against this temporary data store, rather than the database, leading to faster queries and opening up more functionality.
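The caching approach described above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the class name `AggregateCache`, its methods, and the in-memory lists standing in for the database and the temporary data store are all hypothetical.

```python
from collections import defaultdict


class AggregateCache:
    """Keeps per-parameter counts and averages, rebuilt only after new data."""

    def __init__(self):
        self._rows = []      # stand-in for the database
        self._summary = {}   # stand-in for the temporary data store
        self._dirty = False
        self.rebuilds = 0    # counts how often the expensive pass actually ran

    def add_measurements(self, rows):
        # Called by the fetchers; the cache is marked stale only if rows arrived.
        rows = list(rows)
        if rows:
            self._rows.extend(rows)
            self._dirty = True

    def refresh(self):
        # Expensive pass over the whole dataset, skipped when nothing changed.
        if not self._dirty:
            return
        totals = defaultdict(lambda: [0, 0.0])
        for parameter, value in self._rows:
            totals[parameter][0] += 1
            totals[parameter][1] += value
        self._summary = {
            p: {"count": n, "average": s / n} for p, (n, s) in totals.items()
        }
        self._dirty = False
        self.rebuilds += 1

    def summary(self, parameter):
        # Requests read the cached summary and never scan the raw rows.
        return self._summary.get(parameter)
```

In this sketch, calling `refresh()` on a schedule costs almost nothing when no new measurements have arrived, and `summary()` stays fast however large the underlying dataset grows.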

A community

More than just the code, all of which is available via GitHub, OpenAQ is building a community of people committed to ending air inequality around the world. Find more information at openaq.org, follow along on Twitter at @open_aq, or join the Slack channel.

Continued work on OpenAQ is made possible in collaboration with the National Institutes of Health and Wellcome Trust via the Open Science Prize, with Amazon Web Services and the American Geophysical Union through the Thriving Earth Exchange, and with the Earth Journalism Network.
