This morning the Afghan government announced that it will give approved researchers access to the raw data it collects in its demographic and economic surveys, expanding access from the prepared data sets it provides now. This is a good first step toward creating an open data plan in the country.
Robert and I were at the announcement in Kabul, where we’re gathering data to use in upcoming mapping projects in the country (more on that soon). This new policy is particularly exciting for us, as we’ll be able to use some of this raw data immediately in custom maps we are making for the Wolesi Jirga elections with the National Democratic Institute, a follow-up to the work we did on last year’s presidential election. For example, access to the raw data in the “National Risk and Vulnerability Assessment” survey will give us better population data by district, which we’ll use to compare estimated voter turnout in the upcoming election.

Under this plan, called the “Micro Data Access Policy,” anonymized raw data sets covering indicators like income, population, employment status, and food security will be available to researchers, allowing them to run their own calculations and to do more in-depth cross tabulations of indicators. The president of the Central Statistics Organization, Abdul Rahman Ghafoori, who made the announcement, sent out an email to the development community explaining the decision: “While we generate tables for general public and purposes, some researchers find them not sufficient for their research needs. Hence, we have decided to provide them with raw data but still within the boundaries of confidentiality as stipulated in Statistics Law of Afghanistan.”
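To make "cross tabulation" concrete, here is a minimal Python sketch of the kind of calculation that record-level data allows and pre-built tables do not. Everything in it is an assumption for illustration: the records are made up, and the field names (`district`, `employed`, `food_secure`) and the `cross_tabulate` helper are hypothetical, not the actual layout of any Central Statistics Organization data set.

```python
from collections import Counter

# Hypothetical, made-up household records for illustration only.
# These are NOT real survey data, and the field names are assumptions.
records = [
    {"district": "A", "employed": True, "food_secure": True},
    {"district": "A", "employed": True, "food_secure": False},
    {"district": "B", "employed": False, "food_secure": False},
    {"district": "B", "employed": True, "food_secure": True},
    {"district": "B", "employed": False, "food_secure": False},
    {"district": "A", "employed": False, "food_secure": True},
]


def cross_tabulate(rows, row_key, col_key):
    """Count how many records fall into each (row_key, col_key) combination."""
    return Counter((row[row_key], row[col_key]) for row in rows)


# Cross tabulate employment status against food security.
table = cross_tabulate(records, "employed", "food_secure")

for (employed, food_secure), count in sorted(table.items()):
    print(f"employed={employed!s:<5} food_secure={food_secure!s:<5} households={count}")
```

With only published summary tables, you would get each indicator's totals separately. With record-level data, the same helper can pair any two fields, for example `cross_tabulate(records, "district", "employed")`.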
Mr. Ghafoori repeatedly stressed that this is the first time users have had access to raw data. Access to more granular data will not just improve the quality of analysis by third parties; it will also offer practical benefits, like being able to drill down much further in data visualizations, as we were able to do with the data we used on AfghanistanElectionData.org. This policy is also intended to help improve the data collected by increasing its user base and allowing users to share feedback and flag issues with the data.
But how open is this data policy?
Here is the text of the Micro Data Access Policy passed out at the press conference, which lays out strict terms of use for everyone who uses the data, along with pricing information. The terms of use will be a serious barrier to use of the data, as they require all users to have a statistician on their team, explain how they will use the data and keep it secure, and sign an agreement saying that if the terms are violated the user and their organization are liable under Afghan law. And on top of that, users have to pay for access. Clearly, this is not quite an “open” data policy.
But this is just the first run at it, and it seems like the Central Statistics Organization is serious about tweaking and improving the terms. Much of the conversation at the announcement focused on how to liberalize them quickly. Mr. Ghafoori was very clear that these terms of use are only temporary, saying, “We are ready to receive your comments, your needs, at any time. We have decided to evaluate the policy after six months, so during the six months, whatever comments or ideas data users are having they can contact us and we will consider your recommendations and needs. After this six months we will reevaluate this policy. This policy will always be open for evaluation and your feedback and recommendations and these sorts of things.” Note: I was typing fast as the translation was coming through, so please forgive any errors.
I spoke with Mr. Ghafoori after his announcement, briefly discussing our work on data.worldbank.org (and mentioning the World Bank’s beautiful terms of use). It was my first time meeting him, and we also talked about our team providing further feedback on the terms of use in the coming days, along with some details on how we would want the data posted so it would be most useful (and open). As we push further into building tools that make use of open data, it is fascinating to see just how much the policy side of data releases impacts what can be accomplished on the tech side, and it’s great to be involved in these conversations early to help make sure there are good tech options in the future. My impression in this case is that there is a real interest in getting feedback from users of the raw data, and maybe a real opportunity to relax some of the more stringent requirements.