Mapping and Referencing External Data via RDF in Drupal



Say you are a development agency with on-the-ground operations, each of which uses its own website to collect geo-rich data. You want to be able to associate this data from your on-the-ground programs with private content that’s on your organization’s main intranet back here in Washington, DC (for instance, staff data). Beyond that, you want to view external data and related private content together on a map. If you think about it, this example boils down to the question, “How do you map and reference external data in an unambiguous fashion?”

We spent the last couple of days pulling together the pieces that were missing to answer this question in Drupal. Here’s a rundown of the approach we took and the modules that we rolled out for it.

The Recipe

The basic architecture we came up with is to aggregate geo-rich data (KML or GeoRSS) into an RDF data store and display it with Views and Nice Map. For referencing RDF data from nodes, we used the Relations API. For aggregation, we used FeedAPI with KML Parser and FeedAPI RDF processor.


RDF Storage

Let me start by explaining why we chose RDF (Resource Description Framework) as our storage solution. RDF defines a very flexible and clear model for describing information. There is a lot of RDF work going on in Drupal these days and the availability of the excellent RDF module made it a clear choice for handling and storing data.

Using RDF allows us to reference an item that’s onsite or offsite, or that’s a concrete thing or an abstract concept. It allows us to describe our data unambiguously beyond the boundaries of our website (“hey, that string’s a city, you know?”), and it doesn’t require us to know about the kinds of things we’d like to model in advance.

To store RDF from data feeds we had to write a module that generates RDF statements from feed items: the FeedAPI RDF processor. As its name suggests, it’s an add-on to FeedAPI. It takes single feed items (from any kind of feed you can aggregate with FeedAPI), creates a series of statements that capture the contents of the item, and puts the information in the RDF module’s data store.
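To make the idea of "a series of statements" concrete, here is a minimal sketch in Python. It is not the module's actual code (which is PHP and goes through the RDF module's API); the field names, the FIELD_PREDICATES mapping and the item_to_triples helper are assumptions for illustration only. Each statement is a (subject, predicate, object) triple, with the item's URL as the subject.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical mapping from feed item fields to predicate URIs.
# Dublin Core and the W3C Basic Geo vocabulary are used as examples.
FIELD_PREDICATES = {
    "title": "http://purl.org/dc/elements/1.1/title",
    "description": "http://purl.org/dc/elements/1.1/description",
    "lat": "http://www.w3.org/2003/01/geo/wgs84_pos#lat",
    "long": "http://www.w3.org/2003/01/geo/wgs84_pos#long",
}


def item_to_triples(item: Dict[str, str]) -> List[Triple]:
    """Create one statement per known field, using the item URL as subject."""
    subject = item["link"]
    triples = []
    for field, predicate in FIELD_PREDICATES.items():
        if field in item:  # fields missing from the item are simply skipped
            triples.append((subject, predicate, str(item[field])))
    return triples


# A made-up feed item; it has no description, so only three statements result.
item = {
    "link": "http://field.example.org/reports/1",
    "title": "Well repair",
    "lat": "-1.2921",
    "long": "36.8219",
}

store: List[Triple] = []  # stand-in for the RDF data store
triples = item_to_triples(item)
store.extend(triples)
```

Because the subject is a URL rather than a local ID, the same statements stay meaningful outside the site that aggregated them.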

We wanted to be able to leverage the Views module to query this RDF data in interesting ways. To do this, we’ve begun working on a patch to the RDF module that adds basic Views integration. You can find it in the RDF module’s issue queue.

KML Parser

In order to populate this RDF store with geographical data, we need to parse data from various sources. FeedAPI’s SimplePie parser has supported GeoRSS since version 1.5, but we also wanted to be able to consume KML feeds. So we wrote a KML Parser for FeedAPI. It’s still in its infancy and currently only supports KML Placemark parsing, but this is more than enough for the use case at hand.
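As a rough illustration of what Placemark parsing involves, the following Python sketch pulls the name and point coordinates out of each Placemark. It is not the KML Parser module's code; the parse_placemarks function, the sample document and the choice of the OGC KML 2.2 namespace are assumptions (feeds may declare a different KML namespace). Note that KML lists coordinates as longitude, latitude and an optional altitude.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

# Assumed namespace: OGC KML 2.2. Real feeds may use another one.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

SAMPLE_KML = """
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Field office</name>
      <Point><coordinates>36.8219,-1.2921,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>No geometry</name>
    </Placemark>
  </Document>
</kml>
"""


def parse_placemarks(kml_text: str) -> List[Dict[str, Union[str, float]]]:
    """Return name/lat/long for every Placemark that has a Point."""
    root = ET.fromstring(kml_text.strip())
    placemarks = []
    for pm in root.iterfind(".//kml:Placemark", KML_NS):
        name = pm.findtext("kml:name", default="", namespaces=KML_NS)
        coords = pm.findtext(
            "kml:Point/kml:coordinates", default="", namespaces=KML_NS
        ).strip()
        if not coords:
            continue  # skip Placemarks without point geometry
        # KML order is longitude,latitude[,altitude]
        lon, lat = coords.split(",")[:2]
        placemarks.append(
            {"name": name.strip(), "lat": float(lat), "long": float(lon)}
        )
    return placemarks


placemarks = parse_placemarks(SAMPLE_KML)
```

Each resulting dictionary has the same shape as a feed item, so it could be handed on to whatever turns items into RDF statements.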

Nice Map

I talked about Nice Map extensively in my last blog post. Nice Map is a Web Mapping Service for Drupal. It pulls maps from mapping services and renders content on them with the help of the Views module.

Together with RDF’s Views integration it was possible, although a bit tricky, to place RDF data bits on the map just as we did with nodes in projects before.

Relations API

The Relations API relates nodes to other nodes with RDF. We extended it to not just relate nodes to nodes, but to relate nodes to RDF data itself. This is a very powerful concept because suddenly we can use a node (the main type of content in Drupal) to refer to anything you can describe in RDF, whether it’s on the site or not. We’re not yet sure how or when these changes will find their way back into the Relations API, but in the meantime you can use our proof-of-concept module in our sandbox.
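The idea of a node pointing at arbitrary RDF data can be sketched in a few lines of Python. This is only a model of the concept, not the Relations API: the relatedTo predicate, the example URIs, and the match and locations_for_node helpers are all hypothetical. A node is related to an external resource by one statement, and the resource's coordinates are then looked up by pattern matching on the same store.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RELATED = "http://example.org/terms/relatedTo"  # hypothetical predicate

# A tiny in-memory store: one relation from a node to an external
# resource, plus the aggregated coordinates of that resource.
store: List[Triple] = [
    ("http://intranet.example.org/node/42", RELATED,
     "http://field.example.org/reports/1"),
    ("http://field.example.org/reports/1", GEO + "lat", "-1.2921"),
    ("http://field.example.org/reports/1", GEO + "long", "36.8219"),
]


def match(store: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return all statements matching the pattern; None is a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def locations_for_node(store: List[Triple],
                       node_uri: str) -> List[Tuple[float, float]]:
    """Collect (lat, long) for every resource the node is related to."""
    points = []
    for _, _, resource in match(store, s=node_uri, p=RELATED):
        lat = match(store, s=resource, p=GEO + "lat")
        lon = match(store, s=resource, p=GEO + "long")
        if lat and lon:  # only resources with both coordinates are mappable
            points.append((float(lat[0][2]), float(lon[0][2])))
    return points


points = locations_for_node(store, "http://intranet.example.org/node/42")
```

The node itself carries no geodata; it becomes mappable only through the statement that relates it to the external resource.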

Open Issues

This was our first stab at addressing the gaps in building an RDF-based tool for referencing geo-rich data. Along the way we rolled out two new modules (KML Parser and FeedAPI RDF processor) and patched two others (RDF’s Views integration and the Relations API). This is a great step towards a flexible “drop in and configure” solution that is geo-enabled and RDF-based. Of course some open issues remain, and here are the most important ones:

  • KML Parser needs to be tested with more feeds. It needs to support very large and complex feeds.
  • FeedAPI RDF processor has a hook-based mapping API for mapping feed data to RDF triples, but there is no UI for it. Ideally, like Feed Element Mapper, it would support not only node-based mapping but any kind of feed item mapping.
  • RDF views support needs more consideration and work.
  • Our Relations module add-on, Relations_rdf, needs to find itself a home. Hopefully that’ll be as part of the Relations module, but the best way forward isn’t immediately clear.

The short version is that the current RDF family of modules within Drupal offers really spectacular tools that already make a lot of the functionality we just talked about possible. (Thanks Arto!) As you can see, there is a lot more work actively happening here that will push the limits further, and we are looking to expand our RDF-related collaboration and conversations with both developers and development practitioners.
