With the launch of Managing News we have released Feeds, the intended successor to FeedAPI. Feeds is a next-generation import and aggregation API that applies lessons learned from three years of intensive work with aggregation in Drupal. This is one of the outcomes of our work on Managing News that we are most excited about. We’d like to again thank the Knight Foundation for their vision in supporting the improvement of fundamental aggregation tools for Drupal, which helped create Feeds. In this post I’d like to explain the reasons for building a new aggregation and import API, the design goals for it, and what this is going to mean for FeedAPI.
It has been more than two years since a conversation at OSCON led Ken Rickard to post the Aggregator API proposal. That resulted in the successful Google Summer of Code project, FeedAPI, which has been developed and maintained largely by Aron Novak. Since then, we have used FeedAPI on many (perhaps most) of our projects and we have dedicated significant resources toward improving and extending it.
Over time, the flexibility of FeedAPI’s architecture started to pay off. For instance, we used it to aggregate RSS and Atom in Managing News, import compressed CSV crime feeds for Stumble Safely, and create events from iCal feeds for Open Atrium. Feed Element Mapper gave us granular control over mapping feeds to Drupal content. A series of contrib modules mushroomed to plug into FeedAPI to offer additional functionality.
However, as we addressed these and many other use cases with FeedAPI, its limitations became clear.
Limitations of FeedAPI
- FeedAPI assumes that feeds are loaded via HTTP; file support has been added, but still feels alien.
- Parsers are responsible for fetching feeds; fetching mechanisms (HTTP or file) are hard or impossible to reuse.
- Exotic feed formats forced many FeedAPI developers to fork parsers.
- Mapping is done by a separate module, which leads to complex code.
- FeedAPI assumes an aggregation use case. Feeds as nodes and a baked-in scheduler make sense in this context, but there is no reason for aggregation to be the only use case. Aggregation is essentially just periodic import.
- FeedAPI is a child of PHP4. As such, object-oriented techniques that would allow for a more efficient API aren’t used.
- FeedAPI configurations are not exportable to code, which hampers development workflow and makes FeedAPI awkward to use in install profiles.
Because many of these issues slowed down our day-to-day development, addressing this design debt became critical.
The goals for the rewrite
We took the launch of Managing News as an excuse to push hard on a rewrite of FeedAPI with these goals in mind:
- Better and cleaner extensibility
- Suitable for aggregation and data import jobs
- Comparable or better performance
- Suitable for pulling from files, pulling from HTTP or pushing through HTTP
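To make the last goal more concrete, here is a minimal conceptual sketch of what decoupling fetching from parsing buys you. It is written in Python purely for illustration and is not Feeds code: the names FileFetcher, PushFetcher, parse_csv and run_import are hypothetical, and PushFetcher simply stands in for data that was pushed to the site rather than pulled.

```python
import csv
import io
import os
import tempfile


class FileFetcher:
    """Hypothetical fetcher: reads raw feed text from a local file."""

    def __init__(self, path):
        self.path = path

    def fetch(self):
        with open(self.path, encoding="utf-8") as handle:
            return handle.read()


class PushFetcher:
    """Hypothetical fetcher: wraps text that was pushed to us instead of pulled."""

    def __init__(self, payload):
        self.payload = payload

    def fetch(self):
        return self.payload


def parse_csv(raw_text):
    """Parser: turns raw CSV text into a list of dicts; it knows nothing about transport."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def run_import(fetcher, parser):
    """Any fetcher can be combined with any parser."""
    return parser(fetcher.fetch())


CSV_TEXT = "title,url\nFirst item,http://example.com/1\nSecond item,http://example.com/2\n"

# Write the sample to a temporary file so the file fetcher has something to read.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, encoding="utf-8") as tmp:
    tmp.write(CSV_TEXT)

from_file = run_import(FileFetcher(tmp.name), parse_csv)
from_push = run_import(PushFetcher(CSV_TEXT), parse_csv)
os.remove(tmp.name)
```

Both calls go through the same parser and yield the same items. Supporting a new transport means adding one fetcher, not forking a parser.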
What’s in the box
On October 20th, we released the first version of Feeds. If you download Feeds today you’ll get:
- Fetching data from file or HTTP sources
- Parsing RSS, Atom, CSV or OPML
- Producing either simple database records (with Data module), nodes, users or terms
- Using an importer configuration either by creating a node or on a standalone form
- Presets for the most common aggregation and import tasks
- Integrated mapper for mapping feeds on a field level to CCK nodes, users, terms or SQL tables
- A views-style OOP plugin API that makes it easy to tweak existing plugins or create your own (big KUDOS to merlinofchaos for CTools’ plugin API)
- A views-style export-to-code functionality for configuration (again, hats off, merlinofchaos: CTools export API makes it easy to add exportables to a module)
- Support for concurrent feed aggregation with Drupal Queue, a backport of the great queue in Drupal 7 contributed by chx et al.
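The idea behind field-level mapping can be sketched roughly as follows. This is an illustrative Python snippet, not the Feeds mapper itself: map_item, MAPPING and the source and target field names are all made up for the example.

```python
def map_item(item, mapping):
    """Copy each configured source field of a parsed item to its target field.

    Source fields missing from the item are skipped rather than raising an error.
    """
    return {target: item[source] for source, target in mapping.items() if source in item}


# Hypothetical mapping: source field name -> target field name.
MAPPING = {"title": "node_title", "description": "body", "link": "field_url"}

# Hypothetical parser output: one dict per feed item.
parsed_items = [
    {"title": "Hello", "description": "First post", "link": "http://example.com/a", "guid": "a"},
    {"title": "World", "link": "http://example.com/b", "guid": "b"},
]

records = [map_item(item, MAPPING) for item in parsed_items]
```

Because the mapping is plain data here, the same parsed items could be routed to a different target just by swapping in another mapping, which is roughly what lets one parser feed nodes, users, terms or tables.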
A glance at Feeds’ admin page shows the default configurations Feeds ships with. See more screenshots.
Beyond what’s already available, there are a couple of interesting potential additions and improvements to Feeds that are planned or that we are very interested in getting in:
- Pubsubhubbub/RSSCloud support #617054 (the best path to scalable aggregation with Drupal, in my opinion)
- Use Batch API for import / deleting content #600584
- Better performance logging #606612
- Drush integration #608408
What does this mean for FeedAPI?
Aron Novak and I will start to phase out FeedAPI maintenance effective with this blog post. The goal is to find a new lead maintainer by the end of the year and to keep additional features in FeedAPI to a minimum. The same applies to dependent modules like Feed Element Mapper or CSV Parser that we are maintaining.
As far as I can see there is no reason to pull FeedAPI into Drupal 7. It can be completely replaced by Feeds for D7 (of course, this is open source code and anybody is free to step up and upgrade FeedAPI). Feeds supports the #D7CX movement, and there will be a full D7-compatible version of Feeds the day Drupal 7 is released.
A FeedAPI upgrade script to Feeds (#596584) is planned, but we will need serious help with testing and maintaining it. At Development Seed, we do not have many legacy FeedAPI systems that need to be upgraded, so we will have a shortage of “guinea pig” sites for testing corner cases.
You will notice that Feeds still carries the ‘alpha’ label. The main reason for this is that we would like to keep the flexibility for adjusting Feeds to new use cases as early adopters work with it. The overall architecture is solid at this point, and we are using it in a series of production sites including Managing News. It is currently at a level where feature compatibility between minor versions can be guaranteed, but cautious changes to the API may still occur.
I would like to encourage everybody who has a stake in Drupal aggregation or content importing to take a close look at Feeds. This is a module that could make your life much easier. :) Up until the first beta release, API adjustments to accommodate your use cases are still possible, so your contributions and feedback are always welcome. I look forward to interacting with everyone more in the issue queues.