Blog: Aggregation

Improved Aggregator for Drupal 7: What's Under the Hood
engineer

An Overview of Its New Features and a Request to Test Drive It

The patch for an improved aggregator for Drupal 7 is now available on Drupal.org (#236237). This code is the result of Aron Novak's Google Summer of Code project, and it is also available as a Drupal project with regular patches against Drupal HEAD (#236237). The patch has been out for a couple of weeks, so it's high time to talk about the improvements it aims to bring to Drupal core.

Before I dive into the details, though, I'd like to point out that several people have asked us to break the patch into smaller pieces, as it is rather big and touches more than one area of the aggregator's functionality. We have yet to work on this; however, I do think there is value in presenting the proposed improvements as a whole. So here we go.

There are four major differences in comparison to the existing aggregator:

  • Extensible architecture - allows external modules to add or replace functionality
  • Per feed content type configuration of aggregator
  • Replaces aggregator's XML parser with a SimpleXML based parser
  • Replaces aggregator's category system with taxonomy

1. Extensible architecture

This change is the most far-reaching of all. At its core are the concepts of parsers and processors for aggregator. Parsers download and parse feeds, normalize the feed data, and expose it to other parts of the application. Processors take the feed data and act on it. For example, they create database records for feed items.

To define a parser or a processor, one of two hooks needs to be implemented:

  • hook_parse() for defining a parser
  • hook_process() for defining a processor

According to the parser/processor architecture, the new structure of aggregator is as follows:

  • aggregator module - implements API and standard download routines
  • syndication_parser module - standard RSS/Atom/RDF parser module that is supposed to ship with core. Can be used independently from aggregator.
  • aggregator.light.inc (part of aggregator module) - this implements a processor that stores feed items as lightweight database records just as the current aggregator does

The implementation of parsers and processors can be seen in syndication_parser module and in aggregator.light.inc. There is also a feed-items-as-nodes implementation in the project version of aggregator for Drupal 7. To my knowledge, the parser/processor architecture was first introduced in Drupal by Ted Serbinski with SimpleFeed, and it also exists in FeedAPI.

Current discussion points around the extensible API are:

Surviving Information Overload: FeedAPI Mail Watches Your Mailing Lists
Multilingual Engineer

FeedAPI Mail Plugin Lets You Aggregate Your Email With Your News Feeds

Keeping up with all the information we get every day is no easy job. We follow hundreds of websites using feed readers and get hundreds of emails every day. As if that weren't enough, new sites appear every day producing their own streams of information, sites like Flickr, Facebook, and Twitter, and someone is probably launching a new one as I'm writing this. We definitely need help to manage all this information, so we use tools like email filters, feed aggregators and readers, and tagging tools.

Each of the tools mentioned above means one more piece of software or website to deal with. So when we find a tool that combines what two tools do into one, we have at least some hope that the next day we'll have some spare time to do the real work. That's what FeedAPI Mail does for FeedAPI in Drupal: it adds one more capability to the mix.

Now we have the first proof-of-concept module for a FeedAPI plugin that adds mailing lists into the equation. With this plugin, you can follow mailing lists and enjoy the same features available for the other content. It treats individual mailing lists as if they were web feeds, allowing the same features for incoming emails as for the rest of the content, including automatic tagging (and geotagging), reading the emails in a single news reader, easy sharing with your team, and so on. Here's a look at it running in our team news aggregator/analyzer Managing News:

The FeedAPI Mail module works together with Mailhandler and FeedAPI, which grabs the rest of the feeds and presents them in a single web interface. In addition, it has a dedicated mail reader UI that can display threads and authors, which puts individual emails in context.

To get started we need a mail account and to set up a mail inbox with Mailhandler so emails are read and made into nodes. Next we need to add a 'Mailing list' and set it up with the mailing list email address so incoming emails can be classified as coming from different lists. Then we just need to subscribe our email address to the mailing list (which may take some manual steps to handle confirmation emails), and we're all set.
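The classification step can be pictured with a small Python sketch. This is not how FeedAPI Mail or Mailhandler is implemented; it only assumes that a message belongs to a list when the list's address appears in its To or Cc header. The function name classify, the LISTS mapping, and the sample message are all hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical mapping from mailing list address (lowercase) to list name.
LISTS = {
    "dev@lists.example.org": "Development list",
    "support@lists.example.org": "Support list",
}


def classify(raw_message, list_addresses):
    """Return the name of the first known list addressed in To or Cc."""
    msg = message_from_string(raw_message)
    recipients = msg.get_all("To", []) + msg.get_all("Cc", [])
    for _name, address in getaddresses(recipients):
        name = list_addresses.get(address.lower())
        if name is not None:
            return name
    return None  # Assumption: unmatched mail is simply left unclassified.


# Hypothetical incoming message, used only for this illustration.
RAW = (
    "From: alice@example.org\n"
    "To: Dev List <Dev@lists.example.org>\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)

list_name = classify(RAW, LISTS)
```

Real mailing lists often also set headers such as List-Id, which would be a more robust signal than the recipient address; the sketch sticks to the address because that is what the setup described above configures.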

What's to Come with FeedAPI Mail

DrupalCon Szeged Session Proposals: Aggregation, Context and Spaces, Messaging and Notifications, and Drupal Talent
Communications Strategist

Vote on the Sessions You Want to See at DrupalCon

We're getting excited to come over for DrupalCon Szeged in just over a month. This will be the fifth DrupalCon that Development Seed has attended, and it always amazes me how much these conferences show off just how fast this community is growing and how far Drupal has come as a platform. We're looking forward to talking about some of our latest work to add and refine functionality in Drupal. Here's a quick summary of the sessions we've proposed to lead. If you're interested in these topics, please vote for them!

Spaces and Context Modules: Tools for Site Building: The Context and Spaces modules are two relatively new tools in Drupal's arsenal that make it easier to build complex websites. In this session, Jeff Miccolis will talk about both modules' approach and show developers how they can be used and extended. He'll also show some examples of the modules in use on community portals, sites-within-sites, and intranets. Vote here.

Messaging and Notifications Frameworks: At DrupalCon Boston, Jose Reyero introduced the beta versions of these frameworks. A lot has been done to improve them since, and in this session Jose will talk about the upgrades, specifically in how they handle subscriptions, notifications of events, and the various delivery methods for sending messages. Another focus of this session will be the shift away from email-only delivery methods to multi-platform methods. Vote here.

A New Aggregator for Drupal 7: Drupal's core aggregator is getting a revamp in Drupal 7. In this session, Aron Novak and Alex Barth will talk about why this step is needed and what you can expect from the new core aggregator. Aron began this work as a Google Summer of Code project last summer and has continued to fine-tune it this summer. The result is a simple yet extensible and efficient architecture that should serve Drupal well. Vote here.

Attracting and Retaining Drupal Talent: At the rate Drupal's popularity is growing, we're finding that there just aren't enough developers to meet the demand. Web shops and organizations are coming up short in finding the Drupal talent they need to build and run the online tools they want. This session will look at ways to beat out the hiring competition to find and retain Drupal talent. Eric Gundersen will talk about how Development Seed has grown our team, and Kris Krug from Raincity will share his experiences and lessons learned.

You can vote on all the sessions (including BoFs) you want to see at DrupalCon Szeged here. See you in Szeged!

Why the heck a new aggregator for Drupal 7?
engineer

Or, Check Out the Patch

This Google Summer of Code season, I have the distinct pleasure of mentoring Aron Novak’s work on a new aggregator for Drupal 7. Aron is well into his task and has just rolled a patch for core and an alpha 2 version, so it's time to share why I think this patch is important and why you should have a look at it. If you’re into aggregation and Drupal, that is.

Drupal’s original aggregator module was designed primarily for pulling news feeds into your site and displaying them in a straightforward fashion: no workflow, very basic permissions, no API for interacting with feed items, no event aggregation, no custom parsers, to name a few limitations.

Soon, contrib modules mushroomed, each addressing one shortcoming or another of the core aggregator: a list of them would start with the aggregator 2 module, which was published in the fall of 2005, and would include Leech (I don’t regret its demise), Aggregation (the first use of SimpleXML for parsing in Drupal), SimpleFeed (the first extensible architecture) and FeedAPI

Debuting Managing News at Rootscamp DC
Communications Strategist

Talking with Bloggers, Campaigners, and Progressives about Tracking New and Traditional Media

A lot of campaigners in Washington, DC for Rootscamp just got a peek at Managing News, our news tracker and analyzer tool. If you missed the session, you can check out some videos of the tool here and contact us for a seven-day free trial.

The session led into a great discussion on what people in the political realm need with a tool like this - particularly extreme media analyzers and bloggers. That there's a need to track online chatter on both a macro and a micro level wasn't in question - the people in the room knew that was necessary and it's been a recurring theme in all the sessions I've been to so far today. This in itself is really great to hear because we built Managing News to do just that, and to give a snapshot view of what’s happening online around a topic and make it easy for people to take action on it.

There was real excitement about Managing News' ability to roll a lot of what people are already doing with Bloglines, Google alerts, and manual news reading into one shared platform, one that automatically adds information like keywords and locations to the news it pulls in. We also heard some great feature requests, like source exclusion, clipping reports printed in various formats, and closed captioning tracking, that will be keeping us busy for a while :) Let us know if you have a specific news tracking or analyzing need. We're happy to talk about what Managing News and other tools can do to help.

Managing News is Hiring!
Strategist

Help Us Work on This New Tool

Managing News is a Drupal-powered news tracker and analyzer that helps communications teams and PR firms monitor news and trends across the internet. We're looking for a few smart, creative engineers to join our tight-knit team to work on this new tool. If you're interested in Drupal, the news, aggregation, and working on a totally different type of tool, check out the job description. Drop us a line if you want to learn more: jobs@managingnews.com.