Pressflow, a Drupal distribution that improves performance and scalability and is particularly useful for high-traffic sites, continues to develop, and the team behind it at Four Kitchens has some exciting plans for Pressflow 7 and beyond. After Jeff’s post outlining how important speed is for the data-heavy sites we build, and why we use Pressflow and Varnish to make them faster, I wanted to find out what’s next for Pressflow. I talked with David Strauss, the creator of Pressflow, to get the lowdown on their plans for the project in the new year.
Q: I thought all the changes from Pressflow were ported into Drupal 7?
A: Pressflow 7 and later will continue to provide significant improvements over Drupal’s performance and scalability. Even maintaining existing Pressflow 6 features will keep Pressflow ahead of Drupal 7 in performance and scalability. But yes, most of the main Pressflow 6 features are in Drupal 7 or have equivalents. We’re glad that’s the case; it opens the door for new development on Pressflow while maintaining good compatibility with Drupal. Similarly, Drupal 6 included changes that were in Pressflow 5.
Right now there is work underway to get as many Pressflow 6 changes into Drupal 7 as possible. We’ve been having discussions with Angie, Dries, and a bunch of core developers over what’s viable to merge.
Q: What exactly will Pressflow 7 give me that Drupal 7 will not?
A: Pressflow supports multi-tier proxy layers, which are in use by several major sites. No version of Drupal (including the one in development) can properly handle this architecture. But Dries wants this support in Drupal, so I wouldn’t be surprised if it makes the final Drupal 7 release.
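To make the multi-tier proxy problem concrete: when a request passes through more than one proxy (say, a CDN and then a local cache), the application has to work out which address in the X-Forwarded-For header is the real client. The following is a minimal sketch in Python of one common approach, not Pressflow's actual code. The trusted network list and the function names are assumptions made up for illustration.

```python
import ipaddress

# Hypothetical proxy tiers that the application trusts to report addresses
# honestly (for example an internal cache tier and a proxy on localhost).
TRUSTED_PROXIES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("127.0.0.1/32"),
]


def is_trusted(addr):
    """Return True if addr parses as an IP inside a trusted proxy network."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Unparseable entries are treated as untrusted.
        return False
    return any(ip in net for net in TRUSTED_PROXIES)


def client_ip(remote_addr, x_forwarded_for):
    """Walk the proxy chain right to left and return the first untrusted hop.

    remote_addr is the address of the direct TCP peer; x_forwarded_for is
    the raw header value (possibly empty).
    """
    # If the direct peer is not one of our proxies, the header cannot be
    # trusted at all, so the peer itself is the client.
    if not is_trusted(remote_addr):
        return remote_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every listed hop was a trusted proxy (or the header was empty).
    return hops[0] if hops else remote_addr
```

Reading the header from the right means an address a visitor forges at the left end of the header is ignored, because the walk stops at the first hop that is not one of the site's own proxies.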
Drupal 7 also lacks solutions for common, slow queries that are optimized in Pressflow in a MySQL-specific way. Last-minute Drupal 7 work is underway there too, but that work has continued for a long time (generally since Drupal 6 development). So, it’s not clear if those optimizations will make it into the upcoming Drupal release.
Porting existing Pressflow 6 features to Pressflow 7 is far from the final word. Like Pressflow 6, 7 will integrate valuable improvements from Drupal’s ongoing development as well as original work.
There is a difference in the missions behind Drupal and Pressflow. Drupal provides broad support on even shared web hosts. Pressflow captures the leading performance and scalability edge by using the latest infrastructure advances, and we’re willing to break support for older technologies to do that. That means we can move forward faster, but also that Pressflow will never be a system for basic sites.
Q: For people new to Pressflow, when should they use it as a substitute for Drupal core?
A: Pressflow can speed up any Drupal site to some degree, but the impressive changes generally require root access (or equivalent) to install supporting services. Pressflow really shines when it’s integrated with APC, memcached, Varnish, MySQL replication, and other extended family members of LAMP that aren’t on regular shared hosts.
For module authors, Pressflow can help the transition to the next version of Drupal. For example, Pressflow 6, like Drupal 7, supports database replication and features a smart session and page caching architecture. Modules that run properly in Pressflow 6 are less likely to have issues porting to Drupal 7 and are better poised to take advantage of Drupal 7’s new features. Most of Pressflow’s extended APIs are either pulled from the next Drupal release or implemented in a compatible way.
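For readers unfamiliar with what replication support asks of module code, here is a rough Python sketch of the general idea: reads that can tolerate slightly stale data are sent to a replica, and everything else goes to the primary. This is an illustration only, not the Pressflow or Drupal API. The class name, the `allow_replica` flag, and the server labels are all hypothetical.

```python
import itertools


class ReplicaRouter:
    """Pick a database server for a query (hypothetical sketch)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Rotate through replicas round-robin; None when there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql, allow_replica=False):
        """Return the server that should run this statement.

        A replica is used only when the caller opts in AND the statement
        is a SELECT, because replicas can lag behind the primary.
        """
        is_read = sql.lstrip().upper().startswith("SELECT")
        if allow_replica and is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

The opt-in flag is the design point: only the code issuing the query knows whether slightly stale results are acceptable, which is why code written with replication in mind tends to port more cleanly.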
Q: What is the latest Pressflow work?
A: Ongoing Pressflow development focuses on two classes of problems:
(1) Benchmarks show Drupal 7 being considerably slower than Drupal 6. This is a much larger hit than the one we had from Drupal 5 to 6. The Drupal core team (including us) has placed a high priority on remedial work here, but it’s unlikely Drupal 7 will close the gap. Because the overhead is largely on the PHP side, Pressflow is exploring ways to accelerate common functions by offloading select parts of core to Java (which blows away PHP + APC on a modern Java VM) and performing expensive page assembly and caching operations with systems like Varnish’s ESI and nginx’s SSI.
(2) The sites running Drupal and Pressflow are bigger than ever. Certain components that ran well on a four-server cluster with heavy traffic cannot survive on a 30-server cluster with massive traffic. We’re solving these problems with decentralization. For example, the menu system replacement that is landing in Pressflow 6 generates, caches, and uses menu data locally on each web server. That takes menu operations from one of the biggest cluster bottlenecks (though it is a bit better in Drupal 7) to a component that can scale almost perfectly horizontally. We’re also working on distributed, multi-tier caching strategies that already show a 3–7x increase in cache read performance versus using memcached on localhost. The improvement is even greater versus non-loopback access to other memcached instances.
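As a rough illustration of the multi-tier caching idea, the Python sketch below puts a short-lived, per-server cache in front of a shared one, so that most reads never leave the web server process. It is an assumption-laden toy, not Pressflow's implementation: the `TwoTierCache` name, the `local_ttl` parameter, and the use of a plain dict as a stand-in for a memcached client are all invented for the example.

```python
import time


class TwoTierCache:
    """Local in-process cache layered over a shared cache (sketch)."""

    def __init__(self, shared, local_ttl=5.0, clock=time.monotonic):
        # shared: stand-in for a memcached client; here any dict-like
        # object that supports .get(key) and item assignment.
        self.shared = shared
        self.local_ttl = local_ttl
        self.clock = clock
        self._local = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self.clock()
        entry = self._local.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # local hit: no network round trip
        # Local miss or expired entry: fall back to the shared tier.
        value = self.shared.get(key)
        if value is not None:
            self._local[key] = (now + self.local_ttl, value)
        return value

    def set(self, key, value):
        # Write through to the shared tier and refresh the local copy.
        self.shared[key] = value
        self._local[key] = (self.clock() + self.local_ttl, value)
```

The trade-off is staleness: a server may keep serving its local copy for up to `local_ttl` seconds after another server changes the shared value, so this pattern suits data such as menus that change rarely.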
As with our existing development, improvements in Pressflow will be candidates for future Drupal releases. And, conversely, future improvements for Drupal will be candidates for back-porting to Pressflow.
We also have clients sponsoring work on high-availability measures, including built-in database connection monitoring and failover. While Drupal 7 gains native database replication support, Drupal will continue to require expensive and complex approaches to achieve the same failover capability that Pressflow will have built in.
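To show what connection monitoring and failover might look like in the simplest case, here is a hedged Python sketch: servers are tried in priority order, and one that fails to connect is skipped for a cooling-off period before being retried. The details of Pressflow's planned feature are not described in the interview, so the `FailoverPool` class, the `retry_after` parameter, and the connector callables are purely hypothetical.

```python
import time


class FailoverPool:
    """Try database servers in order, skipping recently failed ones."""

    def __init__(self, connectors, retry_after=30.0, clock=time.monotonic):
        # connectors: list of (name, callable) pairs in priority order.
        # Each callable returns a connection or raises ConnectionError.
        self.connectors = connectors
        self.retry_after = retry_after
        self.clock = clock
        self._down_until = {}  # name -> time before which we skip it

    def connect(self):
        """Return (name, connection) for the first server that answers."""
        now = self.clock()
        last_error = None
        for name, connector in self.connectors:
            if self._down_until.get(name, 0.0) > now:
                continue  # still marked down; do not retry yet
            try:
                return name, connector()
            except ConnectionError as exc:
                # Mark the server down so later requests skip it quickly.
                self._down_until[name] = now + self.retry_after
                last_error = exc
        raise ConnectionError("no database server available") from last_error
```

Remembering failures matters on a busy site: without it, every request would wait on the dead server's connection attempt before falling through to a healthy one.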
Q: How soon after Drupal 7’s official release will Pressflow 7 be coming?
A: It shouldn’t be more than a few weeks; porting the missing Pressflow 6 features should not be difficult. We are not beginning Pressflow 7 work until Drupal 7 goes gold.
Q: Where is the best place to download Pressflow?
A: Visit Pressflow.org — there will always be direct-download links from that page. Because large, complex projects use Pressflow, branching using Bazaar (a version-control tool) is a popular way to maintain local changes and apply updates. Project Mercury from Chapter Three integrates a self-updating and configured copy of Pressflow into an Amazon EC2 image (AMI). We’re also working with some higher education institutions to provide their students, staff, and faculty with managed, one-click Pressflow installations, but those won’t launch for a few months.