Development Seed

Blog

Litenode

Say Goodbye to Slow Node Loads and Hundreds of Queries

I recently posted on how I used tactical design strategies to speed up the theme layer of our internal team intranet. We also spent considerable time tuning the queries on each page to push performance on the DB side of Drupal.

The largest source of queries on a page is almost always node_load(). It can push your query count very high, since each node has multiple queries run. That means that there tends to be a nearly linear relationship between the number of nodes on the page and the number of queries. For example, a blog view on a previous version of our intranet would clock in at around 300 queries with 10 nodes.

Introducing litenode

Litenode is a new module that uses Views to replace multiple node_loads() on a page.

The concept is simple: Views already knows how to grab any of the fields in a single go that many modules load independently in their hook_nodeapi(‘load’). Why not have Views grab them all at once (as you would do in a table view, for example) and then have a style plugin map the Views fields to all the places where modules expect them to be on a loaded node?

A regular node load: 1. Node module loads title, teaser, body, revisions, etc. 2. Node module loads user object, picture, etc. 3. Comment module loads comment count, last comment time, etc. 4. Taxonomy module loads taxonomy terms 5. Upload module loads files 6. CCK loads its fields 7. Any other modules you have enabled do their thing

~5-10 queries per node 10 queries x 10 nodes = 100 queries

A litenode load: 1. views loads title, teaser, body, revisions, user name, user picture, uid, status, comment, ... (CCK fields) ... in a single query for all 10 nodes 2. If enabled, litenode runs a single taxonomy query to grab all taxonomy terms for all 10 nodes 3. If enabled, litenode runs a single query to grab all uploaded files for all 10 nodes

3 queries for all 10 nodes

Some results after testing on a very basic Drupal install:

1341 nodes, no query caching. Three very simple views, 20 nodes on a page, pager, teaser view, no filters. Query execution times in ms of all queries on page (simple garland, no extra blocks):

Common node teaser list 108 107.45 103.16 110.74 107.9 Litenode 34.26 32.94 30.69 34.63 34.79 Litenode cullened 42.56 43.24 46.36 45.99 48.57

Caveats

Of course there are caveats.

Own logic stack separate from nodeapi

First and foremost, litenode has its own logic for assembling nodes — but it is actually quite condensed. You can implement hook_litenode() to map views fields to node properties so that you could, for example, add the $node->og_groups to litenode nodes through Views (I have already added this integration).

Limitations in faking the node object

As noted above, taxonomy and upload don’t quite map the way other views fields do. This is because there is a many to one relationship between terms/nodes and files/nodes. However, once the initial list of nids is retrieved we can easily collect all the terms with a single query rather than querying per node. Similarly, currently only single value CCK fields are rendered correctly. I will have to look into adding support for multivalue fields in the future.

Big sets make for slow joins

The last and most interesting problem is that while in raw speed the litenode query time is much faster than the regular node load view on small sets, on large sets (we have 13,000+ nodes on our intranet) the total query time can become slower. As the set you query increases in size, the several joins that Views makes to retrieve all the litenode fields slow down.

Room for more advanced experiementation

I experimented briefly with query cullening (named after Tim Cullen who first introduced me to this technique). A cullened query is one where the base table has been replaced by a subselect to limit the result set of the query before any joins are performed. You can get significant speed gains on large tables using this technique when the conditions of the subselect reduce the result set considerably. For example, a cullened query against the node table with a WHERE on the node type may reduce the initial set of rows (before joining to other tables) from 13,000 to 1,000 because you only care about blog nodes.

Using the powerful and flexible hooks provided by Views 2 for dynamically rewriting queries (see hook_views_pre_build() and hook_views_query_alter()) we can programmatically move any WHERE and ORDER BY clauses against the base table into a subselect. Much of this code is highly experimental, however, so query cullening is provided as an option in litenode and is disabled by default.

Looking to Drupal 7

For those following some of the latest developments in HEAD, catch did some extensive work to start refactoring node_load() to improve efficiency for multiple node loads, essentially allowing modules to query against a set of nids rather than against a single nid. Until there is a Views-like query builder in core, this will probably be as good as it gets: http://drupal.org/node/324313.

You can learn more about litenode on its project page at code.developmentseed.org

Very good, thanks.

Very good, thanks.

The “cullened” bar on the

The “cullened” bar on the graph seems to be longer than the non-cullened one…

Assuming I’m understanding it correctly, it does sound like an anti-pattern, though. Are you saying instead of…

SELECT * FROM {node} n LEFT JOIN {something} s ON n.nid = s.nid WHERE f.type = 'page'

…we should do…

SELECT * FROM (SELECT * FROM {node} WHERE type = 'page') f LEFT JOIN {something} s ON n.nid = s.nid

…? Surely in the first query the WHERE filtering happens before the joins do, whereas the second requires two queries… Heck, just run an EXPLAIN on two queries like those; the results of the first are a lot prettier than that of the second.

Some clarifications

The largest performance gains happen when you can achieve the following things from your derived table:

1. Significant reduction of rows from the early application of the WHERE clause. You won’t see real gains here until you can reduce the rows in the derived table by a factor of 1000.

2. A reduction in columns returned in the derived table.

In both situations you’re generating a smaller memory footprint.

This technique also will yield larger gains when the initial query requires any of the following:

1. LEFT JOINs (which by nature create larger tables before the where clause is applied)

2. ORDER BYs (using a derived table allows mysql to re-order a much smaller table)

I’m not sure what query Young used in his benchmarking or whether he used SQL_NO_CACHE. I’m also not sure what mysql memory allocation configurations he was using.

Bottom line: this technique is not a panacea, but there are situations where it can yield some significant gains. As always, test, test and re-test.

great stuff

This is great stuff, guys. I will definitely keep this in mind for future performance tuning needs.

Credit where credit is due

While it’s very flattering to have a technique named after me, I would be remiss if I didn’t point out the excellent MySQL Performance Blog which has always been a great source for learning new tricks.

I discovered the particular technique used above while reading this Using delayed JOIN to optimize count(*) and LIMIT queries and the thoughtful comments that follow. The post itself talks about how the technique benefits COUNT and LIMIT queries (especially those with high offsets), but the principle of using well-targeted derived tables has broader applicability to any situation where JOINs are used on large tables.

This sounds very

This sounds very interesting.

Also, in Drupal 7 it should be possible to add additional fields or JOINs to node_load and node_load_multiple via hook_query_alter. So D7, with hook_query_alter, you should be able to achieve what you did with litenode without having to fake the node loading and object assembling process. This should be possible as of current HEAD I think. A query tag in node_load_multiple would be a nice and straightforward addition to make the process easier.