I recently posted on how I used tactical design strategies to speed up the theme layer of our internal team intranet. We also spent considerable time tuning the queries on each page to push performance on the DB side of Drupal.
The largest source of queries on a page is almost always node_load(). It can push your query count very high, since each node has multiple queries run. That means that there tends to be a nearly linear relationship between the number of nodes on the page and the number of queries. For example, a blog view on a previous version of our intranet would clock in at around 300 queries with 10 nodes.
Introducing litenode
Litenode is a new module that uses Views to replace multiple node_loads() on a page.
The concept is simple: Views already knows how to grab any of the fields in a single go that many modules load independently in their hook_nodeapi(‘load’). Why not have Views grab them all at once (as you would do in a table view, for example) and then have a style plugin map the Views fields to all the places where modules expect them to be on a loaded node?
A regular node load:
1. Node module loads title, teaser, body, revisions, etc.
2. Node module loads user object, picture, etc.
3. Comment module loads comment count, last comment time, etc.
4. Taxonomy module loads taxonomy terms
5. Upload module loads files
6. CCK loads its fields
7. Any other modules you have enabled do their thing
~5-10 queries per node
10 queries x 10 nodes = 100 queries
A litenode load:
1. views loads title, teaser, body, revisions, user name, user picture, uid, status, comment, … (CCK fields) … in a single query for all 10 nodes
2. If enabled, litenode runs a single taxonomy query to grab all taxonomy terms for all 10 nodes
3. If enabled, litenode runs a single query to grab all uploaded files for all 10 nodes
3 queries for all 10 nodes
Some results after testing on a very basic Drupal install:
1341 nodes, no query caching. Three very simple views, 20 nodes on a page, pager, teaser view, no filters. Query execution times in ms of all queries on page (simple garland, no extra blocks):
| Common node teaser list | 108 | 107.45 | 103.16 | 110.74 | 107.9 |
| Litenode | 34.26 | 32.94 | 30.69 | 34.63 | 34.79 |
| Litenode cullened | 42.56 | 43.24 | 46.36 | 45.99 | 48.57 |
Caveats
Of course there are caveats.
Own logic stack separate from nodeapi
First and foremost, litenode has its own logic for assembling nodes — but it is actually quite condensed. You can implement hook_litenode() to map views fields to node properties so that you could, for example, add the $node→og_groups to litenode nodes through Views (I have already added this integration).
Limitations in faking the node object
As noted above, taxonomy and upload don’t quite map the way other views fields do. This is because there is a many to one relationship between terms/nodes and files/nodes. However, once the initial list of nids is retrieved we can easily collect all the terms with a single query rather than querying per node. Similarly, currently only single value CCK fields are rendered correctly. I will have to look into adding support for multivalue fields in the future.
Big sets make for slow joins
The last and most interesting problem is that while in raw speed the litenode query time is much faster than the regular node load view on small sets, on large sets (we have 13,000+ nodes on our intranet) the total query time can become slower. As the set you query increases in size, the several joins that Views makes to retrieve all the litenode fields slow down.
Room for more advanced experiementation
I experimented briefly with query cullening (named after Tim Cullen who first introduced me to this technique). A cullened query is one where the base table has been replaced by a subselect to limit the result set of the query before any joins are performed. You can get significant speed gains on large tables using this technique when the conditions of the subselect reduce the result set considerably. For example, a cullened query against the node table with a WHERE on the node type may reduce the initial set of rows (before joining to other tables) from 13,000 to 1,000 because you only care about blog nodes.
Using the powerful and flexible hooks provided by Views 2 for dynamically rewriting queries (see hook_views_pre_build() and hook_views_query_alter()) we can programmatically move any WHERE and ORDER BY clauses against the base table into a subselect. Much of this code is highly experimental, however, so query cullening is provided as an option in litenode and is disabled by default.
Looking to Drupal 7
For those following some of the latest developments in HEAD, catch did some extensive work to start refactoring node_load() to improve efficiency for multiple node loads, essentially allowing modules to query against a set of nids rather than against a single nid. Until there is a Views-like query builder in core, this will probably be as good as it gets: http://drupal.org/node/324313.
You can learn more about litenode on its project page at code.developmentseed.org