some easy optimizations; or, reading the rails source for fun and profit
Courtenay : November 18th, 2006
Here are a few 'rails' ways to optimize some of the queries in your application. Remember: optimize last!
Say you have a topic containing many posts, or a category with many articles, or a user with many orders,
and you want to be able to show a summary something like this:
" just evals the string. If you were to hand-code this, it'd look something like:
| topic | last post | number of posts |
|------------------|-------------|-----------------|
| Monkeys are cool | #12 by Joe | 25 |
| I like bananas | #33 by Fred | 8 |
|------------------|-------------|-----------------|
The data model for this schema looks something like
class Topic < ActiveRecord::Base
has_many :posts
end
The above table is going to be relatively expensive in terms of database queries, since we need to perform a number of actions
# load all the topics
# load the posts, find the last one
# count the posts
Luckily there are a number of optimizations, some built into rails.
h2. the counter cache
You're probably familiar with the counter cache, but here's a refresher.
First, add a 'posts_count' integer field to Topic; then in the post model, we set up
# post.rb
class Post < ActiveRecord::Base
belongs_to :topic, :counter_cache => true
end
Counter cache is quite interesting, in that it saves the number (count) of associated records so you don't have to look it up.
Taking a look at the source to 'belongs_to' reveals some interesting metaprogramming. I've added some comments inline.
# File vendor/rails/activerecord/lib/active_record/associations.rb, line 707
707: if options[:counter_cache]
# Sets up the count column. You can specify your own, or have it automagic.
708: cache_column = options[:counter_cache] == true ?
709: "#{self.to_s.underscore.pluralize}_count" :
710: options[:counter_cache]
711:
# Create an after_create filter on the model that automatically increments the counter on the other table
712: module_eval(
713: "after_create '#{reflection.name}.class.increment_counter(\"#{cache_column}\", #{reflection.primary_key_name})" +
714: " unless #{reflection.name}.nil?'"
715: )
716:
# Set up a before_destroy filter so that we decrement the associated counter, too.
717: module_eval(
718: "before_destroy '#{reflection.name}.class.decrement_counter(\"#{cache_column}\", #{reflection.primary_key_name})" +
719: " unless #{reflection.name}.nil?'"
720: )
721: end
Remember, calling "before_destroy
class Post < ActiveRecord::Base
belongs_to :topic
after_create :increment_topic_counter
def after_create
Topic.increment_counter('topic_count', topic_id)
end
end
For kicks, look up the source to increment_counter and you'll find it does a quick update_all, which is my favorite way of executing some fast-running SQL.
# File vendor/rails/activerecord/lib/active_record/base.rb, line 525
525: def increment_counter(counter_name, id)
526: update_all "#{counter_name} = #{counter_name} + 1", "#{primary_key} = #{quote_value(id)}"
527: end
So, what does this mean? Firstly, the after_create / before_destroy callbacks are a powerful way to implement association-based caching. This is particularly useful when you are read-optimizing (more reads, less writes). It slows down the write process (one extra query on inserts) but speeds up the reads.
Secondly, metaprogramming with module_eval is a good way to hide common functionality.
h2. extending the counter cache idea
To retrieve the last post in a topic, as in the table above, there are a number of methods.
# Eager loading.
Topic.find(:all, :include => :posts)
This will load all of the post objects as well as topics. This is fine for a small forum, but definitely won't scale. You could write some custom sql, eager loading just the last post, but that will be difficult to maintain (who wants to get their mitts on someone else's sql!?)
# Load the posts and group-by.
Post.find(:all, :group => 'topic_id', :order => 'created_at desc', :include => 'post')
This is a fun query, since it'll probably only work on mysql, (hah! eat it postgres!) -- we're flipping around the semantics of the query, performing a find on posts instead of topics, with group-by on the foreign key, and eager loading. Empty topics won't be loaded. If you want to load topics too, you can start joining. (yuk, again)
# Extend counter cache
Add a 'last_post_id' to the topic model, and ..
class Post < ActiveRecord::Base
belongs_to :topic, :counter_cache => true
def after_create
topic.update_attribute(:last_post_id, id)
end
end
class Topic < ActiveRecord::Base
has_many :posts
has_one :last_post
end
Now when you Topic.find(:all, :include => :last_post) and you'll have that last post without any extra queries.
Note: before_destroy requires some trickiness, because you'll need to find the most recent post before the current one; and you'll also need to check whether the current post is the last one attached to the topic. acts_as_list is a good candidate here.
Sounds like a good idea for a plugin which extends has_many/belongs_to!
5 Responses to “some easy optimizations; or, reading the rails source for fun and profit”
Sorry, comments are closed for this article.
November 18th, 2006 at 12:40 PM
It’s a nice concept, but it does start to break down rapidly.
Consider the basic moderator action of splitting a handful of posts from an existing topic into a new topic, or just ‘hiding’ a post so other moderators can review it before deciding on the deletion.
At that point, you’re doing so many things manually with the last post id, that you’re not really gaining anything from using the counter_cache.
November 18th, 2006 at 01:59 PM
who moves individual posts to a different topic?!
if you have a staging pipeline you’d modify the custom counter_cache to increment/decrement only if the post is ‘public’; alternatively specify has_many :public_posts’ with :conditions
November 18th, 2006 at 05:27 PM
We turned counter caching off in Typo because we were getting killed by complexity as a result of it (and yes, the idea of Typo getting killed by complexity is ironic, thank you for pointing that out). I can’t remember the exact issues, but how things behave in the face of published/unpublished distinctions spring to mind.
November 21st, 2006 at 07:51 PM
I always like reading the Caboose posts. But is there any way you could make them more printable?
April 27th, 2007 at 02:33 PM
Great tips for newbies, thanks…