new plugin: acts_as_git
Courtenay : November 14th, 2008
With the help of Jamie van Dyke at Parfait and Scott Chacon at GitHub, I'm pleased to announce Acts As Git (no, I don't like the name either). It's a simple plugin which stores all changes you make to a text field in a git repository. This is ideal for something like a git-backed wiki.
Look at it here: github or check it out from
git://github.com/courtenay/acts_like_git.git
From the README:
ALG automagically saves the history of a given text or string field. It sits over the top of an ActiveRecord model; after a value is committed to the database, the plugin writes the new value to a text file and commits it to a git repository. This way you get all the advantages of using Git as version-control.
Usage:
class Post < ActiveRecord::Base
versioning(:title) do |version|
version.repository = '/home/git/repositories/postal.git'
version.message = lambda { |post| "Committed by #{post.author.name}" }
end
end
To view the complete list of changes:
>> @post = Post.find 15
<Post:15>
>> @post.title
=> 'Freddy'
>> @post.history(:title)
=> ['Joe', 'Frank', 'Freddy']
>> @post.log
=> ['bfec2f69e270d2d02de4e8c7a4eb2bd0f132bdbb', '643deb45c12982dde75ba71657792a2dbdda83e6',
'1ce6c7368219db7698f4acc3417e656510b4138d']
>> @post.revert_to '1ce6c7368219db7698f4acc3417e656510b4138d'
>> @post.title
=> 'Joe'
It uses the excellent Grit library, and doesn't actually have a checked-out repository. The latest version of your data is still stored in the database. You can actually clone this repo and view the changes; pushing back to it won't do anything useful.
Plugin configuration style?
Courtenay : November 10th, 2008
I’m putting the final touches on a super-sweet versioning plugin, and I’ve discovered that we’re using several different metaphors for configuring the plugin options. I’d like to get some opinions/feedback on your preferred style.
The DSL
Using a DSL and passing in blocks, which get instance-evalled. I’m normally very scathing of DSLs; I think that they’re Yet Another Language for people to learn to use – it’s usually your very own write-only syntax – but it’s been super-fun implementing the backend to this.
class Monkey < ActiveRecord::Base
  versioning do
    author do
      name { user.current.name }
      message { "Committed via #{name}" }
    end
    repository "Joe's DataStore"
  end
end
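For anyone wondering how the block style can be wired up, here is a minimal plain-Ruby sketch of an instance-evalled configuration block. It is not the plugin's actual backend: the VersioningConfig class, its build method, and the settings hash are all hypothetical names made up for illustration.

```ruby
# Hypothetical builder (not taken from the plugin): collects settings
# from a configuration block that is instance-evalled against it.
class VersioningConfig
  attr_reader :settings

  def initialize
    @settings = {}
  end

  # Called as `repository "name"` inside the block.
  def repository(name)
    @settings[:repository] = name
  end

  # Called as `message { ... }`; the block is stored and run later.
  def message(&block)
    @settings[:message] = block
  end

  # Evaluates the block with a fresh config object as `self`,
  # so bare calls like `repository` land on the builder.
  def self.build(&block)
    config = new
    config.instance_eval(&block)
    config
  end
end

config = VersioningConfig.build do
  repository "Joe's DataStore"
  message { |name| "Committed via #{name}" }
end

puts config.settings[:repository]
puts config.settings[:message].call("Courtenay")
```

The trade-off is the usual one with instance_eval: inside the block, self is the builder, so methods and instance variables of the surrounding class are not reachable without extra plumbing.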
Hashes
This seems to be the Rails plugin default:
class Monkey < ActiveRecord::Base
versioning :author => { :name => lambda{ |u| user.current.name } }, :repository => "Joe's DataStore"
end
Class vars / methods
Easy to monkeypatch later
class Monkey < ActiveRecord::Base
will_version
@@version_repository = "Joe's DataStore"
def version_author
current_name
end
end
Are there others? Which do you prefer? Currently I’m using all three in this one plugin, and it’s very un-awesome.
Ripping out your mocks
Courtenay : November 6th, 2008
I sat down with David Chelimsky at Rubyconf today to talk about rSpec and an interesting topic came up.
In my mind, there are two reasons to use a mock object: first, when you’re developing TDD style, you physically don’t have the objects yet; and second, so that you can tightly focus your unit tests. Maybe these two different purposes should use different mechanisms.
His question to me then was, “Do you replace your mocks with the real objects after you’ve implemented those objects?”. I guess I hadn’t thought about that before. Do you? If so, how do you handle the extra complexity, maintaining sane associations and valid data?
On hiring Rubyists and Railsers
Courtenay : November 4th, 2008
We’re launching a new service at work in the next week or so that involves me looking through a lot of job applications: resumes and sample code.
I’d like to tell people right now, upfront, if you’re applying for a Ruby or Rails job, for anyone, there are a few ways of ensuring you get called back. They’re probably fairly simple.
Send some sample code, maybe a link to a project on Github, or a snippet of work you’ve done. Make sure you send the tests for the code. Any tests would be good, and you get bonus points for good tests. If you don’t have any tests, write them.
Don’t worry too much about sending some crazy complex code. Maybe some polymorphic associations (models), some ajax (views), a knowledge of the whole stack (simple controllers), some nested resources. Write a simple todo list application.
It’s not just a silly philosophy. Writing tests – hell, submitting tests with your job application’s code – shows that you’ve actually thought about the code, and that it actually works. You’ve worked through the permutations of the logic and thought about the various ramifications of the design decisions in the code itself.
Just the pure act of sending tests with your sample code will put you above 90% of applicants, I promise.
We've stopped using rSpec ...
Courtenay : November 3rd, 2008
...for new projects.
We upgraded the gems for one of our client projects, and the auto-loading / config.gems managed to completely break all our other projects, requiring upgrades, which caused weird breakages in weird places in some of the specs.
The app would refuse to deploy (rake tmp:create failed, because lib/tasks/rspec.rake was being loaded, and spec wasn't installed on the server). The annoying thing was that just having whatever.11 installed (I don't know the exact version) broke older apps on whatever.4 or whatever.0.2, so those had to be upgraded too. We wasted a day or two (three, maybe four developers), which equates to several thousand dollars in wastage. It was also really infuriating -- the culmination of a few years of frustration with rSpec's weirdnesses.
After that, I found that some of the specs had never run (who knows why). It stopped reading spec.opts and started doing some weirdness with pending options. Finally, Rick just snapped, threw out rSpec and his Model Stubbing library, and now we're playing with a combination of rr, context, and matchy, trying to get a feel for a decent workflow again. It's sad and maybe a bit exciting to be on the edge.
What are you testing with?
A simple Rails slow-query logger
Courtenay : September 29th, 2008
A few years ago I wrote a simple addition to ActiveRecord that does two things: it chops out the eager loading "t1_t2 AS foo", and it shows the number of records returned for every query you run against the database. You can view the file here
Today I was profiling a site and wanted to quickly find the slow database queries, but didn't have access to mysql's config directly, so I patched that file above to record all queries over 500ms and save it to a log file. I'll warn you now, it ain't pretty, but it works pretty well.
Here's how it works: First, throw this in a file in config/initializers. I open up the rails abstract adapter
module ActiveRecord
module ConnectionAdapters
class AbstractAdapter
And add in a new logger.
def slow_query; 0.5; end # number of seconds
def slow_query_logger
@slow_query_logger ||= Logger.new("log/slow_queries.log")
end
Ideally of course this would all be configurable.
Next, I copy the logging code out of the latest ActiveRecord, and patch it to return the number of records. This is a bit of a hack, too, but we can either look at "num_rows" from the resultset or the actual size of an array.
s = result && (result.respond_to?(:num_rows) ? result.num_rows : \
(result.respond_to?(:size) ? result.size : 0)) || 0
Finally, I rewrite the actual log method so that it checks the benchmark against our threshold
def log_info(sql, name, runtime, result_size = 0)
if runtime > slow_query && slow_query_logger
slow_query_logger.debug "Slow query: (#{runtime}) [#{result_size}] #{sql}"
end
And add the number of results to the regular rails log, while snipping out the annoying eager-loading code.
if @logger && @logger.debug?
if name =~ /Load Including Associations$/
sql = sql.scan(/SELECT /).to_s + ' ...<snip>... ' + sql.scan(/(FROM .*)$/).to_s
end
name = "#{name.nil? ? "SQL" : name} (#{sprintf("%f", runtime)}) [#{result_size.to_i}]"
@logger.debug format_log_entry(name, sql.squeeze(' '))
end
end
Here's the full file.
module ActiveRecord
  module ConnectionAdapters # :nodoc:
    class AbstractAdapter
      protected
      # todo: config this
      def slow_query; 0.5; end

      def slow_query_logger
        @slow_query_logger ||= Logger.new("log/slow_queries.log")
      end

      alias_method :old_log, :log
      def log(sql, name, &block)
        if block_given?
          #if @logger and @logger.level <= Logger::INFO
          result = nil
          seconds = Benchmark.realtime { result = yield }
          @runtime += seconds
          s = result && (result.respond_to?(:num_rows) ? result.num_rows : \
            (result.respond_to?(:size) ? result.size : 0)) || 0
          log_info(sql, name, seconds, s)
          return result
          #end
        else
          log_info(sql, name, 0, 0)
          nil
        end
        # old_log(sql, name) { yield }
      rescue Exception => e
        @last_verification = 0
        message = "#{e.class.name}: #{e.message}: #{sql}"
        log_info(message, name, 0)
        raise ActiveRecord::StatementInvalid, message
      end

      alias_method :old_log_info, :log_info
      def log_info(sql, name, runtime, result_size = 0)
        if runtime > slow_query && slow_query_logger
          slow_query_logger.debug "Slow query: (#{runtime}) [#{result_size}] #{sql}"
        end
        if @logger && @logger.debug?
          if name =~ /Load Including Associations$/
            sql = sql.scan(/SELECT /).to_s + ' ...<snip>... ' + sql.scan(/(FROM .*)$/).to_s
          end
          name = "#{name.nil? ? "SQL" : name} (#{sprintf("%f", runtime)}) [#{result_size.to_i}]"
          @logger.debug format_log_entry(name, sql.squeeze(' '))
        end
      end
    end
  end
end
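The threshold idea can also be tried outside Rails. The following is a rough standalone sketch, not part of the patch above: SlowCallLogger and its measure method are hypothetical names, and the block passed to measure stands in for the real database call. It times the block with Benchmark.realtime, works out a result size the same way as above, and only writes to the log when the call is slower than the threshold.

```ruby
require 'benchmark'
require 'logger'
require 'stringio'

# Hypothetical helper (not from the post): wraps any block, times it,
# and logs it only when it runs longer than the threshold in seconds.
class SlowCallLogger
  def initialize(io, threshold = 0.5)
    @logger = Logger.new(io)
    @threshold = threshold
  end

  # Runs the block and returns its result unchanged.
  def measure(label)
    result = nil
    seconds = Benchmark.realtime { result = yield }
    # Same fallback chain as the patch: num_rows, then size, then 0.
    size = if result.respond_to?(:num_rows) then result.num_rows
           elsif result.respond_to?(:size) then result.size
           else 0
           end
    if seconds > @threshold
      @logger.debug "Slow query: (#{seconds}) [#{size}] #{label}"
    end
    result
  end
end

output = StringIO.new
timer = SlowCallLogger.new(output, 0.05)
timer.measure("SELECT * FROM videos") { sleep 0.1; [1, 2, 3] } # logged
timer.measure("SELECT 1") { [1] }                              # too fast to log
puts output.string
```

Passing the IO object in, rather than hard-coding "log/slow_queries.log", is one way to get the configurability mentioned earlier.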
Would this work as a plugin? As a patch to Rails itself? Or did somebody else already implement a cross-platform slow query logger?
The awesomest filter and sort ever
Courtenay : August 26th, 2008
Update 2: seems like only one or two people knew about what can_search does :) I hope we’re all a little better educated.
Update: yes, I’m using these named scopes throughout the app in other places – they aren’t used only in this one controller.
Often you have an index action where you want to sort records, filter by a parameter, and maybe join on some other tables to get a result.
Let’s say you’re looking at a videos controller (where videos are acts_as_taggable) and you want to filter by user_id, filter by tag name, order by video title, or rating.
Maybe later, you’ll add a roles (hm:t) association and need to only show videos viewable by a certain user. How complex!
To solve this, we’re going to play with some things you may know, and finish up with a bam! pow! that’ll take your breath away.
Rather than build up some form of frankenquery with all sorts of conditionals and cases, joins, and other messing about, let’s use a brand-new bleeding edge feature of Rails: named scopes.
First, build up individual named scopes for each axis on which you wish to filter. Make sure and put the table name in that query.
named_scope :by_user, lambda { |user_id|
{ :conditions => ['videos.user_id = ?', user_id] }
}
named_scope :tag_name, lambda { |tag_name|
  { :joins => { :taggable => :tag },
    :conditions => ['tags.name = ?', tag_name] }
}
named_scope :rating, lambda { |rating|
{ :conditions => ['ratings_count > ?', rating] }
}
OK, I cheated on the last one, but let’s assume you have a counter_cache on ratings count.
Now, if you have more than one scope with joins in it, you’ll need to apply this patch to your rails installation, or upgrade past 2.1.1. This will allow you to have as many joins as you like in your scopes.
Now, here’s where the magic happens: in the controller. Big shout out to protocool for this method.
Let’s build up a set of all the possible scopes that we might want to use, in an array form like [ named_scope, argument ]
def index
scopes = []
scopes << [ :by_user, params[:user_id] ] if params[:user_id]
scopes << [ :tag_name, params[:tag_name] ] if params[:tag_name]
scopes << [ :rating, params[:rating] ] if params[:rating]
end
Easy, right? Very readable.
How about some ordering?
order = { 'name' => 'videos.name ASC' }[params[:order]] || 'videos.id DESC'
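The hash lookup doubles as a whitelist: the request parameter is only ever used as a key, so nothing the user types ends up in the SQL. Here is a small standalone sketch of that idea. The ORDERINGS constant, the order_clause helper, and the 'rating' entry are assumptions added for illustration.

```ruby
# Hypothetical whitelist: request value => SQL ORDER BY fragment.
# Only the 'name' entry comes from the post; 'rating' is an assumed extra.
ORDERINGS = {
  'name'   => 'videos.name ASC',
  'rating' => 'videos.ratings_count DESC'
}.freeze

DEFAULT_ORDER = 'videos.id DESC'

# Unknown or missing values fall back to the default ordering.
def order_clause(param)
  ORDERINGS.fetch(param, DEFAULT_ORDER)
end

puts order_clause('name')                  # videos.name ASC
puts order_clause('id; DROP TABLE videos') # videos.id DESC
puts order_clause(nil)                     # videos.id DESC
```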
Now, as you know, you can chain named scopes. So you could say Video.by_user(2).tag_name('monkeys') Let's take advantage of this, building up a chain of scopes dynamically using 'inject', starting from Video, and adding each scope we added to the array above. This is really fun magic, because it doesn't run any of the queries until the whole thing is built. I don't even know how this works, but it does. Swimmingly.
@videos = scopes.inject(Video) {|m,v| m.scopes[v[0]].call(m, v[1]) }.paginate(:all, :order => order)
The final method looks like this:
def index
scopes = []
scopes << [ :by_user, params[:user_id] ] if params[:user_id]
scopes << [ :tag_name, params[:tag_name] ] if params[:tag_name]
scopes << [ :rating, params[:rating] ] if params[:rating]
order = { 'name' => 'videos.name ASC' }[params[:order]] || 'videos.id DESC'
@videos = scopes.inject(Video) {|m,v| m.scopes[v[0]].call(m, v[1]) }.paginate(:all, :order => order, :page => params[:page])
end
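If the inject call looks like magic, it may help to see the same pattern without Rails. This is only a sketch under assumptions: VideoQuery is a hypothetical stand-in for the model, its scope methods just collect SQL fragments, and apply_scopes is a made-up helper name. Each step returns a new chainable object and nothing is "run" until to_sql is called at the end, which is roughly the lazy behaviour described above.

```ruby
# Hypothetical stand-in for a model with chainable scopes. Each scope
# returns a new object, so calls can be chained in any order.
class VideoQuery
  attr_reader :filters

  def initialize(filters = [])
    @filters = filters
  end

  def by_user(user_id)
    VideoQuery.new(filters + ["videos.user_id = #{user_id.to_i}"])
  end

  def rating(minimum)
    VideoQuery.new(filters + ["ratings_count > #{minimum.to_i}"])
  end

  # Builds the SQL string only when asked.
  def to_sql
    where = filters.empty? ? "" : " WHERE " + filters.join(" AND ")
    "SELECT * FROM videos#{where}"
  end
end

# Folds a list of [scope_name, argument] pairs onto a starting object.
def apply_scopes(base, scopes)
  scopes.inject(base) { |chain, (name, arg)| chain.public_send(name, arg) }
end

params = { :user_id => "2", :rating => "3" }
scopes = []
scopes << [:by_user, params[:user_id]] if params[:user_id]
scopes << [:rating, params[:rating]] if params[:rating]

puts apply_scopes(VideoQuery.new, scopes).to_sql
# SELECT * FROM videos WHERE videos.user_id = 2 AND ratings_count > 3
```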
One final caveat. Sometimes :joins doesn’t know where to get the video id from, so if you’re using id in your app, you’ll need a slight workaround involving manually getting the pagination count, and forcing :select => ‘distinct videos.*’ in the paginate call.
If this works for you, it’s really easy to add new filtering, ordering, or even scoping to your query. For example, you can add some form of role hackery to your video
named_scope :viewable_by, lambda { |user|
  { :joins => { :permissions => :roles },
    :conditions => [ "roles.user_id = ? AND permissions.role = ?", user.id, "view" ] }
}
In the controller, you replace the first scope definition with this
scopes = [ [ :viewable_by, current_user ] ]
Or, you modify the scope inject statement
@videos = scopes.inject(Video.viewable_by(current_user)) { |m,v| ... }
If you consider this a giant hack, you’re probably at least partly right. However, the alternative in building up a complex query with many possible moving parts is just hideous. And consider this: you can unit test each part of the query on its own, in the model specs.
Sanitize your users' HTML input
Courtenay : August 25th, 2008
The default Rails sanitize helper is actually quite powerful. You can see some of its usage here:
<%= sanitize @article.body, :tags => %w(table tr td), :attributes => %w(id class style) %>
However, as the docs say,
Please note that sanitizing user-provided text does not
guarantee that the resulting markup is valid.
We were having an issue with users providing bad markup and leaving their tags unclosed.
This is <a href="http://foo.com">my dog<a/> and he’s super cool!
We solved it by running Hpricot over their input.
before_save :clean_html
def clean_html
self.body = Hpricot(body).to_html
end
For performance reasons, you should probably run the hpricot and sanitize methods on the way into the database, rather than rendering it in the views, because it’s somewhat slow, and is a calculation that you only need to perform once.
In fact, instead of saving it in a callback, you could overload the accessor like so:
def body=(new_body)
write_attribute :body, Hpricot(new_body).to_html
end
You’ll want to include the ActionView methods from ActionView::Helpers::SanitizeHelper to get ‘sanitize’ available in your model.
data migration tip
Courtenay : August 20th, 2008
I’m tracking all the failures that occur in a model, so the users can easily track and resolve them. The data looks something like this:
Failures table
id | video_id | description
----+----------+------------------------
1 | 5 | Transcoding error
2 | 23 | Bad file type
Videos table
id | name | creator_id
----+----------+------------------------
5 | Kitten | 23
6 | Monkey | 12
23 | Elephant | 23
If we want to search for all failures by creator, we have to do a join on Failures and Video. To make this a little faster, I will denormalize the data a little, by adding a creator_id to failures table, and a callback to the Failure model to set the creator_id field. This is one of the scaling tradeoffs you need to make: slower writes, slower updates, larger table disk size, faster reads and counts (with grouping).
class Failure < ActiveRecord::Base
before_save :denormalize_creator
def denormalize_creator
self.creator_id = video && video.creator_id
end
end
This might have some issues depending on if you’re using #build to generate your Failure object. Regardless..
The temptation (for me, anyways) is to create a migration that looks something like this:
class AddCreatorIdToFailure < ActiveRecord::Migration
  def self.up
    add_column :failures, :creator_id, :integer
    Failure.find(:all).each do |fail|
      fail.update_attribute :creator_id, fail.video.creator_id
    end
  end

  def self.down
    remove_column :failures, :creator_id
  end
end
There are a few things wrong with this method.
1. You’re loading all failure objects into memory, then performing a query on each one.
2. If you have thousands of failures, it’s going to take some time to run. If it gets stopped partway through, you’ll have to comment out that “add_column” line to get it to re-run.
So. Step one, move the update to its own migration. Then, you can re-run it as often as you like.
Step two, make the migration a bit smarter. You can do this either by rewriting it in SQL, or by using something like paginated_each (jfgi).
When you do that, it’s worth throwing some conditions and an include in there. For example,
Failure.paginated_each(:order => "id desc", :conditions => "creator_id IS NULL", :include => :video) do |fail|
fail.update_attribute :creator_id, fail.video.creator_id
end
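The shape of that loop (skip rows that already have a value, work in batches) can be shown without a database. This is a plain-Ruby sketch using the sample rows from the tables above. FailureRow, VideoRow, and backfill_creator_ids are hypothetical names, and the in-memory arrays stand in for the real tables.

```ruby
# Hypothetical in-memory stand-ins for the failures and videos tables.
FailureRow = Struct.new(:id, :video_id, :creator_id)
VideoRow   = Struct.new(:id, :name, :creator_id)

# Fills in creator_id on failures that do not have one yet, a batch at a
# time. Returns the number of rows it changed, so a re-run returns 0.
def backfill_creator_ids(failures, videos, batch_size = 100)
  videos_by_id = videos.each_with_object({}) { |video, index| index[video.id] = video }
  updated = 0
  failures.select { |failure| failure.creator_id.nil? }.each_slice(batch_size) do |batch|
    batch.each do |failure|
      video = videos_by_id[failure.video_id]
      next unless video # leave orphaned failures alone
      failure.creator_id = video.creator_id
      updated += 1
    end
  end
  updated
end

videos = [
  VideoRow.new(5, 'Kitten', 23),
  VideoRow.new(6, 'Monkey', 12),
  VideoRow.new(23, 'Elephant', 23)
]
failures = [FailureRow.new(1, 5, nil), FailureRow.new(2, 23, nil)]

puts backfill_creator_ids(failures, videos, 1) # 2
puts backfill_creator_ids(failures, videos, 1) # 0
```

Because the second run finds nothing left to do, stopping and restarting partway through is harmless, which is the property the real migration wants.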
You can run this migration as many times as you like (it will only query the records it hasn’t updated). Ultimately, though, unless you’re doing polymorphic associations (which makes the join nigh on impossible), it’s going to be 10 – 100x faster (wild guess) doing the update in raw sql. Any takers on the best SQL for this situation?
Since I don't have anything of value to post,
Courtenay : August 15th, 2008
Here’s a video of my cat.
Authenticate like SSO with ActiveResource
Courtenay : July 18th, 2008
When you have multiple Rails applications that don’t share a common database and you want to share the user authentication information – or rather, use one app to provide authentication for another – there are a few options. Here’s how I solved it recently. This is the simplest way I could think of to get this working. I couldn’t find a plugin to do this, so here’s the result of my pdi.
Effectively what we’re doing is separating the user’s data - their profile info, if you like - from the credentials, and moving the latter to ActiveResource. This is something you should do in your own apps. Too frequently we stuff a bunch of data (like full name, phone number) into the user model, because it’s there. A more advanced version of this code might use the ‘profile’ as the resource name, updating the local profile with data from remote, and keeping User as a pure credential model.
Let’s assume we have App A which will act as the authenticator master. Our other application, App B, will still hold a User record, but we’ll override the authenticate method to use ActiveResource. We’ll also store some other fields like username and email, and will grab those each time the user logs in. That way, they can set an auth token in App A and they can login from cookies in app B (provided the cookie domain is shared).
class User < ActiveRecord::Base
  class Auth < ActiveResource::Base
    self.site = "http://app-a.com"
    self.format = :json
    self.element_name = 'user' # this is the name of the resource in your app
  end

  def self.authenticate(login, password)
    Auth.user = login
    Auth.password = password
    # Authenticating against the app will actually 'prove' the login/pass details.
    # We also want the user's details so we can cache them here.
    authed = Auth.find :first, :params => { :login => login }
    return false unless authed
    # Now, pull the data from remote and store it locally.
    user = User.find_or_initialize_by_login(login)
    user.attributes = authed.attributes
    user.save!
    user.activate!
    user
  rescue ActiveResource::ClientError # 406 error -- bad username/password.
    false
  end
end
Interestingly enough, find first actually runs the ‘index’ action, and returns the first record. sigh
Now, in your App A: users_controller, you want to set up a filter in the index like so:
def index
if params[:login]
# for single-sign-on.
@users = User.find(:all, :conditions => { :login => params[:login] })
else
@users = User.paginate(:all, :page => params[:page]) #...
end
respond_to do |format|
format.html
format.json { render :json => @users }
end
end
Do you have a better way of doing this?
All quiet on the Western Front
Courtenay : June 30th, 2008
It’s been a while since I blogged here. Mainly, I think it’s because as I get deep into the daily grind of building other people’s social apps, I no longer feel like any of the code I’m writing is worthy of a post. This is not to say that I don’t love my work, just that maybe the techniques we’re using aren’t that special. (Maybe I’m wrong. Lots of people at Rails Conf came up to me and said they love reading this blog.)
So, readers, what content do you want to see on the Caboose blog moving forward?
Still got rooms at the Jupiter for conference
Courtenay : May 16th, 2008
If you want a discounted room, contact me today, preferably on irc.
Coming to caboose conf? You need to register
Courtenay : May 6th, 2008
Your name needs to be on the list, otherwise you won’t get in.
Also, you’ll need to prove your worth, by submitting a documentation patch to Rails core. Do that, then sign up here: http://register.caboose.org.
Moving everything to github
Courtenay : May 3rd, 2008
This is a quick note to inform you that any plugins or code hosted on *.caboo.se will be moving to GitHub very soon. If I’m hosting your project, you have about a week to move your code repository, if you haven’t already.