the fabulous spider fuzz plugin

Courtenay : February 20th, 2007

New plugin!

script/plugin install svn://caboo.se/plugins/court3nay/spider_test

SpiderTest

SpiderTest is an automated integration-testing script that iterates over every page in your application. It performs a few valuable tasks for you:

  • parses the HTML of every page, so you are warned about any invalid HTML.
  • finds every link within your site and follows it, whether static or dynamic.
  • finds every Ajax.Updater link and follows it.
  • finds every form and tries to submit it, filling in values where possible.
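
To give a feel for the link-following step above, here is a minimal sketch of pulling same-site links out of a page body. This is not the plugin's actual code: the method name `internal_links` and the default host are made up for illustration, and it uses a plain regular expression rather than a real HTML parser.

```ruby
require 'uri'

# Hypothetical helper, not part of SpiderTest: returns the unique
# same-site paths linked from an HTML string.
def internal_links(html, host = 'www.example.com')
  # Grab the href value of every <a ...> tag (regex only, no real parser).
  hrefs = html.scan(/<a\s[^>]*href=["']([^"']+)["']/i).flatten

  paths = hrefs.map do |href|
    begin
      uri = URI.parse(href)
    rescue URI::InvalidURIError
      next # skip hrefs that cannot be parsed as URIs
    end
    next if uri.scheme && !%w[http https].include?(uri.scheme) # e.g. mailto:
    next if uri.host && uri.host != host                       # external site
    next if uri.path.nil? || uri.path.empty?                   # pure "#anchor" links
    uri.path
  end

  paths.compact.uniq
end
```

A crawler built on something like this would keep a list of paths it has already visited and request each new path in turn.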

This is helpful in determining:

  • missing static pages (.html)
  • bad routing
  • poor code coverage - forgot to test a file? Don't wait for a user to find it.
  • simple fuzzing of form values.
  • automated testing of form paths. Forms often point to incorrect locations, and until now there has been no way to test this automatically without coupling the tests tightly to your code.

USAGE

$ script/plugin install svn://caboo.se/plugins/court3nay/spider_test
$ script/generate integration_test spider_test

Open test/integration/spider_test.rb and make it look something like this, substituting your own implementation details where appropriate. You'll probably want to load all of your fixtures.

require "#{File.dirname(__FILE__)}/../test_helper"

class SpiderTest < ActionController::IntegrationTest
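  # Load whichever fixtures your pages depend on.
  fixtures :users, :roles, :images, :categories
  # Some setups need `include Caboose::SpiderIntegrator` instead (see the comments below).
  include SpiderIntegrator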

  def test_spider
    get '/'
    assert_response :success

    spider(@response.body, '/')
  end

end

If your app requires a login, you'll need to log in explicitly before spidering. I do it like this:

require "#{File.dirname(__FILE__)}/../test_helper"

class SpiderTest < ActionController::IntegrationTest
  fixtures :users, :roles, :images, :categories
  include SpiderIntegrator

  def test_spider
    # Log in first so the spider can reach pages behind the login.
    get '/sessions/new'
    assert_response :success
    post '/sessions/create', :login => 'admin', :password => 'test'
    assert session[:user]
    assert_response :redirect
    assert_redirected_to '/'
    follow_redirect!

    spider(@response.body, '/')
  end

end

Todo:

  • better, aka more random, fuzzing: currently, I check the field name and change the data types accordingly. It'd be good to have a more advanced algorithm in here.

  • specify which actions to ignore: you can do this by editing the plugin and setting @visitedurls and @visitedforms, but this should be more easily settable.

  • use Hpricot instead of HTML::Document: no clue, really, but I hear Hpricot is faster.

  • better capturing of errors: instead of dying each time there's an error, store them all up so the user can fix 'em all at once.
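
For the fuzzing item, choosing a value from the field name could look roughly like the sketch below. It is only an illustration under my own assumptions, not what the plugin does internally: `FUZZ_RULES`, `fuzz_value`, the name patterns and the sample values are all hypothetical.

```ruby
# Hypothetical sketch, not the plugin's code: choose a fuzz value
# based only on the form field's name.
FUZZ_RULES = [
  [/email/i,                     -> { "fuzz#{rand(1000)}@example.com" }],
  [/password/i,                  -> { "secret#{rand(1000)}" }],
  [/(\A|_)(id|count|age)\z/i,    -> { rand(1000).to_s }],
  [/(\A|_)(date|at|on)\z/i,      -> { "2007-02-20" }]
].freeze

def fuzz_value(field_name)
  # "user[email]" -> "email"; plain names are used as-is.
  name = field_name[/\[(\w+)\]\z/, 1] || field_name
  # First matching rule wins; fall back to a generic random string.
  _pattern, generator = FUZZ_RULES.find { |pattern, _| name =~ pattern }
  generator ? generator.call : "fuzz #{rand(1000)}"
end
```

A more random fuzzer could swap the fixed generators for ones that also emit empty strings, very long strings, or markup, while keeping the same name-based lookup.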

*image from tj:fluffy:online @ flickr

12 Responses to “the fabulous spider fuzz plugin”

  1. ted Says:

    Awesome!

  2. jerrett Says:

    I second ted’s comment, Awesome!

  3. James H Says:

    Hot business! Thank you for this wonderful plugin.

  4. marcus Says:

    Maybe it’s the way I have plugins set up, but I needed to

    include Caboose::SpiderIntegrator

    rather than

    include SpiderIntegrator

    to make it work. Excellent plugin though!

  5. Mark Van Holstyn Says:

    Awesome work! In addition to the fix marcus suggested, I also had to update lib/cabose.rb to make Caboose a Module rather than a Class. I also updated the piece of code which checks for static files. Instead of checking the file extension, I just check whether the file exists… if it doesn’t, then I do a get request. Thanks again!

    Thanks!

  6. court3nay Says:

    please send me patches!! court3nay gmail

  7. evan Says:

    Nice. I wrote a similar thing about 3 weeks ago but didn’t have time to release it. Emphasis for that was load thrashing a staging server, rather than checking XHTML/link validity, but they have a lot in common. Let me know if you want a glance at the code.

  8. gyver Says:

    As my project gives some links to non-HTML content to download, I had to add:

    return unless html =~ /^<\?xml/

    at the beginning of the consume_page method (my layouts always begin with <?xml…) to avoid early crashes when trying to download these links. Maybe there is a way to get at the MIME type and filter on text/html and text/xml?

  9. Cameron Booth Says:

    Courtenay, this is super useful, thanks for putting it together!

    Like marcus said above, I also had to “include Caboose::SpiderIntegrator”, with the Caboose:: portion.

    In addition, I was getting validation errors but it was hard to tell where they were coming from, so I uncommented line 26 in spider_integrator.rb, the “puts” statement. There is probably a better way to do this, raising the URI in the error message perhaps?

  10. Matte Says:

    Thanks for this very, very useful plugin… I ran into the same problem with XML links! So thanks also to gyver for the nice tip!

  11. Daniel Lucraft Says:

    Looks cool, but I had a namespacing problem since I already have a model called Link.

  12. Reza Says:

    Given that this will be used by developers only, wouldn’t it be better to release it as a gem instead?
