21 NOV 2011
Dreamy Testing (Part 1)
I want to take a swing at one last rule before I wrap up this Breaking All of the Rules miniseries, at least for now. I'm not the type of guy to come out full on against many things and I won't do that here. But there is one rule I think is on pretty shaky ground for how often I hear it thrown about. Let's analyze it and break it.
Don't Reinvent the Wheel
It should be pretty thoroughly drilled into most programmers' minds that we don't want to waste our time reinventing wheels. Well, let's try to find the why behind that before we accept it as law.
First, what's the not-so-hidden assumption this time? It's that we are wasting our time. If we aren't, should the rule still hold?
As always, there are good reasons that this rule exists. Here are a couple I feel are worth honoring:
- When you are in the middle of a job and you figure out that you need something, it's usually a much better idea to go with an existing, ready-to-use solution. It would take you time to rebuild it and your version isn't likely to be as robust (just due to it being newer).
- If there's an existing solution that is 90% of what you need, it's probably better to contribute the other 10% than to separately build a new 100% solution. Contributing should be faster for you and help others in return.
Those are great arguments. But look at how many arguments I can throw against this rule just off the top of my head:
- It's rare, but an existing solution may not exist
- Even if it does, it could suck (Perhaps just for your needs.)
- Or be abandoned (This generally means it's falling into a less robust state.)
- Contributions may not be welcome (Maybe it's not open source.)
- You may think you can do it better (Smaller, more efficient, etc.)
That last one is particularly interesting, because, in that case, I think a fork is generally preferable. I said you think you can do it better, but if you are anything like me, you will be wrong more often than you are right. It's usually when I try to replace something that I gain a real appreciation for all the things it was handling that I just didn't understand yet. This may eventually lead me to throw my attempt away.
Even if I do succeed, my solution is likely to be quite different. I mean, that was the point of the exercise. If so, both projects can coexist going forward until users figure out what they prefer. Or perhaps both approaches will retain their charms.
This is dancing around the real reason we rewrite code:
- To learn something
I mean, "Duh!" Right? Look at how we teach programming: writing temperature converters all the way up to doing code katas or quizzes. We're not writing this stuff because the world is low on temperature conversion tools. It's what we are pushing into our brain that counts, not our text editor.
Obviously, reuse can be great, but I encourage you to remain highly suspicious of this rule in many scenarios. What you learn is king. Don't sacrifice that!
Here Be Dragons!
Let's prove it. I want to break this rule as spectacularly as I can manage. I'm going to purposefully choose a seemingly bad wheel to reinvent. It's a new experiment that I've never tried. This has fail written all over it. The goal is to learn something and hopefully share some ideas with you as I flail about.
We Don't Need Another One
I figured the two most over-invented projects in the Ruby community had to be:
- Command-line option parsers
- Testing frameworks
It just so happens that I've always wanted to write a testing framework! Why? It's that I think I can do it so much better than anyone else. (Relax folks, that was a joke! I have no such delusions.) In truth, I just want to experiment with some ideas to see how they work out.
There are two things that bug me about typical testing frameworks. First, think about how many references to testing there are in a minimal scenario. Let's say I have this file in my project as `test/number_test.rb`:

```ruby
require "minitest/autorun"

class NumberTest < MiniTest::Unit::TestCase
  def test_42
    assert_equal(42, 40 + 2)
  end
end
```

How many testing concepts are here? Well, we usually stick these files in a `test/` or `spec/` directory. One. Even though they are there, we generally tack `_test` or `_spec` onto the ends of their names. That's two before we wrote any code! Then we require a testing framework—three—inherit from a test case—four—write methods with the magic `test_` prefix (feel free to `describe()` how specs are superior in this `context()` and I'll buy `it()`)—five. Wow, that's a lot of code that's not about what I am doing here! This is starting to feel like Java's `main()` definition when I just want to print `"Hello World!"`.
In fact, a desire for simpler testing affects me on many levels. The second thing I would like to focus on is how assertions work. I used test/unit for a long time, so I have `assert()`, `assert_equal()`, `assert_match()`, `assert_nil()`, `assert_raise()` (Oops, I had to glance at the docs just now to see if that needed an `s` on it.), and the surprisingly useful `assert_in_delta()` burned into my brain. I know a few others, all the way down to the horrible interface that is `assert_operator()`. (Don't go there! Just use `assert()`.) I know I can stick `not_` in the middle of most of those to negate the check, though it's `no_` for `assert_match()` and `assert_raise()` goes all the way to `assert_nothing_raised()` (Crap, another peek at the documentation!). MiniTest improves this a bit, by replacing `assert_` with `refute_`. But, without looking, do I give the expected value and then the actual or the actual and then the expected?

Let's face it, that's an interface train wreck.
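If you want to pin down that ordering question once and for all, minitest itself makes a handy scratchpad. A minimal sketch (using modern minitest, where the base class is spelled `Minitest::Test` rather than the older `MiniTest::Unit::TestCase`):

```ruby
require "minitest/autorun"

# For the record: assert_equal takes the EXPECTED value first, then
# the actual. The refute_* family is minitest's replacement for
# test/unit's assert_not_* negations.
class OrderTest < Minitest::Test
  def test_order
    assert_equal(42, 40 + 2)  # expected, then actual
    refute_equal(41, 40 + 2)
  end
end
```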
Is RSpec better here? Probably a little, but I don't like the look of:

```ruby
some_method.should == 42
```

Ruby doesn't like it either, if you have warnings turned on, and you really should. (RSpec 2 adds `eq()` for this reason.)
Also, I hate having to explain to my students that this method they just wrote:

```ruby
class MyObject
  def handled?
    # return true or false here
  end
end
```

is tested with:

```ruby
my_object.should be_handled
```
I actually had a student ask me, "Why can't I just call the method I wrote?" Excellent question. I would like to know that too.
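For the curious, the `be_*` trick isn't deep magic. Here is a toy sketch of the idea (this is not RSpec's actual implementation, which generates matchers via `method_missing`; `BeMatcher` and friends are invented names for illustration):

```ruby
# Toy matcher: be_handled builds an object that, when asked to match,
# just calls the handled? predicate the student actually wrote.
class BeMatcher
  def initialize(predicate)
    @predicate = predicate  # e.g. :handled?
  end

  def matches?(object)
    object.public_send(@predicate)
  end
end

def be_handled
  BeMatcher.new(:handled?)
end

class MyObject
  def handled?
    true
  end
end

puts be_handled.matches?(MyObject.new)  # prints "true"
```

So the student's question stands: the matcher is a thin indirection over the method they could have called directly.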
So, those are the issues that rob me of sleep at night. Let's see how hard it would be to fix some of them. We may even learn that there are good reasons to keep some of them.
Starting a Ruby Library
My process for beginning a new Ruby library is always the same. Throw down some needed directories, add a `README` file explaining what I'm doing (I think Readme Driven Development is a great idea), and start developing. The first two steps go like this:

```shell
$ mkdir -p ok/{bin,lib,test}
$ cd ok
$ ruby -e 'puts "# OK\n\nJEG2'\''s ideal testing framework."' > README.md
$ ls
README.md bin lib test
```
Yes, that's a pitiful attempt at a `README`, but that's not our focus here. The reasoning written above might have made a great start to the `README`, realistically.

Note that I wrote the `README` in Markdown. I use Markdown for everything, including these articles you are reading. It's human readable, it's trivial to convert into HTML for simple formatting, and GitHub understands it. I encourage everyone to learn Markdown. It's so worth it.
My Friend, Spike
I realize I made a `test/` directory in the setup. Ordinarily I would start test driving some code at this point and I promise to do plenty of articles on that in the future.
However, this code is a quick spike. Episode 38 of Destroy All Software had a great definition for that: a small, disposable experiment. This is used to determine what some code should do and how it should do it.
I may end up tossing this code at the end and chalking it up to the lessons learned in the attempt. If I don't, then it will be time to go back and ensure that I flesh things out.
Given that, let's pass on the tests this time and just try to get something working.
When I Say Minimal, I Mean It
You may have noticed that I named the project OK, for now. That's just a working title because I don't want to get too hung up on finding the perfect name. But I did have something in mind when I chose it.
Perl has a Test::Simple library that I sometimes miss from my time in that world. After you load it, you can start calling an `ok()` function and passing it tests. I like the feel of that.

In the Ruby world, the testing framework that has changed the way I think the most is Riot. It's closer to a more full-featured testing framework, but assertions are just blocks that return a boolean value, a lot like Perl's `ok()`.

You know how we're always saying that a test should probably just have one assertion in it? Well, this approach forces that line of thought. It also means your tests are just Ruby, plus a call to `ok()`.
That's what I had in mind when I named the project and that's where I want to begin.
I finally started typing a little code and ended up with this in `lib/ok.rb`:

```ruby
def ok(&test)
  if test.call
    puts "Success."
  else
    puts "Failed."
  end
end
```
That's the minimal test I had in mind. Obviously, it's missing almost every nicety we can think of, but it's a starting point and we can iterate from here.
I wanted to check my work, without writing tests, so I added an `example/` directory and stuck this code in a `basic_test.rb` inside of there:

```ruby
ok { true }
ok { false }
```
Then I checked my work:

```shell
$ ruby -I lib -r ok example/basic_test.rb
Success.
Failed.
```
We're off and running.
If you aren't familiar with the command-line options I used above, `-I` includes directories in Ruby's `$LOAD_PATH` (or `$:`) and `-r` requires a file from that path. I'm using these to inject my library code because I don't want to worry about how I should be running tests yet.
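For reference, those two flags are roughly equivalent to doing the load-path work in Ruby yourself (a sketch; the `"lib"` path here is this project's):

```ruby
# Roughly what `ruby -I lib -r ok` expands to, in plain Ruby.
# -I lib prepends the directory to the load path:
$LOAD_PATH.unshift(File.expand_path("lib"))
# -r ok then requires "ok" from that path before the script runs:
# require "ok"   # (left commented so this sketch runs standalone)
```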
Now, don't underestimate what we have already achieved here. Check out this test that I added as `example/compound_test.rb`:

```ruby
ok { 42 == 40 + 2 && 42.is_a?(Integer) }
```

I didn't need `assert_instance_of()` or any other tricks for that. I'm just writing Ruby here. The same knowledge is useful to my code and tests and I love that.
Also, I've already copied RSpec's `before(:all)` feature (`setup()` in Riot). Did you see it sneak in there? No? Let me show you. Here's my `example/setup_test.rb`:

```ruby
setup = 42 # or whatever

ok { setup == 40 + 2 }
ok { setup.is_a?(Integer) }
```

I'm not joking here. It's almost silly that RSpec has `before(:all)`. Ruby's blocks are closures, so we can do some work earlier in the file and use it later. (Yes, I know RSpec's `before(:all)` is scoped to a set of examples, but mine can almost be the same if you consider a file a set of examples. The goal is to bend our thinking in some new directions here.)
Of course, I do miss `before(:each)` and `let()`. But first, we should probably tackle better output for these tests.
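As a teaser before moving on, even `let()`-style lazy memoization needs nothing Ruby doesn't already have. A sketch built on the same closure trick (names invented here; this is not RSpec):

```ruby
# A plain-Ruby stand-in for let(): a memoized lambda. The work runs
# at most once, on first call, and the closure caches the value.
# (Like RSpec's let(), ||= misbehaves if the value is nil or false.)
calls  = 0
value  = nil
answer = -> { value ||= begin calls += 1; 40 + 2 end }

answer.call  # => 42
answer.call  # => 42 again, but...
calls        # => 1 (the expensive work only happened once)
```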
Building a Printer
The message `"Failed."` is not too helpful. What failed? Where do I go to fix it? This is information we need to have.

It would also be nice to see counts of what passed and failed.

It's sounding like this printer needs to track some state to do its job right, so I think I'm ready to build a real object. I opened up `lib/ok/printer.rb` and entered some code. I also made two subclasses to cover the familiar cases: `lib/ok/printer/dot_printer.rb` and `lib/ok/printer/spec_printer.rb`. I bounced back and forth between those files, fleshing things out, until I had something I liked. I'll walk you through the code now. It's a lot, so we'll take it in pieces, but nothing in here is very complex. Let's start with the base `Printer`:
```ruby
module OK
  class Printer
    def initialize(io = $stdout)
      @io      = io
      @color   = !ENV["RUBY_OK_NO_COLOR"] && @io.tty?
      @passed  = 0
      @failed  = 0
      @errors  = 0
      @started = Time.now
      @term    = :test
    end
  end
end
```
The constructor should be pretty straightforward. I take in an optional `IO` object to print to, defaulting to the standard output. This will make testing easier when I get there, since I can just pass a more convenient `IO` substitute. Also note that my use of `$stdout` is preferable to `STDOUT`. The `@color` variable just tracks whether or not it's OK to colorize our output. We default to doing that if we are run interactively (from a TTY) and the choice wasn't overridden in the environment. The rest of the constructor just initializes some counts, a start time, and a configurable name.
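To see why the injectable `IO` pays off, here's a sketch using a throwaway stand-in class (not the real `Printer`, just the same constructor shape) and a `StringIO`:

```ruby
require "stringio"

# A toy with the same io-injection shape as the Printer above. A test
# can hand it a StringIO and assert on exactly what was written.
class TinyPrinter
  def initialize(io = $stdout)
    @io = io
  end

  def puts(*args) @io.puts(*args) end
end

captured = StringIO.new
TinyPrinter.new(captured).puts("Success.")
captured.string  # => "Success.\n"
```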
```ruby
module OK
  class Printer
    RED_COLOR    = "\e[31m"
    GREEN_COLOR  = "\e[32m"
    YELLOW_COLOR = "\e[33m"
    CLEAR_COLOR  = "\e[0m"

    attr_writer :color

    def color?
      @color
    end

    def colorize(name, content)
      return content unless color?
      [ self.class.const_get("#{name.to_s.upcase}_COLOR"),
        content,
        CLEAR_COLOR ].join
    end
  end
end
```
These methods allow you to enable or disable color programmatically and use the color constants that I stole from HighLine to conveniently wrap some content in ANSI color escapes, when color is active.
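If those escape strings look opaque, this is all `colorize()` builds: the content sandwiched between an ANSI start code and the reset code.

```ruby
# The same sandwich colorize() produces, by hand (constants copied
# from the Printer above):
GREEN_COLOR = "\e[32m"
CLEAR_COLOR = "\e[0m"

colored = [GREEN_COLOR, "Success", CLEAR_COLOR].join
colored  # => "\e[32mSuccess\e[0m" -- green text, then reset
```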
```ruby
module OK
  class Printer
    attr_reader :passed, :failed, :errors

    def tests
      @passed + @failed + @errors
    end

    def count_passed; @passed += 1 end
    def count_failed; @failed += 1 end
    def count_error;  @errors += 1 end

    def all_passed?
      @failed + @errors == 0
    end

    def record(test)
      if test.passed?
        count_passed
        [:green, test.description]
      else
        if test.failed?
          count_failed
        else
          count_error
        end
        [:red, test.description]
      end
    end
  end
end
```
The reader methods and `tests()` give access to some counters and the `count_*()` methods give us a way to bump them. I also added an `all_passed?()` utility for seeing if the entire run is green. These methods probably indicate a need for a `Statistics` or at least a `TestSuite` object, but doing it in the `Printer` is good enough for this experiment.

The big story in this section though is the `record()` method. This method adds the passed `test` to our tracking. It bumps the correct count based on the results of the test and then returns details for printing (a color and description).
```ruby
module OK
  class Printer
    def pluralize(number, singular)
      "#{number} #{singular}#{'s' unless number == 1}"
    end

    attr_accessor :term

    def plural_term
      pluralize(2, @term)[2..-1]
    end

    def elapsed
      Time.now - @started
    end
  end
end
```
Before we get to the good stuff, here are a few more utility methods. My version of `pluralize()` isn't as fancy as the one in ActiveSupport, but it is a clever use of Ruby features that I have explained before. I just added `term()` and `plural_term()` so I could please the "tests" and "specs" crowd. Finally, `elapsed()` tells us how long this `Printer` has been running.
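In case `plural_term()` reads as line noise, here's the trick in isolation: format a throwaway count of two, then slice off the leading `"2 "` to get the bare plural word.

```ruby
def pluralize(number, singular)
  "#{number} #{singular}#{'s' unless number == 1}"
end

pluralize(1, :test)         # => "1 test"
pluralize(3, :test)         # => "3 tests"
# plural_term's [2..-1] slice drops the "2 " prefix:
pluralize(2, :spec)[2..-1]  # => "specs"
```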
```ruby
module OK
  class Printer
    def print(*args) @io.print(*args) end
    def puts(*args)  @io.puts(*args)  end

    def print_running
      # do nothing: optional
    end

    def print_test(test)
      fail NotImplementedError,
           "Printer subclasses must provide print_test(test)"
    end

    def print_summary
      puts "\n" +  # blank line
           "Finished #{plural_term} in %.6fs\n" % elapsed +
           colorize( all_passed? ? :green : :red,
                     { term     => tests,
                       :failure => failed,
                       :error   => errors }.map { |name, count|
                       pluralize(count, name)
                     }.join(", ") )
    end
  end
end
```
These methods are the heart of the `Printer`. My wrappers over `print()` and `puts()` just redirect output operations from this object to the stored `@io` instance. I'll arrange for `print_running()` to be called as testing starts, but I don't think it needs to be required. That's why I defined an empty version of it. By the way, I do feel explaining why you chose to do nothing is a legitimate use for comments, which I am pretty sparing with. Defining `print_test()` to `fail()` with an `Exception` isn't a common Ruby practice, but it helped me envision what I wanted my subclasses to look like, before I was ready to write them. I like programming from high levels and working my way down like this. That method will be called as each test is run, to output the results. The third printing method, `print_summary()`, ends the output with a common summary format.
Having it Both Ways
We are ready for those subclasses now and they are pretty simple. Let's build two common strategies. Here's the code I stuck in `lib/ok/printer/dot_printer.rb`:
```ruby
module OK
  class Printer
    class DotPrinter < Printer
      def initialize(*args)
        super
        @issues = [ ]
      end

      attr_reader :issues

      def print_running
        puts "Running #{plural_term}:"
      end

      def print_test(test)
        issues << test unless test.passed?
        color, result = record(test)
        print(colorize(color, result[0].tr("S", ".")))
      end

      def print_summary
        puts "\n\n" +  # end dot line
             issues.map.with_index { |issue, i|
               "#{i}) #{issue.description}: #{issue.name}\n" +
               issue.backtrace(2)
             }.join
        super
      end
    end
  end
end
```
That should be pretty close to the traditional test/unit or MiniTest style output. The `print_test()` method prints the `.` (Success), `F`(ailure), and `E`(rror) notifiers as things run. It also remembers any tests that don't pass. Those can then be summarized in `print_summary()`, before we hand off to the overridden version of the method to add the traditional summary.
Let's add a `lib/ok/printer/spec_printer.rb` for those who prefer an RSpec-like output:
```ruby
module OK
  class Printer
    class SpecPrinter < Printer
      def initialize(*args)
        super
        self.term = :spec
        @file     = nil
      end

      def print_file(file)
        if @file.nil? or @file != file
          @file = file
          puts @file
        end
      end

      def print_test(test)
        print_file(test.file)
        color, modifier = record(test)
        if modifier == "Success"
          modifier = ""
        else
          modifier << ": "
        end
        puts "  #{colorize(color, "#{modifier}#{test.name}")}"
        puts test.backtrace(4) unless test.passed?
      end
    end
  end
end
```
There's no real magic here. I swap out the `term()`, invoke `print_file()` each time our context changes, and build up an indented spec format using test names in the rest of `print_test()`. My tests were just blocks of code before and didn't have names, but these printers have made it clear that I need to make that change, so I can refer to them for the user.
If Wishes Were Classes
Of course, I was just pretending I had some mythical `Test` object while I wrote those printers. I recently heard that called "Programming by Wishful Thinking." I love that description, because it explains exactly why you should do it. When you are in the middle of naturally writing code, don't stop to resolve some dependency. Just keep programming as if you had it and it does exactly what you need. You can make that true after the fact. For me, that's now, so I added this code in `lib/ok/test.rb`:
```ruby
module OK
  class Test
    def initialize(name, test)
      @name   = name
      @test   = test
      @result = !!test.call rescue $!
    end

    attr_reader :name

    def passed?
      !(failed? || error?)
    end

    def failed?
      @result == false
    end

    def error?
      @result.is_a?(Exception)
    end

    def description
      if    passed? then "Success"
      elsif failed? then "Failed"
      else               "Error"
      end
    end

    def file
      @test.source_location.first
    end

    def backtrace(indent = 0)
      backtrace = if    failed? then @test.binding.eval("caller(0)").reverse
                  elsif error?  then @result.backtrace
                  else               [ ]
                  end
      backtrace.reject { |line| line =~ %r{\blib/ok\b} }  # scrub clean
               .map    { |line| "#{' ' * indent}#{line}\n" }
               .join
    end
  end
end
```
Perfect. My wishing came true!
Probably the strangest line of code in this whole class is when I set `@result` in the constructor. I realize the `!!` construct isn't too sexy, but that's how we can ensure a literal `true` or `false` result in Ruby. But there's a third option: `rescue $!` will catch and store the `Exception` object in the case of an error being triggered by the test code. (I learned this and other awesome error handling tricks from Exceptional Ruby.)

The majority of the rest of this code is pretty mundane. It's mostly methods for querying status. Similarly, `description()` returns the status in word form. The final two methods track source file context, with `file()` being a thin wrapper over Ruby's new `source_location()` and `backtrace()` returning an indented call stack for the test, scrubbed of references to our library.
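Those two idioms are worth trying in isolation (a sketch; `run()` is an invented helper, not part of the library):

```ruby
# !! coerces any truthy/falsy result to a literal true/false, and the
# modifier rescue swaps in the raised Exception object ($!) on error.
def run(&test)
  !!test.call rescue $!
end

run { 42 }           # => true
run { nil }          # => false
run { fail "Oops" }  # => the RuntimeError instance itself
```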
Viewing Our Work
Alright, let's crudely touch up `lib/ok.rb` to handle these additions:

```ruby
require_relative "ok/printer"
require_relative "ok/printer/dot_printer"
require_relative "ok/printer/spec_printer"
require_relative "ok/test"

$printer = OK::Printer::DotPrinter.new
$printer.print_running
at_exit { $printer.print_summary }

def ok(name, &test)
  $printer.print_test(OK::Test.new(name, test))
end
```
You can see the requiring of the new pieces here. Note the use of `require_relative()` to do this, since Ruby 1.9 no longer has the current directory in the load path.

The middle chunk of code just throws a `Printer` in a global variable and sets up the print cycle. We can call `print_running()` to start things off and arrange for the `print_summary()` method to be called `at_exit()`.
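If `at_exit()` is new to you, the mechanism is just a registered block that Ruby runs when the process finishes, which is what lets the summary print without any explicit runner call:

```ruby
# Blocks given to at_exit run after the main script ends (in reverse
# registration order), so this script prints "running tests..."
# first and "summary" last.
at_exit { puts "summary" }
puts "running tests..."
```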
I also updated `ok()` to use the `Printer` and `Test` objects.
To see the results of this work, I added names to my examples. Here's the updated version of my basic tests:
```ruby
ok("Is true")  { true }
ok("Is false") { false }
ok("Is error") { fail "Oops" }
```
Then it was time to run them again:

```shell
$ ruby -I lib -r ok example/basic_test.rb
Running tests:
.FE

0) Failed: Is false
  example/basic_test.rb:2:in `<main>'
1) Error: Is error
  example/basic_test.rb:3:in `block in <main>'
  example/basic_test.rb:3:in `<main>'

Finished tests in 0.000300s
3 tests, 1 failure, 1 error
```
I can also swap the code to use the `SpecPrinter` and see how that looks:

```shell
$ ruby -I lib -r ok example/basic_test.rb
example/basic_test.rb
  Is true
  Failed: Is false
    example/basic_test.rb:2:in `<main>'
    example/basic_test.rb:2:in `<main>'
  Error: Is error
    example/basic_test.rb:3:in `block in <main>'
    example/basic_test.rb:3:in `<main>'

Finished specs in 0.000298s
3 specs, 1 failure, 1 error
```
We're Not Done
Obviously, this isn't done yet. My solution still isn't ideal for multiple reasons:

- I haven't really addressed how to run tests
- The problem of test references remains to some extent
- There is the need for a global variable
- Using `at_exit()` is clunky
- I still need to manually switch printers
- It would be nice to get some helpers like `before(:each)` and `let()`

It may surprise you, but I believe all of those things are the same issue in disguise. I'm going to try tackling all of them in the next article, so stay tuned…