Gray Soft / Rubies in the Rough / Test Driving an Algorithm (Part 2)

11

FEB
2012

Test Driving an Algorithm (Part 2)

This post is part of a series.

In the last article, I built a quick and dirty solution to PuzzleNode's Hitting Rock Bottom puzzle. I didn't use specs, objects, or even multiple files. In this article, I'll repeat the exercise but using all of those elements. Let's see how that affects the results.

The Disciplined Approach

This time, I'll reign in my desire to push forward and solve the problem more carefully, without having to build a fully formed algorithm all at once.

I still think the input is the right place to start. I need to read in the units. Let's add a spec for that in spec/parser_spec.rb:

require "minitest/autorun"
require "stringio"

require "hitting_rock_bottom/parser"

describe HittingRockBottom::Parser do
  let(:units)  { 42 }
  let(:io)     { StringIO.new(units.to_s) }
  let(:parser) { HittingRockBottom::Parser.new(io) }

  it "reads the units number from the beginning of the stream" do
    parser.units.must_equal(units)
  end
end

If you haven't seen the standard stringio library before, it simply wraps a String with an IO interface. That's will allow me to call gets() and other IO methods on it in my implementation. It's perfect for tests like this.

Here's the lib/hitting_rock_bottom/parser.rb code that gets that spec passing:

module HittingRockBottom
  class Parser
    def initialize(io)
      @units = io.gets.to_i
    end

    attr_reader :units
  end
end

Now we need to read in the cave. My first thought was to use similar tactics to what I did before, but then I'll need to verify that it's right. I don't want to do that with a must_equal() on some big Array. I want to be able to ask questions about individual cells. That made it sound like I needed a Cell object, so I created a spec/cell_spec.rb file and added this code:

require "minitest/autorun"

require "hitting_rock_bottom/cell"

describe HittingRockBottom::Cell do
  MAPPING = {rock: "#", open: " ", water: "~"}

  def cell(*args)
    HittingRockBottom::Cell.new(*args)
  end

  it "knows its type by symbol" do
    MAPPING.each do |name, symbol|
      cell = cell(symbol)
      cell.send("#{name}?").must_equal(true)
      MAPPING.each do |other_name, other_symbol|
        next if symbol == other_symbol
        cell.send("#{other_name}?").must_equal(false)
      end
    end
  end
end

You can see from this that I don't always follow the "one expectation per spec" guideline. I'm more of a "whatever I need to get the spec to make sense" guy. I know a lot of people who would break the above into three or even six different specs, but that seems to be more tedious than value adding for a case this simple. I think it's OK to be more pragmatic than that.

I also used def here, which you don't see a lot of with specs in the wild. I favor def over let() when I would like to pass some arguments. Just remember that it's not memoized for that spec's run, like let() is.

Tip 1: you can always fallback to plain Ruby constructs, like methods, even if you are writing specs, Rails code, or whatever.

Here's the code for lib/hitting_rock_bottom/cell.rb that gets the spec passing:

module HittingRockBottom
  class Cell
    def initialize(symbol)
      @symbol = symbol
    end

    def rock?
      @symbol == "#"
    end

    def open?
      @symbol == " "
    end

    def water?
      @symbol == "~"
    end
  end
end

I'm sure that's super straight forward.

While we're handling this Cell object, we already know that it will need one more thing. Let's provide a way to fill it with water. First, I added these two specs to spec/cell_spec.rb:

it "allows the filling of open cells" do
  cell = cell(" ")
  cell.fill
  cell.water?.must_equal(true)
  cell.open?.must_equal(false)
end

it "fails with an error if you try to fill a non-open cell" do
  (MAPPING.values - [" "]).each do |symbol|
    lambda { cell(symbol).fill }.must_raise(RuntimeError)
  end
end

Yes, I really did write both of those before I added the fill() method. Again, it's fine to be pragmatic.

Tip 2: nobody is keeping score for how much you make yourself suffer while writing specs/tests.

Then I added the method that would get them passing to lib/hitting_rock_bottom/cell.rb:

def fill
  fail "Only open cells can be filled with water" unless open?
  @symbol = "~"
end

Jim Weirich has a great rule for when to use fail() and when to use raise() that I like to follow. If you are reraising an Exception from a rescue clause, go with raise(). Otherwise, use fail() because it more clearly states what is happening in the code.

Our Cell it ready to go, but we're going to need a container for them. It's time to start a spec/cave_spec.rb. I don't want to address the entire Cave object at this time. I just need something that can hold Cell objects and allow me to access them by coordinates. Here's what I have in mind:

require "minitest/autorun"

require "hitting_rock_bottom/cave"

describe HittingRockBottom::Cave do
  def cave(*args)
    HittingRockBottom::Cave.new(*args)
  end

  it "allows access to contents by coordinates" do
    cave = cave( [ ["#", "#", "#"],
                   ["~", " ", "#"],
                   ["#", "#", "#"] ] )
    cave[0, 0].must_equal("#")
    cave[0, 1].must_equal("~")
    cave[1, 1].must_equal(" ")
  end
end

Notice that I haven't even bothered with bringing Cell objects into this. They aren't the point, so it's just a distraction.

It's also worth noting that I switched to an X, Y coordinate format. That's easier on my brain than the row major ordering I was using in the previous solution.

Here's the initial code for lib/hitting_rock_bottom/cave.rb that get's these specs passing:

module HittingRockBottom
  class Cave
    def initialize(cells)
      @cells = cells
    end

    def [](x, y)
      @cells[y] && @cells[y][x]
    end
  end
end

I am now ready to return to spec/parser_spec.rb and prove that it can read in a proper Cave with Cell contents. First, I updated the let(:io) call to include a small cave:

  let(:io)     { StringIO.new(<<-END_INPUT.gsub(/^[ \t]+/, "")) }
  #{units}

  ######
  ~    #
  # ## #
  ######
  END_INPUT

The "heredoc" allows me to layout the cave as it would be in the file while the call to gsub() strips off the indentation. I'm trying to keep the code reading nicely.

Here are the expectations for the cave:

it "reads the cave with cell contents" do
  parser.cave.must_be_instance_of(HittingRockBottom::Cave)
  parser.cave[0, 0].must_be_instance_of(HittingRockBottom::Cell)
end

it "arranges the cells to match the coordinates from the stream" do
  parser.cave[0, 1].water?.must_equal(true)
  parser.cave[1, 2].open?.must_equal(true)
  parser.cave[2, 2].rock?.must_equal(true)
end

This is how I updated the Parser to get them passing:

require "hitting_rock_bottom/cell"
require "hitting_rock_bottom/cave"

module HittingRockBottom
  class Parser
    def initialize(io)
      @units = io.gets.to_i
      cave   = [ ]
      io.each do |line|
        next unless line =~ /\S/  # skip blank lines
        cave << [ ]
        line.scan(/./) { |cell| cave.last << Cell.new(cell) }
      end
      @cave  = Cave.new(cave)
    end

    attr_reader :units, :cave
  end
end

Note that I added the proper require statements, since Parser now depends on other objects. The rest of the changes are mainly just adding some parsing code similar to what we used before.

I think parsing is now covered. I want to prove that the specs all run together though, so I added a spec/all.rb file, with these contents:

require "cell_spec"
require "cave_spec"
require "parser_spec"

Then I run all of the specs together:

$ ruby -I lib:spec spec/all.rb
Run options: --seed 50650

# Running tests:

.......

Finished tests in 0.002287s, 3060.7783 tests/s, 9619.5890 assertions/s.

7 tests, 22 assertions, 0 failures, 0 errors, 0 skips

An Algorithm, Piece by Piece

Now we get see if I can really build the algorithm in sections this way.

First, we need to know where the water starts. It looks like it's always at 0, 1 in the examples given, but I see no reason to assume that. Let's add a spec/stream_spec.rb file and start defining the rules:

require "minitest/autorun"
require "stringio"

require "hitting_rock_bottom/parser"
require "hitting_rock_bottom/stream"

describe HittingRockBottom::Stream do
  def stream(cave, units = 42)
    io     = StringIO.new("#{units}\n\n#{cave}".gsub(/^[ \t]+/, ""))
    parser = HittingRockBottom::Parser.new(io)
    HittingRockBottom::Stream.new(parser.cave)
  end

  it "knows where the head of the stream is" do
    stream(<<-END_CAVE).head.must_equal([0, 1])
    ###
    ~ #
    ###
    END_CAVE

    stream(<<-END_CAVE).head.must_equal([1, 1])
    ###
    ~~#
    ###
    END_CAVE
  end
end

I reused the Parser here to make it easy to define Cave objects. This is a dependency of this spec file, not the Stream itself, so the require belongs here.

Here's the lib/hitting_rock_bottom/stream.rb code that lead me to:

module HittingRockBottom
  class Stream
    def initialize(cave)
      @cave = cave
      @head = [0, 0]
      find_head
    end

    attr_reader :head

    private

    def find_head
      find_tail
      follow_water
    end

    def find_tail
      until @cave[*@head].water?
        @head[1] += 1
      end
    end

    def follow_water
      loop do
        if @cave[@head[0], @head[1] + 1].water?
          @head[1] += 1
        elsif @cave[@head[0] + 1, @head[1]].water?
          @head[0] += 1
        else
          break
        end
      end
    end
  end
end

The code expects a Cave object with Cell contents, but I don't view that as a dependency in the dynamic world of Ruby. It may be some other object with the same API. After all, that's the point of using objects. Whatever they are, they are constructed before we ever get here and just handed to us, so the require isn't needed here.

Tip 3: a require is needed if you call a class (or module) method on some constant in the file.

The code itself isn't too complex. It walks down the left edge to find the start of the water. Then it moves down and over until it reaches the end.

I verified that the specs are green at this point and I'm ready for a refactoring step. The thing that bugs me about the code above is all of the direct access to the @head indices keeps it from reading well. Let's fix that:

module HittingRockBottom
  class Stream
    def initialize(cave)
      @cave = cave
      @head = [0, 0]
      find_head
    end

    attr_reader :head

    private

    def find_head
      find_tail
      follow_water
    end

    def find_tail
      until current.water?
        flow_down
      end
    end

    def follow_water
      loop do
        if down.water?
          flow_down
        elsif right.water?
          flow_right
        else
          break
        end
      end
    end

    def current
      @cave[*@head]
    end

    def down
      @cave[@head[0], @head[1] + 1]
    end

    def right
      @cave[@head[0] + 1, @head[1]]
    end

    def flow_down
      @head[1] += 1
    end

    def flow_right
      @head[0] += 1
    end
  end
end

It looks like I added a lot of code there, but really I just gave some names to common actions. I like how simple this makes the methods and how much better this reads.

Another reason I wanted to start by walking the water is that the water already in the starting input file is counted against the units the algorithm is expected to add. Given that, we need to count water as we move over it. Let's add a spec for that:

it "counts units as it moves over water" do
  stream(<<-END_CAVE).units.must_equal(1)
  ###
  ~ #
  ###
  END_CAVE

  stream(<<-END_CAVE).units.must_equal(2)
  ###
  ~~#
  ###
  END_CAVE
end

This just requires a few small changes:

module HittingRockBottom
  class Stream
    def initialize(cave)
      @cave  = cave
      @head  = [0, 0]
      @units = 0
      find_head
    end

    attr_reader :head, :units

    private

    # ...

    def flow_down
      @head[1] += 1
      count_units
    end

    def flow_right
      @head[0] += 1
      count_units
    end

    def count_units
      @units += 1 if current.water?
    end
  end
end

In summary that is:

Track the units in a variable
Provide access to it
Count units after each move

It's time to add some water. I want to code the cases right out of the problem description as much as possible. So I start here:

it "fills in water going down when possible" do
  stream(<<-END_CAVE).tap(&:fill).cave[1, 2].water?.must_equal(true)
  ###
  ~~#
  # #
  ###
  END_CAVE
end

Again, the changes are small:

module HittingRockBottom
  class Stream
    # ...

    attr_reader :cave, :head, :units

    def fill
      if down.open?
        down.fill
        flow_down
      end
    end

    # ...
  end
end

That's very similar to the earlier water movement, so I doubt it needs much explanation. Just remember that the inner fill() call is the method we added to Cell a while back.

The spec for moving right is almost the same:

it "fills in water going right when it must" do
  stream(<<-END_CAVE).tap(&:fill).cave[1, 1].water?.must_equal(true)
  ###
  ~ #
  # #
  ###
  END_CAVE
end

Here's the corresponding update to fill():

def fill
  if down.open?
    down.fill
    flow_down
  elsif right.open?
    right.fill
    flow_right
  end
end

Now for the tricky part. We need to handle the backtracking case. The spec is pretty easy to write:

it "backtracks when filling as needed" do
  stream(<<-END_CAVE).tap(&:fill).cave[2, 1].water?.must_equal(true)
  ####
  ~~ #
  #~~#
  ####
  END_CAVE
end

The real trick is how do we support that?

Whenever we are flowing down, we may be leaving behind a place that we could have gone right. We need to remember those. Then, if we run out of options, we can just resort to one of those options. At that point, we will want the last option we passed, so it's sounding like a stack to me.

This required several changes:

module HittingRockBottom
  class Stream
    def initialize(cave)
      @cave         = cave
      @head         = [0, 0]
      @units        = 0
      @backtracking = [ ]
      find_head
    end

    # ...

    def fill
      if down.open?
        down.fill
        flow_down
      elsif right.open?
        right.fill
        flow_right
      elsif can_backtrack?(:open)
        backtrack
        fill
      end
    end

    private

    # ...

    def follow_water
      loop do
        if down.water?
          flow_down
        elsif right.water?
          flow_right
        elsif can_backtrack?(:water)
          backtrack
        else
          break
        end
      end
    end

    # ...

    def flow_down
      if current.water? && !right.rock?
        @backtracking << @head.dup
      end
      @head[1] += 1
      count_units
    end

    # ...

    def can_backtrack?(to)
      !@backtracking.empty? &&
      @cave[@backtracking.last[0] + 1, @backtracking.last[1]].send("#{to}?")
    end

    def backtrack
      @head = @backtracking.pop
    end
  end
end

Start in initialize(). The change here was just the variable to hold the stack. Then move to flown_down() where I added code to push the stack as needed. I am pushing both air and water backtracking onto the stack, since the code does both. Water backtracking will always happen first, so it won't interfere with the normal backtracking process.

I could probably simplify this code a bit by ditching the water backtracking, which doesn't really seem to be needed by the problem. However, this is allowing me to write much nicer specs, since I can jump into the process at any point. I like that benefit, which is why I've kept it.

Tip 4: favor code that makes testing easier.

The next two methods to look at are follow_water() and fill(). They make use of the backtracking system via two helper methods. Glancing at those helpers, can_backtrack?() and backtrack(), should complete the picture.

This step probably felt a little like the big algorithm jump I made in the first solution to you readers, but it did not feel that way to me. Despite the large clipping of code above (so I can show you full methods) I only added 16 lines in this step. Most of them were quite simple. I did make two minor mistakes along the way that led to failing specs, but they were easy enough to resolve.

I also want to stress that I'm keeping a lot less of the problem in my mind at once this time. Even if this was a challenging step, it's just about the backtracking. That's a small subset of the challenge.

Finally, my algorithm is stronger this time. It always knows directly where to go next, without rescanning huge chunks of the cave like I did in the original version. This approach is more natural when I take the problem in pieces and it's leading me to a better design.

OK, we've nailed filling the cave cell by cell, but the problem wants us to fill a set number of units. Let's add a spec for that:

it "fill can repeat for a number of units" do
  stream = stream(<<-END_CAVE)
  ####
  ~  #
  #  #
  #  #
  ####
  END_CAVE
  stream.fill(7)
  1.upto(2) do |x|
    1.upto(3) do |y|
      stream.cave[x, y].water?.must_equal(true)
    end
  end
end

It's pretty easy to adapt the fill method to support this argument:

def fill(goal_units = @units + 1)
  while units < goal_units
    if down.open?
      down.fill
      flow_down
    elsif right.open?
      right.fill
      flow_right
    elsif can_backtrack?(:open)
      backtrack
      fill
    else
      break
    end
  end
end

The only clever part here is that I default the argument to one more than the current @units. This means that we keep the old behavior of advancing one cell when no argument is given. I'm so glad Ruby supports me using an instance variable and some calculations in a default like this.

With this code in place, all we are missing is output.

Less Output

Because I'm using specs now, I already have my tight feedback loop. This means that I don't really miss being able to see the maps. That's why I didn't handle output before I did the algorithm.

Now's the time though. I need to get the depths working. That seems like something that belongs in Cave, so I switch back to those specs. First, I'm going to give myself a simple way to stub out cells, so I don't need to actually depend on them for these tests:

def stub(type)
  Object.new.tap do |object|
    object.singleton_class.instance_eval do
      %w[water rock open].each do |query|
        define_method("#{query}?") do
          query == type.to_s
        end
      end
    end
  end
end

That just creates a stand-in object for a Cell. It implements the three type query methods, but will only return true for the passed type.

Using stubs, the spec for depths is pretty simple:

it "can calculate the water depth of each column" do
  water = stub(:water)
  rock  = stub(:rock)
  cave( [ [rock,  rock,  rock,  rock],
          [water, water, rock,  rock],
          [rock,  water, water, rock],
          [rock,  water, water, rock],
          [rock,  rock,  rock,  rock] ] ).depths.must_equal("1 3 2 0")
end

The code to make that pass is similar to what we had before:

def depths
  @cells.transpose.map { |column|
    column.count(&:water?)
  }.join(" ")
end

The count() iterator allows us to count only the blocks that pass the block's condition. In this case, we just want a true result from water?().

We also need to handle the falling water edge case. Here's the spec for that:

it "shows falling water depths as a tilde" do
  water = stub(:water)
  rock  = stub(:rock)
  open  = stub(:open)
  cave( [ [rock,  rock,  rock,  rock],
          [water, water, open,  rock],
          [rock,  water, open,  rock],
          [rock,  open,  open,  rock],
          [rock,  rock,  rock,  rock] ] ).depths.must_equal("1 ~ 0 0")
end

This requires adding some conditional logic to depths():

def depths
  @cells.transpose.map { |column|
    if column.each_cons(2).any? { |l, r| l.water? && r.open? }
      "~"
    else
      column.count(&:water?)
    end
  }.join(" ")
end

I used this Enumerator trick in the last article, but didn't discuss it there. Most iterators in Ruby now return an Enumerator when called without a block. You can think of that object as an iterator frozen in time. You can either advance it forward manually as needed or chain other iterators onto it for compound effects.

In the example above, the call to each_cons(2) prepares an iteration which will give me two consecutive cells at a time. It doesn't happen with that call though. Instead, I chain a call to the any?() iterator onto it, allowing me to ask a question about those two consecutive cells as they flow through the iteration.

We have all the pieces we need now. Let's wrap this up with some end to end specs and an executable.

The Interface

Before I add an executable for this new solution, I want to get it down to a simple method call. I don't want any logic in the executable itself, so having this primary method to hand off to makes that possible. This also makes it easy to test the whole system in action.

This time I'm going to write the code before I think about the specs. I add a new file in lib/hitting_rock_bottom.rb and place this code inside:

require "hitting_rock_bottom/parser"
require "hitting_rock_bottom/stream"

module HittingRockBottom
  module_function

  def solve(input, output = $stdout)
    parser = Parser.new(input)
    Stream.new(parser.cave).fill(parser.units)
    output.puts parser.cave.depths
  end
end

This is the same input/output model as before, but note that I'm passing the stream in with a sensible default. It's great to use the standard streams, as I said in the last article, but always leave yourself a way to change your mind programatically.

At this point I wrote a couple of specs to make sure I had it. I used the simple cave example that I already had in a file, and the edge case modification given in the problem description:

require "minitest/autorun"
require "stringio"

require "hitting_rock_bottom"

describe HittingRockBottom do
  let(:data_dir) {
    File.join(File.dirname(__FILE__), *%w[.. data])
  }
  let(:cave_path)     { File.join(data_dir, "simple_cave.txt") }
  let(:solution_path) { File.join(data_dir, "simple_out.txt")  }

  it "solves the simple cave example" do
    stdout, _ = capture_io do
      open(cave_path) do |simple_cave|
        HittingRockBottom.solve(simple_cave)
      end
    end
    stdout.must_equal(File.read(solution_path))
  end

  it "solves the simple cave edge case from the problem description" do
    stdout, _ = capture_io do
      simple_cave = File.read(cave_path)
      simple_cave.sub!(/\A100/, "45")
      HittingRockBottom.solve(StringIO.new(simple_cave))
    end
    stdout.must_equal( "1 2 2 4 4 4 4 6 6 6 1 1 1 1 ~ " +
                       "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n" )
  end
end

I could have passed a StringIO object for the output, but it turns out that minitest can already capture $stdout and $stderr (not used here) for you. Any guesses for what it uses to do that? See for yourself.

Just to make sure everything is still green, I updated spec/all.rb with the new spec files:

require "cell_spec"
require "cave_spec"
require "parser_spec"
require "stream_spec"
require "hitting_rock_bottom_spec"

Then I ran the whole suite:

$ r -I lib:spec spec/all.rb 
Run options: --seed 2482

# Running tests:

.................

Finished tests in 0.004388s, 3874.2024 tests/s, 8887.8760 assertions/s.

17 tests, 39 assertions, 0 failures, 0 errors, 0 skips

Perfect.

Let's wrap this solution in an executable. I added bin/test_driven with this code:

#!/usr/bin/env ruby -w

begin
  require "hitting_rock_bottom"
rescue LoadError
  $LOAD_PATH << File.join(File.dirname(__FILE__), *%w[.. lib])
  retry
end

HittingRockBottom.solve(ARGF)

Are you confused by my tricky require above? I have a reason for it. If you turn your code into a RubyGem, it won't need to fiddle with the path. The gem makes sure your library code is in the path, so a normal require will work just fine. That's a little annoying while I'm developing though, since I would always need to run the code like this:

$ ruby -I lib bin/test_driven data/complex_cave.txt

I would rather just make the file executable and run it directly:

$ chmod +x bin/test_driven
$ bin/test_driven data/complex_cave.txt

I handle that with the special code above. First, I try the require. It will work if this code is loaded by RubyGems. We're not dealing with a gem in this case though, so a rescue the LoadError that signifies the failed load, add my lib directory to the path manually, and retry the require. This covers my development usage.

We can now compare the new code with the old to see that the algorithm we just built returns the same results and is quite a bit faster:

$ diff <(bin/cowboy data/complex_cave.txt) \
>      <(bin/test_driven data/complex_cave.txt)
$ time bin/cowboy data/complex_cave.txt > /dev/null

real    0m3.725s
user    0m3.711s
sys 0m0.008s
$ time bin/test_driven data/complex_cave.txt > /dev/null

real    0m0.147s
user    0m0.133s
sys 0m0.012s

That's a job well done, if you ask me.