Gray Soft / Rubies in the Rough / Test Driving an Algorithm (Part 1)

1

FEB
2012

Test Driving an Algorithm (Part 1)

This post is part of a series.

I want to take a look at some of the differences between Cowboy Coding and Test-Driven Development. To do that, let's solve a problem both ways and see what we can learn from the exercise.

A Puzzle

I needed some random problem to solve in this article and the PuzzleNode site is pretty handy for that. Two programmers I've been working with have recently experimented with problem number 11, Hitting Rock Bottom, so I am familiar with it. Let's use that.

You will probably want to read through the challenge before finishing this article. You may even want to try solving it yourself, just so you'll be more familiar with what I am doing here. The short, short story is that this problem is about simulating the flow of water into a cave for a fixed amount of time and then measuring the depth at each point. It doesn't take too long to solve.

Setup

Let's setup a project. I created a few directories and pulled down the data files given with the problem:

$ mkdir -p hitting_rock_bottom/{bin,data,lib,spec}
$ cd hitting_rock_bottom/data/
$ for f in simple_cave.txt simple_out.txt complex_cave.txt
> do
>   curl --silent -O \
>   "http://puzzlenode.com/puzzles/11-hitting-rock-bottom/attachments/$f"
> done
$ cd ..

You can see that I used a trivial for loop to get the files. I just copied the first link from the problem description out of my browser (with a right click and "Copy Link"). Then I replaced the file name in the URL.

It was many years before I actually sat down and learned enough Bash to properly be able to write a loop like this. I regret waiting so long too. Teaching yourself a little Bash (or Zsh or whatever) is worth it. I would at least recommend learning some loops and how to construct strings with some combination of raw content, variables, and the output of processes. You can get a lot done with just those tidbits of knowledge. Learning about how Bash manages processes has also been very helpful to me.

Tip 1: learning a bit about how to use your shell pays dividends over time.

Getting back to the files, there are a few things I noted about them:

complex_cave.txt won't be of much help in this exercise, since I won't have the right answers for it until I am done
There's another cave given in the problem example that I can steal if needed, but it's really just an even simpler version of simple_cave.txt
The simple_cave.txt file is hardcoded to 100 units, but the problem description also shows the solution for 45 units which is a useful edge case

I made a mental note of these facts in case I see a need for them later on.

Cowboy Coding

I have a resistance to using a lot of ceremony on a problem like this. I can make myself do it, and I will, but my natural tendencies are to just push something out. It may be instructive to look at why that is, so let's start there.

The first thing we have to do is parse the input. In doing that, I want the tightest possible feedback loop I can get. So I start with the smallest possible step I can imagine: can I read that number off of the first line? I put this code in bin/cowboy:

#!/usr/bin/env ruby -w

units = ARGF.gets.to_i
p units

Then I made that file executable:

$ chmod +x bin/cowboy

Now I was ready to test it:

$ bin/cowboy data/simple_cave.txt
100

We're off to the races. There's plenty to discuss here though.

First, I know I've said this in the past, but it's worth repeating: turning on Ruby's warnings (with -w) is turning on free debugging mode. More and more libraries are cleaning up their warnings, so the excuses not to use this feature are rapidly falling away. It's time.

Tip 2: use warnings for free debugging.

Another point to notice here is my use of the magic ARGF. If you aren't familiar with this constant, it's Ruby's way of providing traditional Unix input handling. If file names are given as program arguments, they will be looped over in left-to-right order by ARGF. This effectively concatenates the files listed into a single input stream. If no files are given, ARGF will read from $stdin instead.

Let's go back to that file concatenation point for a second. Here I'm just using one file, so that doesn't come into play. It is there as an option though and it might prove useful to us. For example, remember that simple_cave.txt is hardcoded to 100 units, but that we also have a useful example with 45 units. If I split the cave into a separate file, I could create header files with each unit count. Then I could just put the count file I want followed by the cave file name on the command-line to test either option. I make another mental note that this could be helpful down the road.

The final thing I want to say about this tiny chunk of code I am beating to death is that a form of Test-Driven Development in disguise. Now, before you all start sending hate mail, please note that I said, "a form of." I realize this isn't automated and I'm not going to keep repeating this particular test, but that's not the point. The point is that I could have kept throwing ideas into this file, then held a debugging session at the end to sort out issues and get it all working. I didn't. I went for the immediate feedback and started double-checking myself as fast as humanly possible. Perhaps this is more properly called Feedback Driven Development, but I learned it from TDD and I strongly believing it's a form of testing.

Tip 3: making testing a part of what you do, especially when you aren't officially testing.

Close the Loop

The next challenge is to get the cave into some useful data structure. I'll start with an Array of Arrays, since that seems pretty obvious in this case. I updated the code to this:

#!/usr/bin/env ruby -w

units = ARGF.gets.to_i
cave  = [ ]
ARGF.each do |line|
  next unless line =~ /\S/  # skip blank lines
  cave << line.strip.split("")
end
p cave

I ran that version, but the truth is that it spewed a bunch of hard to read output. It looked fairly correct, but it was too hard for me to be sure.

Given that, I wanted to close the loop and get back to something I can visually verify a bit easier. This problem really breaks down to this process: parse the input, simulate water movement, write out the depths. That simulate step sounds tough, so I'm going to ignore it for now and skip to the end bit. However, water depths don't sound all that helpful to me either. I would rather see the pretty pictures of water. I'm going to angle for that and assume the depths will be easy enough to get later, when I am ready for them.

I edited bin/cowboy again, adding some output code:

#!/usr/bin/env ruby -w

units = ARGF.gets.to_i
cave  = [ ]
ARGF.each do |line|
  next unless line =~ /\S/  # skip blank lines
  cave << line.strip.split("")
end

# ... simulate water flow here...

cave.each do |row|
  puts row.join
end

That got me back to some useful output, the cave itself:

$ bin/cowboy data/simple_cave.txt 
################################
~                              #
#         ####                 #
###       ####                ##
###       ####              ####
#######   #######         ######
#######   ###########     ######
################################

It's worth noting here again that I went with the standard Unix output option: print to $stdout. The reasoning is still the same. It just gives us more options. By default, it shows in my terminal so I can see what's going on. But I can easily redirect that content to a file or pipe it to another process. Redirecting it into a file will be handy when I'm ready to upload my solution. I still have options, which is great.

Tip 4: if you don't have a reason to do otherwise, model input and output on the Unix filter model for maximum flexibility.

The Wall

I am now to the hard wall that needs climbing.

First, I just want to make sure that I could add some water, so I do a little manual simulation by replacing the work comment with this line:

cave[1][1] = "~"

Then I prove that works as expected:

$ bin/cowboy data/simple_cave.txt 
################################
~~                             #
#         ####                 #
###       ####                ##
###       ####              ####
#######   #######         ######
#######   ###########     ######
################################

That shows me I'm good to go for an algorithm. I just need to make the code in that section do the right thing and I will be able to see the results.

I brainstorm and come up with the following simple but slow algorithm:

We always want to add water to the lowest, rightmost area possible
Walk the grid in reverse: bottom to top and right to left
If the current cell is empty and the cell to above or to the left has water, fill the current cell
End the search and start the process over

I coded that process up, landing on this:

#!/usr/bin/env ruby -w

units = ARGF.gets.to_i
cave  = [ ]
ARGF.each do |line|
  next unless line =~ /\S/  # skip blank lines
  cave << line.strip.split("")
end

(units - 1).times do
  catch(:added) do
    cave.each_with_index.reverse_each do |row, y|
      row.each_with_index.reverse_each do |cell, x|
        if cell == " " && (cave[y - 1][x] == "~" || cave[y][x - 1] == "~")
          cell.replace("~")
          throw :added
        end
      end
    end
  end
end

cave.each do |row|
  puts row.join
end

This worked:

$ bin/cowboy data/simple_cave.txt 
################################
~~~~~~~~~~~~~~~                #
#~~~~~~~~~####~~~~~~~~~~~~     #
###~~~~~~~####~~~~~~~~~~~~~~~~##
###~~~~~~~####~~~~~~~~~~~~~~####
#######~~~#######~~~~~~~~~######
#######~~~###########~~~~~######
################################

However, there are definitely things I don't like about the code above. First, it is too complex. I had to use some dirty tricks. For example, I am using replace() to avoid changing the contents of a collection while iterating. I don't think it would matter for this specific case, but that's just a defensive strategy that I have hardwired by now.

I also had to use catch() and throw() to break out of the nested iteration. This is how you unwind the call stack in Ruby without using an Excpetion for flow control. When you call catch(), you pass some Symbol that it will watch for. Then, anytime in the catch() block, you can call throw() with a Symbol. It will walk backward up the call stack until if finds a catch() that wants to handle that Symbol. Things progress forward after that matching catch() call. You've likely recently read about how great this feature can be, since that post has been making the rounds.

These issues are primarily because I don't have any abstractions to lean on here. If the search were in a method, I could just return instead of using catch() and throw(). I thought about moving it into a method for that reason, but then it would be a little odd that just the middle third of this script is extracted. That would probably lead me to extract the parsing and output as well. Of course these changes lead to better code, but that's more than I wanted to do here.

The biggest problem though is that I had to build the entire hard part all at once. I couldn't find a good place to stop and check in the middle. That worked because I was pretty familiar with this problem and knew what to try, but it was still harder than it needed to be. I made some minor mistakes along the way and had to debug them before I could finish writing this section. That's the price of Cowboy Coding a solution, in my opinion.

A Right Answer

I'm seeing the same pretty pictures we see in the problem description, but that's not the expected final output. I went into the code one last time to add the counts and make the picture optional:

#!/usr/bin/env ruby -w

DISPLAY = ARGV.delete("-d")

units = ARGF.gets.to_i
cave  = [ ]
ARGF.each do |line|
  next unless line =~ /\S/  # skip blank lines
  cave << line.strip.split("")
end

(units - 1).times do
  catch :added do
    cave.each_with_index.reverse_each do |row, y|
      row.each_with_index.reverse_each do |cell, x|
        if cell == " " and (cave[y - 1][x] == "~" or cave[y][x - 1] == "~")
          cell.replace("~")
          throw :added
        end
      end
    end
  end
end

if DISPLAY
  cave.each do |row|
    puts row.join
  end
  puts
end
puts cave.transpose.map { |column|
  str = column.join
  str.include?("~ ") ? "~" : str.count("~")
}.join(" ")

The code to set and check the DISPLAY flag just hides the cave picture, unless we request it:

$ bin/cowboy data/simple_cave.txt 
1 2 2 4 4 4 4 6 6 6 1 1 1 1 4 3 3 4 4 4 4 5 5 5 5 5 2 2 1 1 0 0
$ bin/cowboy -d data/simple_cave.txt 
################################
~~~~~~~~~~~~~~~                #
#~~~~~~~~~####~~~~~~~~~~~~     #
###~~~~~~~####~~~~~~~~~~~~~~~~##
###~~~~~~~####~~~~~~~~~~~~~~####
#######~~~#######~~~~~~~~~######
#######~~~###########~~~~~######
################################

1 2 2 4 4 4 4 6 6 6 1 1 1 1 4 3 3 4 4 4 4 5 5 5 5 5 2 2 1 1 0 0

The rest of the code just adds the depth counts to the output. I accomplish that mainly by flipping the Array around with transpose() so I can work in columns. From there it's a pretty simple matter to check the special case (space under the water) or count water cells.

We can see that this solution agrees with the solution provided in the problem:

$ bin/cowboy data/simple_cave.txt 
1 2 2 4 4 4 4 6 6 6 1 1 1 1 4 3 3 4 4 4 4 5 5 5 5 5 2 2 1 1 0 0
$ cat data/simple_out.txt 
1 2 2 4 4 4 4 6 6 6 1 1 1 1 4 3 3 4 4 4 4 5 5 5 5 5 2 2 1 1 0 0
$ diff <(bin/cowboy data/simple_cave.txt) <(cat data/simple_out.txt)

That last line is another example of helpful Bash knowledge. In this case it allows me to run diff on the output of processes, instead of files. Since diff doesn't show any differences, we know they are the same.

I thought about building the edge case cave and verifying that, but then I realized I could cheat. The complex answer probably requires the edge case, otherwise there's not much point in mentioning it. If it does, it's faster to just let the site judge that solution for me. I tried it. It includes the edge case and it validated. Problem solved.