Rubies in the Rough

This is where I try to teach how I think about programming.

20

JUL
2014

Dave's No Tests Challenge

I've mentioned before my difficulties in the 2014 IPSC. But taking one beating is no reason not to try again. The first loss just showed me that the contest still had more to teach me.

A buddy of mine has spent some time with the crossword problem and told me that he enjoyed it. I didn't try this problem during the actual event, but I was a little familiar with it from my friend's description.

To add to the fun, I decided this would be a great excuse to take up the recent challenge Dave Thomas gave to the Ruby Rogues: "Stop writing tests."

Step 1: Feedback Loops

Without tests to guide me, I really want to see what's going on. One of the biggest advantages of tests, in my opinion, is the feedback loop it provides. So I set out to provide my own feedback.

Since the problem at hand involves filling in a crossword board, the easiest feedback loop I could think of was to see the board as it fills in. The final board is also the required output. Therefor, I decided a good first step would just be to read the board into some data structure and write it back out. Once I had that, I could insert code between those steps to fill it in. And constantly seeing the board evolve would let me eyeball things for obvious mistakes.

I set out to make that happen, with the minimal amount of effort. First I created a project:

$ mkdir -p daves_no_tests_challenge/{bin,data,lib}
$ cd daves_no_tests_challenge/
$ git init
Initialized empty Git repository in /Users/jeg2/Desktop/daves_no_tests_challenge/.git/

Then I built a minimal data structure:

module Crossword
  class Board
    def initialize(squares)
      @squares = squares
    end

    attr_reader :squares
    private     :squares

    def to_s
      squares.map { |row| row.join }.join("\n")
    end
  end
end

That code just takes in and stringifies (for output) some rows of crossword squares. Eventually, this code will need to be able to add answers, but that isn't needed for this first pass.

Next I wrapped my fledgling data structure in some reading and writing code:

require_relative "board"

module Crossword
  class Solver
    def initialize(input, output)
      @input  = input
      @output = output
      @board  = nil
    end

    attr_reader :input, :output, :board
    private     :input, :output, :board

    def solve
      parse_board
      write_board
    end

    private

    def parse_board
      rows, columns = input.gets.scan(/\d+/).map(&:to_i)
      @board        = Board.new(
        Array.new(rows) { input.gets.chars.first(columns) }
      )
    end

    def write_board
      output.puts board
    end
  end
end

That code just parses the Board using the dimensions in the input file and writes it back out. This class gives me a place to hang some clue solving code, when I'm ready for that.

In preparation for adding an executable, I collapsed this process into a simple method call:

require_relative "crossword/solver"

module Crossword
  module_function

  def solve(*args)
    Solver.new(*args).solve
  end
end

Finally, I added the minimal binary:

#!/usr/bin/env ruby -w

require_relative "../lib/crossword"

Crossword.solve(ARGF, $stdout)

If we make that file executable and point it at the easy input, we can see our first empty crossword:

$ chmod +x bin/solve
$ bin/solve data/k1.in                                                                           
........#.#...#.#.#####.......
......######..#.#..#...####...
..####..#.#..#######...#......
....#.######..#.#..#.#.#......
....#...#.#...#.#.######......
....#...#.########.#.#.#####..
.#######....#...#..#.#.#......
.#..#.#.#######......#....#...
.#....#.....#.#.......#...#...
.#....#.########..#########...
.######.#.#.#.#.#.....#...#...
.#....#.#.#.#.#.#.##########..
.....########.#.#.....#...#...
.#..#.#.#.#...#.#########.#...
######...####.#.......#.#.#...
.#..#.#.#.#....############..#
.#..########.#....#...#.#....#
.#..#.#.#...#####.##########.#
.#..#.#.#..#.#....#.....#.#..#
...#########.#########..#.#..#
....#.#.#..#.#....#.#.########
......#.#..#.#########..#.#..#
.#.#..#.#..#.#..#.#.#...#.#...
.#.#.......#.#..#.#.#.#...#...
#####.#########.#.#.#.#...#.#.
.#.#..#.#..#....#..########.#.
.######.#..#######..#.#...#.#.
...#..#.#.......#...#.#.....#.
....######...######...#.....#.
......#.#.............#######.

Seeing this made me realize that there are two sides to crosswords: the board and the clues. I now had some visibility into one side, but what about clues?

For clues I wanted to know, what am I handling and what am I not. If I printed out the not yet handled stuff, I could scan such a list for similar clues and then write the code to handle them. I should then see the unhandled clue list shrink and some answers appear on the board. I could then just repeat that process until I am done.

This requires just two changes to my Solver:

# ...

module Crossword
  class Solver
    # ...

    def solve
      parse_board
      solve_clues
      write_board
    end

    private

    # ...

    def solve_clues
      %i[across down].each do |direction|
        count = gets.to_i
        count.times do
          row, column, clue = gets.split(" ", 3).map.with_index { |str, i|
            i < 2 ? str.to_i : str.strip
          }
          if false  # FIXME:  match a clue
            # FIXME:  write the answer to the board
          else
            warn "No match:  #{clue}"
          end
        end
      end
    end

    # ...
  end
end

I added a solve_clues() step to the overall process, then fleshed out that method just enough to parse clues and consider all of them failures. Notice that I got a bit hand wavy with how the solving stuff was actually going to work. It didn't matter yet. I also put the clue warnings on STDERR because they aren't legally part of the output and I wanted to be able to redirect them and the board independently.

Now I had a lot of output, but it showed me exactly what I needed to do:

No match:  Pokemon which evolves from Oddish
No match:  Pokemon which evolves from Spearow
No match:  Chemical element with the symbol Au
No match:  Pokemon with number #63
No match:  Pokemon with number #137
No match:  Last name of the President of the United States who held the office in 1987
No match:  Chemical element with the symbol Na
No match:  Chemical element with the symbol W
No match:  Pokemon which evolves into Arbok
No match:  Pokemon which evolves from Machoke
…
........#.#...#.#.#####.......
......######..#.#..#...####...
..####..#.#..#######...#......
....#.######..#.#..#.#.#......
....#...#.#...#.#.######......
....#...#.########.#.#.#####..
.#######....#...#..#.#.#......
.#..#.#.#######......#....#...
.#....#.....#.#.......#...#...
.#....#.########..#########...
.######.#.#.#.#.#.....#...#...
.#....#.#.#.#.#.#.##########..
.....########.#.#.....#...#...
.#..#.#.#.#...#.#########.#...
######...####.#.......#.#.#...
.#..#.#.#.#....############..#
.#..########.#....#...#.#....#
.#..#.#.#...#####.##########.#
.#..#.#.#..#.#....#.....#.#..#
...#########.#########..#.#..#
....#.#.#..#.#....#.#.########
......#.#..#.#########..#.#..#
.#.#..#.#..#.#..#.#.#...#.#...
.#.#.......#.#..#.#.#.#...#...
#####.#########.#.#.#.#...#.#.
.#.#..#.#..#....#..########.#.
.######.#..#######..#.#...#.#.
...#..#.#.......#...#.#.....#.
....######...######...#.....#.
......#.#.............#######.

Step 2: Strategies

Satisfied that I could now follow my process without using a test suite, it was time to make this problem easy to solve.

I realized that different clues would need different strategies to solve and I might need to write several of those. This made me want to add a little bit of infrastructure for loading and matching strategies. I figured I could solve that problem now, then focus in entirely on clue specifics.

I sketched out a base class for this approach:

module Crossword
  class Strategy
    def self.strategies
      @strategies ||= [ ]
    end

    def self.inherited(subclass)
      strategies << subclass
    end

    def self.load(dir = File.join(__dir__, "strategies"))
      Dir.glob("#{dir}/*.rb") do |path|
        Kernel.load path
      end
    end

    def self.match(clue)
      strategies.each do |subclass|
        strategy = subclass.new(clue)
        return strategy if strategy.match?
      end
      nil
    end

    def initialize(clue)
      @clue       = clue
      @match_data = nil
    end

    attr_reader :clue, :match_data
    private     :clue, :match_data

    def match?
      @match_data = self.class.const_get(:CLUE_REGEX).match(clue)
    end

    def answer
      fail NotImplementedError, "Strategies must override answer()"
    end
  end
end

The four class methods all work together to support this simple process:

  1. I plan to define subclasses of Strategy
  2. And I'll throw all of them in the same directory
  3. So dynamically load everything in there
  4. Then search through all subclasses to find one that matches the current clue

The two instance methods handle matching and generating answers for clues. Originally I considered forcing subclasses to implement both. However a glance at the clues made me feel that at least some could be matched with a simple regex. I decided to make that the default behavior. Subclasses can still override match?() if they need more complex logic.

My use of a constant for the regex is a little tricky. Had I used a bare constant, Ruby would have resolved it based on the current scope. Put simply, it would have searched for a constant in this class. (That's an over simplification, but true enough here.) It would not find a constant defined in a Strategy subclass, where I wanted to put them. Because of that, I look up the constant manually, using Ruby's reflection capabilities. This code will find a constant in the subclass. The truth is, it will fail with an error if the subclass does not define the constant, but I expect match?() to be overriden in those cases.

Plugging these strategies into the Solver is just four lines of code, explained inline:

require_relative "board"
require_relative "strategy"  # 1:  load the dependency

module Crossword
  class Solver
    # ...

    private

    # ...

    def solve_clues
      Strategy.load  # 2:  dynamically load all available strategies
      %i[across down].each do |direction|
        count = gets.to_i
        count.times do
          row, column, clue = gets.split(" ", 3).map.with_index { |str, i|
            i < 2 ? str.to_i : str.strip
          }
          if (strategy = Strategy.match(clue))  # 3:  find a match for the clue
            # 4:  record the answer on the Board:
            board.send("record_#{direction}", row, column, strategy.answer)
          else
            warn "No match:  #{clue}"
          end
        end
      end
    end

    # ...
  end
end

As you can see, this required the Board to gain some new methods as well, for recording answers:

module Crossword
  class Board
    # ...

    def record_across(row, column, answer)
      answer.downcase.chars.each_with_index do |char, i|
        record(row, column + i, char)
      end
    end

    def record_down(row, column, answer)
      answer.downcase.chars.each_with_index do |char, i|
        record(row + i, column, char)
      end
    end

    # ...

    private

    def record(row, column, char)
      square = squares[row][column]

      fail "Outside of board"  if square.nil?
      fail "Outside of answer" if square == "."
      fail "Answer mismatch"   if !["#", char].include?(char)

      square.replace(char)
    end
  end
end

I did notice that record_across() and record_down() had some duplication that could be removed with some yield magic. This is a pretty simple case though and I decided to let it ride.

You'll noticed that I cranked the paranoia up a bit in record(). Placing letters in a crossword puzzle is a great time to notice several potential problems like overruns or answers that don't agree. I don't know if I would have added all of these guard clauses in the presence of a strong test suite, but I liked seeing them here in the end. I think the extra checks here, where they mattered most, helped me write the rest of the code with more confidence.

This felt like enough infrastructure to start solving actual clues.

I decided to start with the Pokémon questions. Glancing through the lists, I saw two different flavors of clues. So I went hunting for a data file that could answer both types of questions in an easy-to-read format. Having found that, I needed some code to parse the data and allow me to search through it:

require "csv"

module Crossword
  class PokemonList
    def self.loaded
      @loaded ||= new
    end

    def self.method_missing(method, *args, &block)
      if loaded.respond_to?(method)
        loaded.send(method, *args, &block)
      else
        super
      end
    end

    def initialize(path = File.join(__dir__, *%w[.. .. data pokemon.csv]))
      @pokemon = CSV.read(path, headers: true, header_converters: :symbol)
    end

    attr_reader :pokemon
    private     :pokemon

    def find_by_number(number)
      pokemon.find { |record| record[:id] == number }
    end
  end
end

I do the actual parse in initialize(), just handing off to the standard CSV library. The class methods setup a system of fowarding to the instance methods for searching through a loaded data set. The only search I needed to drop the first set of clues was an ability to find a Pokémon by number.

With this database ready to go, the actual strategy for solving the clue is almost a non-event:

require_relative "../pokemon_list"

module Crossword
  class PokemonNumber < Strategy
    CLUE_REGEX = /\APokemon with number #(?<number>\d+)\z/

    def answer
      PokemonList.find_by_number(match_data[:number])[:identifier]
    end
  end
end

You can finally see how my stategies pay off here. I write a regex to match the clue, capturing the key bits, then use those bits to find the right answer when called upon to do so.

It was time to check in on those feedback loops:

No match:  Pokemon which evolves from Oddish
No match:  Pokemon which evolves from Spearow
No match:  Chemical element with the symbol Au
…
........#.#...#.#.#####.......
......######..#.#..#...g###...
..abra..#.#..porygon...r......
....#.######..#.#..#.d.i......
....#...#.#...#.#.###i#m......
....#...#.########.#.t.e####..
.m######....#...#..#.t.r......
.e..#.#.#######......o....#...
.w....#.....#.#.......#...#...
.t....#.##a#####..#########...
.w#####.#.r.#.#.#.....#...#...
.o....#.#.t.#.#.#.hitmonchan..
.....#####i##.#.#.....#...#...
.m..#.#.#.c...#.#########.#...
#a####...#u##.#.......#.#.#...
.c..#.#.#.n....############..#
.h..######o#.s....#...#.#....#
.o..#.#.#...#a###.charmander.#
.p..#.#.#..#.n....#.....#.#..#
...#########.dragonite..#.#..#
....#.#.#..#.s....#.#.primeape
......#.#..#.lickitung..#.#..#
.#.w..#.#..#.a..#.#.#...#.#...
.#.e.......#.s..#.#.#.#...#...
###e#.growlithe.#.#.#.#...#.k.
.#.d..#.#..#....#..########.r.
.##l###.#..#######..#.#...#.a.
...e..#.#.......#...#.#.....b.
....######...pinsir...#.....b.
......#.#.............######y.

There are two things to notice here:

  • There are still plenty of unsolved clues, including another flavor of Pokémon clues
  • The board is starting to fill in and we even see some clues sharing squares

Step 3: The Real Problem

The rest of this puzzle was just about writing strategies to solve the various clues and it went pretty smoothly. I'm not going to show all of them. Checkout the repository on GitHub, if you're curious.

I'll show you the most challenging clue type though, just for fun. Check out this one:

No match:  Last name of the next President of the United States after Harry S. Truman

This clue is still just a lookup in a database, but names are involved. Names can be quite a curve ball. Trust the guy named James Edward Gray II and married to Dana Ann Leslie Gray. There's not a form in the world that handles both of our names gracefully. I knew enough to fear this part, because there was very little chance of the database perfectly matching the clues. Given that, I built an object just to handle name comparisons:

module Crossword
  class Name
    def initialize(full_name)
      @full_name = full_name
    end

    attr_reader :full_name
    private     :full_name

    def first_name
      full_name[/\A\w+/]
    end

    def last_name
      full_name[/\w+\z/]
    end

    def ==(other)
      first_name == other.first_name &&
      last_name  == other.last_name
    end
  end
end

Now I'll be honest: I cheated here. A lot. I didn't need to worry about suffixes because my database didn't include them and I didn't need to consider middle names or initials in the comparisons for these easy clues. (That's not true in the hard input!) I was lucky that both lists were small enough for me to visually scan for issues like that. It saved me a fair bit of code.

The president database is similar to what you saw for Pokémon:

require "csv"

require_relative "name"

module Crossword
  class PresidentList
    def self.loaded
      @loaded ||= new
    end

    def self.method_missing(method, *args, &block)
      if loaded.respond_to?(method)
        loaded.send(method, *args, &block)
      else
        super
      end
    end

    def initialize(path = File.join(__dir__, *%w[.. .. data presidents.csv]))
      @presidents = CSV.read(
        path,
        headers:           true,
        header_converters: :symbol,
        converters:        lambda { |field, field_info|
          if field_info.header == :president
            Name.new(field.sub(" (2nd term)", ""))
          else
            field
          end
        }

      )
    end

    attr_reader :presidents
    private     :presidents

    def find_by_name(name)
      presidents.find { |record| record[:president] == name }
    end

    def find_by_number(number)
      presidents.find { |record| record[:presidency] == number }
    end
  end
end

You can see that I used the same class method forwarding trick here. In fact, I used it in all three databases that I built to solve this problem. Were I not just playing around here, I'm pretty sure I would have felt compelled to extract the common code into a superclass, but I knew there was almost no chance I would need to change it during this exercise. Copy and paste coding turned out to be good enough for this case.

The main difference in the load here is that I had CSV convert the names into Name objects as they were read in. This allows for easier comparisons later on.

The two instance methods are the searches I needed to find some president by name and then look for a later president. Here's the strategy that actually does that:

require_relative "../president_list"
require_relative "../name"

module Crossword
  class PresidentAfter < Strategy
    CLUE_REGEX = /
      \A (?<name>First|Last) \s+ name \s+ of \s+ the \s+
      next \s+ President \s+ of \s+ the \s+ United \s+ States \s+ after \s+
      (?<before>.+) \z
    /x

    def answer
      id  = PresidentList.find_by_name(
        Name.new(match_data[:before])
      )[:presidency].to_i + 1
      PresidentList
        .find_by_number(id.to_s)[:president]
        .send("#{match_data[:name].downcase}_name")
    end
  end
end

These queries pick up some oddities of their individual databases, that could definitely be cleaned up, if this wasn't contest code. That said, the strategy code is pretty short even for this trickier clue. I'm not feeling much pain.

Conclusion

Overall, I really liked the process I slipped into here:

  1. Setup feedback loops
  2. Make the problem easy to solve
  3. Solve the problem

Testing can be a viable way to handle feedback loops, but it's not the only available solution. Sure, I had a couple of times during this experiment when I introduced a bug that I found on the next run, but there weren't many of those and they were easily dealt with. I didn't feel super hindered in this approach. I need to keep exploring this idea though, so I know what the limitations are.

In the end, solving just this easier portion of the problem took me roughly three and a half hours. That's not fair though, because I was able to do a little thinking about how I should handle it before I started.

Comments (0)
Leave a Comment (using GitHub Flavored Markdown)

Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

Ajax loader