My Projects

The projects that I have spent significant time on.

18

NOV
2007

Ghost Wheel Example

There has been a fair bit of buzz around the Treetop parser in the Ruby community lately. Part of that is fueled by the nice screencast that shows off how to use the parser generator.

It doesn't get talked about as much, but I wrote a parser generator too, called Ghost Wheel. Probably the main reason Ghost Wheel doesn't receive much attention yet is that I have been slow in getting the documentation written. Given that, I thought I would show how the code built in the Treetop screencast translates to Ghost Wheel:

#!/usr/bin/env ruby -wKU

require "rubygems"
require "ghost_wheel"

# define a parser using Ghost Wheel's Ruby DSL
RubyParser    = GhostWheel.build_parser do
  rule( :additive,
        alt( seq( :multiplicative,
                  :space,
                  :additive_op,
                  :space,
                  :additive ) { |add| add[0].send(add[2], add[-1])},
             :multiplicative ) )
  rule(:additive_op, alt("+", "-"))

  rule( :multiplicative,
        alt( seq( :primary,
                  :space,
                  :multiplicative_op,
                  :space,
                  :multiplicative ) { |mul| mul[0].send(mul[2], mul[-1])},
             :primary ) )
  rule(:multiplicative_op, alt("*", "/"))

  rule(:primary, alt(:parenthized_additive, :number))
  rule( :parenthized_additive,
        seq("(", :space, :additive, :space, ")") { |par| par[2] } )
  rule(:number, /[1-9][0-9]*|0/) { |n| Integer(n) }

  rule(:space, /\s*/)
  parser(:exp, seq(:additive, eof) { |e| e[0] })
end

# define a parser using Ghost Wheel's grammar syntax
GrammarParser = GhostWheel.build_parser %q{
  additive             =  multiplicative space additive_op space additive
                          { ast[0].send(ast[2], ast[-1]) }
                       |  multiplicative
  additive_op          =  "+" | "-"

  multiplicative       =  primary space multiplicative_op space multiplicative
                          { ast[0].send(ast[2], ast[-1])}
                       |  primary
  multiplicative_op    =  "*" | "/"

  primary              = parenthized_additive | number
  parenthized_additive =  "(" space additive space ")" { ast[2] }
  number               =  /[1-9][0-9]*|0/ { Integer(ast) }

  space                =  /\s*/
  exp                  := additive EOF { ast[0] }
}

if __FILE__ == $PROGRAM_NAME
  require "test/unit"

  class TestArithmetic < Test::Unit::TestCase
    def test_paring_numbers
      assert_parses         "0"
      assert_parses         "1"
      assert_parses         "123"
      assert_does_not_parse "01"
    end

    def test_parsing_multiplicative
      assert_parses "1*2"
      assert_parses "1 * 2"
      assert_parses "1/2"
      assert_parses "1 / 2"
    end

    def test_parsing_additive
      assert_parses "1+2"
      assert_parses "1 + 2"
      assert_parses "1-2"
      assert_parses "1 - 2"

      assert_parses "1*2 + 3 * 4"
    end

    def test_parsing_parenthized_expressions
      assert_parses "1 * (2 + 3) * 4"
    end

    def test_parse_results
      assert_correct_result "0"
      assert_correct_result "1"
      assert_correct_result "123"

      assert_correct_result "1*2"
      assert_correct_result "1 * 2"
      assert_correct_result "1/2"
      assert_correct_result "1 / 2"

      assert_correct_result "1+2"
      assert_correct_result "1 + 2"
      assert_correct_result "1-2"
      assert_correct_result "1 - 2"

      assert_correct_result "1*2 + 3 * 4"
      assert_correct_result "1 * (2 + 3) * 4"
    end

    private

    PARSERS = [RubyParser, GrammarParser]

    def assert_parses(input)
      PARSERS.each do |parser|
        assert_nothing_raised(GhostWheel::FailedParseError) do
          parser.parse(input)
        end
      end
    end

    def assert_does_not_parse(input)
      PARSERS.each do |parser|
        assert_raises(GhostWheel::FailedParseError) { parser.parse(input) }
      end
    end

    def assert_correct_result(input)
      PARSERS.each { |parser| assert_equal(eval(input), parser.parse(input)) }
    end
  end
end

The primary differences you should note from the above code and Treetop are:

  • I show two different ways to build the parser using Ghost Wheel: using a Ruby DSL and using a grammar syntax. I prefer the grammar syntax in this and, in fact, most cases. The Ruby DSL can be handy when you want the AST transformations to be true closures though.
  • Ghost Wheel builds on your regular expression knowledge. Note that the grammar syntax is regex-like and you can even match Regexp literals.
  • Ghost Wheel's AST transformations are more Lispish compared to Treetop's very object oriented syntax. I think they both have strengths in certain scenarios.

There's still plenty I want to do with Ghost Wheel, but maybe this will begin the process of getting the word out about it. Feel free to post questions here.

Comments (7)
  1. Vish
    Vish November 29th, 2007 Reply Link

    James,

    Could this parser be used to check a very limited English vocabulary given a full superset of nouns, verbs and prepositions?

    For illustration, I'd greatly appreciate if you could show how this might be used to generate a parser which verifies that

    "Bob and John go to work" is accepted and so are "John goes to work" and "Bob goes to home" but the following are rejected: "Bob and John goes to work", "John go to work", "John goes to", "Bob John work".

    The superset being-

    [Prepositions: to, and], [Nouns: Bob, John, work, home], [Verbs: go, come, goes, comes]

    I'm not expecting semantic/colloquial correctness so "Bob and work go to John" is acceptable and "Bob goes to home" is acceptable in place of "Bob goes home". (Of course it would be a plus to have that too.)

    My question is motivated by the fact that many DSLs have a very limited vocabulary.

    Thanks in advance

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
    2. James Edward Gray II
      James Edward Gray II November 30th, 2007 Reply Link

      I'm sure you can put something together as long as you are willing to spend the effort, yes. Here's a trivial example that passes your tests, just to get you started:

      #!/usr/bin/env ruby -wKU
      
      require "rubygems"
      require "ghost_wheel"
      
      English = GhostWheel.build_parser %q{
        space       =  /\s+/
      
        noun        =  'Bob' | 'John' | 'work' | 'home'
        nouns       =  noun space 'and' space noun | noun
      
        verb        =  'go'   | 'come'
        verb_plural =  'goes' | 'comes'
      
        preposition =  'to' space
      
        statement   =  noun  space verb_plural space preposition? noun
                    |  nouns space verb        space preposition? noun
        sentence    := statement EOF
      }
      
      if __FILE__ == $PROGRAM_NAME
        require "test/unit"
      
        class TestEnglishGrammar < Test::Unit::TestCase
          def test_good_grammar
            assert_parses "Bob and John go to work"
            assert_parses "John goes to work"
            assert_parses "Bob goes home"
          end
      
          def test_bad_grammar
            assert_does_not_parse "Bob and John goes to work"
            assert_does_not_parse "John goes to"
            assert_does_not_parse "Bob John work"
          end
      
          private
      
          def assert_parses(input)
            assert_nothing_raised(GhostWheel::FailedParseError) do
              English.parse(input)
            end
          end
      
          def assert_does_not_parse(input)
            assert_raises(GhostWheel::FailedParseError) { English.parse(input) }
          end
        end
      end
      
      1. Reply (using GitHub Flavored Markdown)

        Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

        Ajax loader
  2. Phil
    Phil January 17th, 2008 Reply Link

    I wonder why you chose the name "Ghost Wheel" as Ruby 1.9's Regexp engine has the same name (in Japanese). Seems like it could cause potential confusion. Was it intentional?

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
    2. James Edward Gray II
      James Edward Gray II January 17th, 2008 Reply Link

      It was intentional, yes. Ghost Wheel is a parser generator designed to build on your existing regular expression knowledge, so the related names seemed like a good fit to me.

      1. Reply (using GitHub Flavored Markdown)

        Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

        Ajax loader
  3. Robert Dober
    Robert Dober January 1st, 2009 Reply Link

    James
    I had some troubles of finding a starting point for what seems a hack of an interesting project. Actually it would be very helpful if you could just added this example into a README of the ghost_wheel package and or on the Rubyforge Project Homepage.
    Thanx
    Robert

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
    2. James Edward Gray II
      James Edward Gray II January 3rd, 2009 Reply Link

      Done. I know it's not much yet, but it at least guides you to two full examples.

      1. Reply (using GitHub Flavored Markdown)

        Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

        Ajax loader
      2. Robert Dober
        Robert Dober January 16th, 2009 Reply Link

        Thanks a lot!

        1. Reply (using GitHub Flavored Markdown)

          Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

          Ajax loader
Leave a Comment (using GitHub Flavored Markdown)

Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

Ajax loader