18
NOV2007
Ghost Wheel Example
There has been a fair bit of buzz around the Treetop parser in the Ruby community lately. Part of that is fueled by the nice screencast that shows off how to use the parser generator.
It doesn't get talked about as much, but I wrote a parser generator too, called Ghost Wheel. Probably the main reason Ghost Wheel doesn't receive much attention yet is that I have been slow in getting the documentation written. Given that, I thought I would show how the code built in the Treetop screencast translates to Ghost Wheel:
#!/usr/bin/env ruby -wKU
require "rubygems"
require "ghost_wheel"
# define a parser using Ghost Wheel's Ruby DSL
RubyParser = GhostWheel.build_parser do
rule( :additive,
alt( seq( :multiplicative,
:space,
:additive_op,
:space,
:additive ) { |add| add[0].send(add[2], add[-1])},
:multiplicative ) )
rule(:additive_op, alt("+", "-"))
rule( :multiplicative,
alt( seq( :primary,
:space,
:multiplicative_op,
:space,
:multiplicative ) { |mul| mul[0].send(mul[2], mul[-1])},
:primary ) )
rule(:multiplicative_op, alt("*", "/"))
rule(:primary, alt(:parenthized_additive, :number))
rule( :parenthized_additive,
seq("(", :space, :additive, :space, ")") { |par| par[2] } )
rule(:number, /[1-9][0-9]*|0/) { |n| Integer(n) }
rule(:space, /\s*/)
parser(:exp, seq(:additive, eof) { |e| e[0] })
end
# define a parser using Ghost Wheel's grammar syntax
GrammarParser = GhostWheel.build_parser %q{
additive = multiplicative space additive_op space additive
{ ast[0].send(ast[2], ast[-1]) }
| multiplicative
additive_op = "+" | "-"
multiplicative = primary space multiplicative_op space multiplicative
{ ast[0].send(ast[2], ast[-1])}
| primary
multiplicative_op = "*" | "/"
primary = parenthized_additive | number
parenthized_additive = "(" space additive space ")" { ast[2] }
number = /[1-9][0-9]*|0/ { Integer(ast) }
space = /\s*/
exp := additive EOF { ast[0] }
}
if __FILE__ == $PROGRAM_NAME
require "test/unit"
class TestArithmetic < Test::Unit::TestCase
def test_paring_numbers
assert_parses "0"
assert_parses "1"
assert_parses "123"
assert_does_not_parse "01"
end
def test_parsing_multiplicative
assert_parses "1*2"
assert_parses "1 * 2"
assert_parses "1/2"
assert_parses "1 / 2"
end
def test_parsing_additive
assert_parses "1+2"
assert_parses "1 + 2"
assert_parses "1-2"
assert_parses "1 - 2"
assert_parses "1*2 + 3 * 4"
end
def test_parsing_parenthized_expressions
assert_parses "1 * (2 + 3) * 4"
end
def test_parse_results
assert_correct_result "0"
assert_correct_result "1"
assert_correct_result "123"
assert_correct_result "1*2"
assert_correct_result "1 * 2"
assert_correct_result "1/2"
assert_correct_result "1 / 2"
assert_correct_result "1+2"
assert_correct_result "1 + 2"
assert_correct_result "1-2"
assert_correct_result "1 - 2"
assert_correct_result "1*2 + 3 * 4"
assert_correct_result "1 * (2 + 3) * 4"
end
private
PARSERS = [RubyParser, GrammarParser]
def assert_parses(input)
PARSERS.each do |parser|
assert_nothing_raised(GhostWheel::FailedParseError) do
parser.parse(input)
end
end
end
def assert_does_not_parse(input)
PARSERS.each do |parser|
assert_raises(GhostWheel::FailedParseError) { parser.parse(input) }
end
end
def assert_correct_result(input)
PARSERS.each { |parser| assert_equal(eval(input), parser.parse(input)) }
end
end
end
The primary differences you should note from the above code and Treetop are:
- I show two different ways to build the parser using Ghost Wheel: using a Ruby DSL and using a grammar syntax. I prefer the grammar syntax in this and, in fact, most cases. The Ruby DSL can be handy when you want the AST transformations to be true closures though.
- Ghost Wheel builds on your regular expression knowledge. Note that the grammar syntax is regex-like and you can even match
Regexp
literals. - Ghost Wheel's AST transformations are more Lispish compared to Treetop's very object oriented syntax. I think they both have strengths in certain scenarios.
There's still plenty I want to do with Ghost Wheel, but maybe this will begin the process of getting the word out about it. Feel free to post questions here.
Comments (7)
-
Vish November 29th, 2007 Reply Link
James,
Could this parser be used to check a very limited English vocabulary given a full superset of nouns, verbs and prepositions?
For illustration, I'd greatly appreciate if you could show how this might be used to generate a parser which verifies that
"Bob and John go to work" is accepted and so are "John goes to work" and "Bob goes to home" but the following are rejected: "Bob and John goes to work", "John go to work", "John goes to", "Bob John work".
The superset being-
[Prepositions: to, and], [Nouns: Bob, John, work, home], [Verbs: go, come, goes, comes]
I'm not expecting semantic/colloquial correctness so "Bob and work go to John" is acceptable and "Bob goes to home" is acceptable in place of "Bob goes home". (Of course it would be a plus to have that too.)
My question is motivated by the fact that many DSLs have a very limited vocabulary.
Thanks in advance
-
I'm sure you can put something together as long as you are willing to spend the effort, yes. Here's a trivial example that passes your tests, just to get you started:
#!/usr/bin/env ruby -wKU require "rubygems" require "ghost_wheel" English = GhostWheel.build_parser %q{ space = /\s+/ noun = 'Bob' | 'John' | 'work' | 'home' nouns = noun space 'and' space noun | noun verb = 'go' | 'come' verb_plural = 'goes' | 'comes' preposition = 'to' space statement = noun space verb_plural space preposition? noun | nouns space verb space preposition? noun sentence := statement EOF } if __FILE__ == $PROGRAM_NAME require "test/unit" class TestEnglishGrammar < Test::Unit::TestCase def test_good_grammar assert_parses "Bob and John go to work" assert_parses "John goes to work" assert_parses "Bob goes home" end def test_bad_grammar assert_does_not_parse "Bob and John goes to work" assert_does_not_parse "John goes to" assert_does_not_parse "Bob John work" end private def assert_parses(input) assert_nothing_raised(GhostWheel::FailedParseError) do English.parse(input) end end def assert_does_not_parse(input) assert_raises(GhostWheel::FailedParseError) { English.parse(input) } end end end
-
-
I wonder why you chose the name "Ghost Wheel" as Ruby 1.9's Regexp engine has the same name (in Japanese). Seems like it could cause potential confusion. Was it intentional?
-
It was intentional, yes. Ghost Wheel is a parser generator designed to build on your existing regular expression knowledge, so the related names seemed like a good fit to me.
-
-
James
I had some troubles of finding a starting point for what seems a hack of an interesting project. Actually it would be very helpful if you could just added this example into a README of the ghost_wheel package and or on the Rubyforge Project Homepage.
Thanx
Robert-
Done. I know it's not much yet, but it at least guides you to two full examples.
-
Thanks a lot!
-
-