What can we learn by using regular expression to do what it cannot do?

SEP
2014

### A Regex Can't Match Balanced Parentheses

Can we do math with regular expressions?

```#!/usr/bin/env ruby -w

def build_preparation_regex(number_regex, ops)
%r{
(?<number>             #{number_regex}                                   ){0}
(?<operator>           [#{ops.map(&Regexp.method(:escape)).join}]        ){0}
(?<term_operator_term> \g<term> \s* \g<operator> \s* \g<term>            ){0}
(?<term>               \g<number> | \( \s* \g<term_operator_term> \s* \) ){0}

\g<term_operator_term>(?=\s*\z|[^)])
}x
end

NUMBER_REGEX               = %r{
-?            # an optional minus
\d+           # an integer
(?: \. \d+)?  # an optional fractional bit
}x
PREPARE_MULT_AND_DIV_REGEX = build_preparation_regex(NUMBER_REGEX, %w[* /])
PREPARE_ADD_AND_SUB_REGEX  = build_preparation_regex(NUMBER_REGEX, %w[* / + -])
CHECK_REGEX                = %r{
\A                   # the start of the expression
(?<term>             # a term, which is:
#{NUMBER_REGEX}    # a number
|                  # or
\( \s*             # a parenthesized group of
\g<term>         # a term
\s* [*/+\-] \s*  # an operator
\g<term>         # and another term
\s* \)             # the end of the parenthesized group
)
\z                   # the end of the expression
}x
MATH_REGEX                 = %r{
\( \s*
(?<left>     #{NUMBER_REGEX} )
\s*
(?<operator> [*/+\-]         )
\s*
(?<right>    #{NUMBER_REGEX} )
\s* \)
}x

verbose = ARGV.delete("-v")
problem = ARGV.first.strip or abort "USAGE:  #{\$PROGRAM_NAME} MATH_EXPRESSION"
steps   = [ ]

loop do
steps << problem.dup if verbose
problem.sub!(preparation) { |term| "(#{term})" } or break
end
end

problem =~ CHECK_REGEX or abort "Error:  Invalid expression"

solution = problem.dup
loop do
steps << solution.dup if verbose
solution.sub!(MATH_REGEX) {
\$~[:left].to_f.public_send(\$~[:operator], \$~[:right].to_f)
} or break
end

puts steps.uniq[0..-2] if verbose
puts solution.sub(/\.0+\z/, "")
```
``````# \$ ruby math.rb '2 + 3 * 4 / 6 - 5 + 1'
# 0
# \$ ruby math.rb -v '2 + 3 * 4 / 6 - 5 + 1'
# 2 + 3 * 4 / 6 - 5 + 1
# 2 + (3 * 4) / 6 - 5 + 1
# 2 + ((3 * 4) / 6) - 5 + 1
# (2 + ((3 * 4) / 6)) - 5 + 1
# ((2 + ((3 * 4) / 6)) - 5) + 1
# (((2 + ((3 * 4) / 6)) - 5) + 1)
# (((2 + (12.0 / 6)) - 5) + 1)
# (((2 + 2.0) - 5) + 1)
# ((4.0 - 5) + 1)
# (-1.0 + 1)
# 0
# \$ ruby math.rb -v '-2.1 * 5'
# -2.1 * 5
# (-2.1 * 5)
# -10.5
# \$ ruby math.rb -v '2 +'
# Error:  Invalid expression
``````

Nice code! I like the approach of defining regexp parts in `build_preparation_regexp` and also the recursive thing (I wasn't aware of this possibility!)

1. ##### Reply (using GitHub Flavored Markdown)

Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

``````\$ ruby math.rb -v '1 + 2 * (3 + 4)'
1 + 2 * (3 + 4)
(1 + 2) * (3 + 4)
((1 + 2) * (3 + 4))
(3.0 * (3 + 4))
(3.0 * 7.0)
21
``````

Is this a known limitation?

1. ##### Reply (using GitHub Flavored Markdown)

Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.