22 SEP 2014
A Regex Can't Match Balanced Parentheses
Can we do math with regular expressions?
#!/usr/bin/env ruby -w

def build_preparation_regex(number_regex, ops)
  %r{
    (?<number>             #{number_regex}                                   ){0}
    (?<operator>           [#{ops.map(&Regexp.method(:escape)).join}]        ){0}
    (?<term_operator_term> \g<term> \s* \g<operator> \s* \g<term>            ){0}
    (?<term>               \g<number> | \( \s* \g<term_operator_term> \s* \) ){0}

    \g<term_operator_term>(?=\s*\z|[^)])
  }x
end

NUMBER_REGEX               = %r{
  -?            # an optional minus
  \d+           # an integer
  (?: \. \d+)?  # an optional fractional bit
}x
PREPARE_MULT_AND_DIV_REGEX = build_preparation_regex(NUMBER_REGEX, %w[* /])
PREPARE_ADD_AND_SUB_REGEX  = build_preparation_regex(NUMBER_REGEX, %w[* / + -])
CHECK_REGEX                = %r{
  \A                   # the start of the expression
  (?<term>             # a term, which is:
    #{NUMBER_REGEX}    # a number
    |                  # or
    \( \s*             # a parenthesized group of
      \g<term>         # a term
      \s* [*/+\-] \s*  # an operator
      \g<term>         # and another term
    \s* \)             # the end of the parenthesized group
  )
  \z                   # the end of the expression
}x
MATH_REGEX                 = %r{
  \( \s*
  (?<left>     #{NUMBER_REGEX} )
  \s*
  (?<operator> [*/+\-]         )
  \s*
  (?<right>    #{NUMBER_REGEX} )
  \s* \)
}x

verbose = ARGV.delete("-v")
# &. keeps a missing argument from raising before the usage message is shown
problem = ARGV.first&.strip or abort "USAGE: #{$PROGRAM_NAME} MATH_EXPRESSION"
steps   = [ ]

[PREPARE_MULT_AND_DIV_REGEX, PREPARE_ADD_AND_SUB_REGEX].each do |preparation|
  loop do
    steps << problem.dup if verbose
    problem.sub!(preparation) { |term| "(#{term})" } or break
  end
end

problem =~ CHECK_REGEX or abort "Error: Invalid expression"

solution = problem.dup
loop do
  steps << solution.dup if verbose
  solution.sub!(MATH_REGEX) {
    $~[:left].to_f.public_send($~[:operator], $~[:right].to_f)
  } or break
end

puts steps.uniq[0..-2] if verbose
puts solution.sub(/\.0+\z/, "")
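The script above leans on two features of Ruby's regex engine: a named group followed by `{0}` acts as a definition that matches nothing where it is written, and `\g<name>` calls that group later, recursively if needed. As a minimal sketch of the same trick applied to the claim in the title, the following checks for balanced parentheses. The names `BALANCED` and `balanced?` are made up for this illustration and do not appear in the original script.

```ruby
# A sketch, not part of the original script: BALANCED and balanced? are
# hypothetical names chosen for this illustration.
BALANCED = %r{
  # Definition only: {0} means this group matches nothing right here.
  (?<group> \( (?: [^()] | \g<group> )* \) ){0}

  # The actual match: any mix of non-parenthesis characters and
  # balanced groups, from the start of the string to the end.
  \A (?: [^()] | \g<group> )* \z
}x

def balanced?(text)
  BALANCED.match?(text)
end

puts balanced?("(1 + (2 * 3))")  # true
puts balanced?("(1 + (2 * 3)")   # false
```

The recursion lives in `\g<group>` calling itself from inside its own definition, which is the same shape as `term` and `term_operator_term` calling each other in the math script.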
19 SEP 2014
"You can't parse [X]HTML with regex."
The only explanation I'll give for the following code is to provide this link to my favorite Stack Overflow answer.
#!/usr/bin/env ruby -w

require "open-uri"

URL    = "http://stackoverflow.com/questions/1732348/" +
         "regex-match-open-tags-except-xhtml-self-contained-tags"
PARSER = %r{
  (?<doctype_declaration> <!DOCTYPE\b (?<doctype> [^>]* ) > ){0}
  (?<comment>             <!-- .* -->                       ){0}

  (?<script_tag>
    < \s* (?<tag_name> script ) \s* (?<attributes> [^>]* > )
      (?<script> .*? )
    < \s* / \s* script \s* >
  ){0}
  (?<self_closed_tag>
    < \s* (?<tag_name> \w+ ) \s* (?<attributes> [^>]* / \s* > )
  ){0}
  (?<unclosed_tag>
    < \s*
    (?<tag_name> link | meta | br | input | hr | img ) \b
    \s*
    (?<attributes> [^>]* > )
  ){0}
  (?<open_tag>
    < \s* (?<tag_name> \w+ ) \s* (?<attributes> [^>]* > )
  ){0}
  (?<close_tag>
    < \s* / \s* (?<tag_name> \w+ ) \s* >
  ){0}

  (?<attribute>
    (?<attribute_name> [-\w]+ )
    (?: \s* = \s* (?<attribute_value> "[^"]*" | '[^']*' | [^>\s]+ ) )? \s*
  ){0}
  (?<attribute_list>
    \g<attribute>
    (?= [^>]* > \z )  # attributes keep a trailing > to disambiguate from text
  ){0}

  (?<text>
    (?! [^<]* /?\s*> \z )  # a guard to prevent this from parsing attributes
    [^<]+
  ){0}

  \G
  (?:
    \g<doctype_declaration>
    |
    \g<comment>
    |
    \g<script_tag>
    |
    \g<self_closed_tag>
    |
    \g<unclosed_tag>
    |
    \g<open_tag>
    |
    \g<attribute_list>
    |
    \g<close_tag>
    |
    \g<text>
  )
  \s*
}mix

def parse(html)
  stack = [{attributes: [ ], contents: [ ], name: :root}]
  loop do
    html.sub!(PARSER, "") or break
    if $~[:doctype_declaration]
      add_to_tree(stack.last, "DOCTYPE", $~[:doctype].strip)
    elsif $~[:script_tag]
      add_to_stack(stack, $~[:tag_name], $~[:attributes], $~[:script])
    elsif $~[:self_closed_tag] || $~[:unclosed_tag] || $~[:open_tag]
      add_to_stack(stack, $~[:tag_name], $~[:attributes], "", $~[:open_tag])
    elsif $~[:close_tag]
      stack.pop
    elsif $~[:text]
      stack.last[:contents] << $~[:text]
    end
  end
  stack.pop
end

def add_to_tree(branch, name, value)
  if branch.include?(name)
    branch[name] = [branch[name]] unless branch[name].is_a?(Array)
    branch[name] << value
  else
    branch[name] = value
  end
end

def add_to_stack(stack, tag_name, attributes_html, contents, open = false)
  tag = { attributes: parse_attributes(attributes_html),
          contents:   [contents].reject(&:empty?),
          name:       tag_name }
  add_to_tree(stack.last, tag_name, tag)
  stack.last[:contents] << tag
  stack << tag if open
end

def parse_attributes(attributes_html)
  attributes = { }
  loop do
    attributes_html.sub!(PARSER, "") or break
    add_to_tree(
      attributes,
      $~[:attribute_name],
      ($~[:attribute_value] || $~[:attribute_name]).sub(/\A(["'])(.*)\1\z/, '\2')
    )
  end
  attributes
end

def convert_to_bbcode(node)
  if node.is_a?(Hash)
    name = node[:name].sub(/\Astrike\z/, "s")
    "[#{name}]#{node[:contents].map { |c| send(__method__, c) }.join}[/#{name}]"
  else
    node
  end
end

html = open(URL, &:read).strip
ast  = parse(html)
puts ast["html"]["body"]["div"]
  .find { |div| div[:attributes]["class"] == "container" }["div"]
  .find { |div| div[:attributes]["id"] == "content" }["div"]["div"]
  .find { |div| div[:attributes]["id"] == "mainbar" }["div"]
  .find { |div| div[:attributes]["id"] == "answers" }["div"]
  .find { |div| div[:attributes]["id"] == "answer-1732454" }["table"]["tr"]
  .first["td"]
  .find { |div| div[:attributes]["class"] == "answercell" }["div"]["p"]
  .first[:contents]
  .map(&method(:convert_to_bbcode))  # to reach a wider audience
  .join
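For readers who want the core loop of the parser above without the HTML details, here is a minimal sketch of the same pattern: chew one token off the front of the string with `sub!`, then ask `$~` which named group matched. The grammar (`TOKEN`), the `tokenize` method, and the sample input are all invented for this illustration, and `\A` stands in for the `\G` anchor used by `PARSER`.

```ruby
# A sketch with a hypothetical mini-grammar; TOKEN and tokenize are not
# part of the original script.
TOKEN = %r{
  (?<number> \d+    ){0}  # definitions only, matched when called below
  (?<word>   [a-z]+ ){0}

  \A (?: \g<number> | \g<word> ) \s*
}x

def tokenize(input)
  rest   = input.dup  # sub! mutates its receiver, so work on a copy
  tokens = [ ]
  loop do
    rest.sub!(TOKEN, "") or break  # stop when nothing matches at the front
    if $~[:number]
      tokens << [:number, $~[:number].to_i]
    elsif $~[:word]
      tokens << [:word, $~[:word]]
    end
  end
  [tokens, rest]  # anything left in rest could not be tokenized
end

tokens, leftover = tokenize("add 2 and 40 !")
p tokens
p leftover
```

Returning the unconsumed remainder is a choice made for this sketch; the original `parse` simply stops at the first position where `PARSER` no longer matches.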
26 JUN 2014
IPSC 2014 Postmortem
I decided to give the Internet Problem Solving Contest (IPSC) a go this year, with a couple of friends. I've done it in the past and enjoyed it. I like how it only eats a few hours one day and I like how the variety in the problems they give you keeps things interesting.
That said, my performance in the IPSC this year is probably best described as, "Three strikes and you're out!" I did terribly.
I solved one very simple problem. I spent the rest of the contest chasing after a much harder challenge that I couldn't complete in the time allowed.
The worst part is that I made some silly mistakes that I've learned to avoid in the past. As penance, I offer up this article, mainly as a reminder to myself, but hopefully also as a tool that could save others from some of my folly.
Let's start with a simple mistake I made…
Not All Problems are Programming Problems
The IPSC does a great job each year of reminding us that some problems are trivial to solve without programming. It's a good thing they do too, because I seem to need a lot of reminding.
17 MAY 2014
A Library in One Day
I was super inspired by Darius Kazemi's recent blog post on small projects, so I've been looking for ways to speed up my process.
Today, I tried an experiment: develop a library in one day. I wanted to go from an empty repository to a published gem that I could start using.
Topic
Obviously, I had to select a pretty simple idea to use. I wouldn't have time to do a huge project.
I think this may be the killer feature of this technique.
On one hand, you could argue that what I built may not be very library worthy. It's around 50 lines of code. It has ten specs and they really cover what it does. This isn't a complex beast and you could pretty easily hand roll a solution to replace it.
But in some ways that's the best part. I've dropped a 50 line pattern that I like down to a one line Gemfile include. I'm making it even easier for myself to get some mileage out of experimenting with this code. I can mix and match this new library with other small tools to build up the ecosystem that I want for a project. Plus, if it turns out to be something I regret, it's not like I'm tied down to a huge dependency when I go to rip it out. This thinking actually has me wanting to keep this library minimal, at least for now.
28 JAN 2008
Practical Ruby Projects
Practical Ruby Projects is a pretty poorly named title, but, luckily, that doesn't stop it from being a very strong book. The book actually turns out to be an exploring-the-Ruby-programming-language-by-example book. These aren't your trivial beginners-only tasks though. There's enough meat in these pages for the intermediate crowd to really get into.
Let me start by clarifying my earlier comment about the title. It's clear this book is named after the series it appears in, instead of the actual content it holds. There are lots of projects in the book and they are definitely written in Ruby, but Practical is not the word I would use to describe them. Fun, on the other hand, would be a great word. Beyond that, the code and concepts used in these projects are well worth studying. Just don't expect to find the typical (for Ruby) collection of Web programming tips inside. To me, that was a big plus. The title just misrepresents what's inside.
The projects you will find in the book include: MIDI music generation, SVG graphic building, pocket change simulations, a turn-based strategy game, a Mac OS X GUI, a genetic algorithms framework, as well as both a parser and interpreter for the Lisp programming language. While these projects obviously tackle subsets of each problem space, they go deep enough to serve as a solid introduction in each area. The author is also good at focusing on the more interesting aspects of each challenge and throwing in a few twists to keep your interest high.
7 SEP 2007
Marcel at Play
At the preconference charity event for the first Lone Star Rubyconf, Marcel Molina, Jr. gave one of the best talks I've ever heard at a conference. The entire talk was Marcel showing examples of pathological (his word, not mine) Ruby code. Not only did he show all these examples to a room full of people new to the language at an event titled Intro to Ruby, but he actually made a case for these examples being proof positive that Ruby is a reliable language.
This was a unique approach to speeches in that we literally saw a Ruby master at play. While most of us speakers try to put our best foot forward, Marcel embraced the craziness and just had some fun. His attitude was infectious and really drove his point home.
We should all try to have more fun with our speeches. It's so enlightening to watch, for any audience. I hope I have the courage to try a similar talk in the future.
17 FEB 2007
I believe in Ruby
For the contest and all you Bull Durham fans…
Well, I believe in blocks, iterators, closures, that everything should be an object, the power of reflection, garbage collection, exception handling, that multiple inheritance causes more problems than it solves. I believe interpreters should be totally free. I believe there ought to be a constitutional amendment outlawing pointers and verbose syntax. I believe in a strong standard library, green threads, that a language should trust the programmer rather than restrict his efforts, and I believe in the sheer fun of coding that truly is possible to achieve.