Deadly Regular Expressions

What can we learn by using regular expressions to do what they cannot do?

25 SEP 2014

Regex Code Equivalency

#!/usr/bin/env ruby -w

Name = "Gray, James"

!!(Name =~ /\AGray/)      # => true
Name.start_with?("Gray")  # => true

!!(Name =~ /James\z/)    # => true
Name.end_with?("James")  # => true

!!(Name =~ /Dana/)     # => false
Name.include?("Dana")  # => false

!!(Name =~ /\A\z/)  # => false
Name.empty?         # => false

!!(Name =~ /\AGray, James\z/)  # => true
Name == "Gray, James"          # => true

!!(Name =~ /\A(?:Gray, James|Gray, Dana)\z/)  # => true
["Gray, James", "Gray, Dana"].include?(Name)  # => true

Name =~ /\A\w+/ && $&  # => "Gray"
Name[/\A\w+/]          # => "Gray"

Name =~ /\A(\w+),\s*(\w+)\z/ && $2  # => "James"
Name[/\A(\w+),\s*(\w+)\z/, 2]       # => "James"

Name =~ /\A(?<last>\w+),\s*(?<first>\w+)\z/ && $~[:first]  # => "James"
Name[/\A(?<last>\w+),\s*(?<first>\w+)\z/, :first]          # => "James"

Name.scan(/^.*\n?/)  # => ["Gray, James"]
Name.lines           # => ["Gray, James"]

Name.scan(/./m)  # => ["G", "r", "a", "y", ",", " ", "J", "a", "m", "e", "s"]
Name.chars       # => ["G", "r", "a", "y", ",", " ", "J", "a", "m", "e", "s"]

Name.gsub(/[aeiou]/, "")  # => "Gry, Jms"
Name.delete("aeiou")      # => "Gry, Jms"

Name.gsub(/[aeiou]/, "X") # => "GrXy, JXmXs"
Name.tr("aeiou", "X")     # => "GrXy, JXmXs"

# For the destructive operations that follow, you can drop the `dup()` and
# switch `sub()` to `sub!()`, as long as you don't care about the return value
# (`sub!()` returns `nil` when it makes no substitution).

Name.sub(/(?=,)/, " II")                 # => "Gray II, James"
Name.dup.insert(Name.index(","), " II")  # => "Gray II, James"

Name.sub(/\A/, "Name:  ")    # => "Name:  Gray, James"
Name.dup.prepend("Name:  ")  # => "Name:  Gray, James"

Name.sub(/\A.*\z/m, "Gray, Dana")  # => "Gray, Dana"
Name.dup.replace("Gray, Dana")     # => "Gray, Dana"

Name.sub(/\A.*\z/m, "")  # => ""
Name.dup.clear           # => ""



Spacey = "\tsome    space\r\n"

Spacey.sub(/\A\s+/, "")  # => "some    space\r\n"
Spacey.lstrip            # => "some    space\r\n"

Spacey.sub(/\s+\z/, "")  # => "\tsome    space"
Spacey.rstrip            # => "\tsome    space"

Spacey.sub(/\A\s*(.+?)\s*\z/m, '\1')  # => "some    space"
Spacey.strip                          # => "some    space"

Spacey.sub(/(?:\r?\n|\r)\z/m, "")  # => "\tsome    space"
Spacey.chomp                       # => "\tsome    space"

Spacey.sub(/(?:\r\n|.)\z/m, "")  # => "\tsome    space"
Spacey.chop                      # => "\tsome    space"

Spacey.gsub(/ +/, " ")  # => "\tsome space\r\n"
Spacey.squeeze(" ")     # => "\tsome space\r\n"
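
The caveat about return values in the destructive section is easier to see in a short sketch. This is only an illustration: the variable names are made up here and are not part of the original listing.

```ruby
# Both destructive forms edit the receiver in place and end up equal.
by_regex  = "Gray, James".dup
by_method = "Gray, James".dup
by_regex.sub!(/(?=,)/, " II")
by_method.insert(by_method.index(","), " II")
same_result = by_regex == by_method

# The return values are where they part ways: sub!() returns nil when the
# pattern does not match, while prepend() always returns the receiver.
no_comma       = "Gray".dup
sub_return     = no_comma.sub!(/(?=,)/, " II")
prepend_return = no_comma.prepend("Name:  ")

same_result     # => true
sub_return      # => nil
prepend_return  # => "Name:  Gray"
```

Code that chains on the result, or tests it in a conditional, therefore behaves differently depending on which form it uses, even though the receiver ends up the same whenever the pattern matches.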
Comments (2)
  1. Guilherme Simões, September 25th, 2014

    Benchmarking these would be pretty interesting. Still, I'm pretty sure String's native methods are all superior to regexes, performance-wise.

    1. James Edward Gray II, September 25th, 2014

      I would say the String methods are probably preferable in most cases.
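
The performance question raised in the comments can be probed with Ruby's standard `benchmark` library. What follows is a rough sketch, not a rigorous benchmark: the constant names, the iteration count, and the choice of pairs are assumptions made for illustration, and calling each form through a lambda adds overhead to every measurement. No timings are shown because they depend on the Ruby version and the machine.

```ruby
require "benchmark"

NAME       = "Gray, James"
ITERATIONS = 200_000  # arbitrary; raise it for steadier numbers

# A hypothetical selection of pairs from the listing; each lambda runs one form.
CANDIDATES = {
  "=~ /\\AGray/"    => -> { !!(NAME =~ /\AGray/) },
  "start_with?"     => -> { NAME.start_with?("Gray") },
  "gsub(/[aeiou]/)" => -> { NAME.gsub(/[aeiou]/, "") },
  "delete"          => -> { NAME.delete("aeiou") }
}

# Time each candidate and collect the elapsed wall-clock seconds by label.
timings = CANDIDATES.transform_values do |code|
  Benchmark.realtime { ITERATIONS.times { code.call } }
end

timings.each do |label, seconds|
  puts format("%-18s %.4f s", label, seconds)
end
```

Comparing the two lines of each pair gives a first impression; a dedicated benchmarking tool with warm-up runs would be the next step before drawing conclusions.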
