Ruby Voodoo

Deep dives into random corners of my favorite programming language.

2 OCT 2008

Working With Multiline Strings

I imagine most Rubyists are aware that Ruby has "heredocs," but do you really know all they can do? Let's find out.

A "here document" is a literal syntax for a multiline String. In its most basic form, it looks like this:

p <<END_HEREDOC
This is a
  multiline,
as is String!
END_HEREDOC
# >> "This is a\n  multiline,\nas is String!\n"

The <<NAME syntax introduces the heredoc, but the content actually begins at the start of the following line. It continues until NAME appears again, alone at the beginning of a line. Note the trailing newline in the example above: the newline before the end marker is part of the String. All of the data between start and finish is packaged up into a String and dropped in where the original <<NAME designator appeared.
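
As a small sketch of those rules (the variable and marker names here are made up for illustration), the heredoc below is assigned to a variable. The marker text also shows up inside the body, indented and followed by other words, and it is treated as ordinary content there:

```ruby
# END_TEXT only ends the heredoc when it stands alone at the start of a line.
text = <<END_TEXT
first line
  END_TEXT (indented, with trailing words) is just content
last line
END_TEXT

p text.lines.length
# >> 3
p text.end_with?("\n")
# >> true
```

The three lines of content and the final newline all end up in the String assigned to text.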

There are some important details in that description, namely that the String begins on the next line and that it's inserted where the heredoc was started. This means that the rest of the line where the heredoc is started can have normal Ruby code (though your editor may syntax highlight it badly):

p <<END_SQL.gsub(/\s+/, " ").strip
SELECT * FROM     users
         ORDER BY users.id DESC
END_SQL
# >> "SELECT * FROM users ORDER BY users.id DESC"

The spacing in the above example is there to make the query easier for a human to read, while gsub() and strip() normalize the actual String. That works because the heredoc content doesn't begin until the next line.
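
The same idea works with any code you care to put on that line. Here is a rough sketch with two made-up examples: a longer method chain hanging off the heredoc, and a heredoc used as the first of several arguments to Kernel#format:

```ruby
# A method chain can follow the heredoc designator directly.
names = <<END_NAMES.split("\n").map(&:strip)
  alice
  bob
END_NAMES
p names
# >> ["alice", "bob"]

# The heredoc can also be one argument among several on the same line.
greeting = format(<<END_TEMPLATE, "world")
Hello, %s!
END_TEMPLATE
p greeting
# >> "Hello, world!\n"
```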

Taking it one step further, the content of the rest of the line can include another heredoc. The second one will begin on the line after the first ends. This continues on down for however many you care to make:

def send_messages(*messages)
  messages.each { |m| p m }
end

send_messages(<<END_MESSAGE_ONE, <<END_MESSAGE_TWO)
This is message one.

...
END_MESSAGE_ONE
Another message.

....
END_MESSAGE_TWO
# >> "This is message one.\n\n...\n"
# >> "Another message.\n\n....\n"
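
Stacked heredocs aren't limited to method arguments. As a hedged sketch (the variable and marker names are just for illustration), here are two of them on the right-hand side of a multiple assignment, with a method call attached to the first one only:

```ruby
# The first heredoc is chomped; the second keeps its trailing newline.
subject, body = <<END_SUBJECT.chomp, <<END_BODY
Weekly report
END_SUBJECT
All systems normal.
END_BODY

p subject
# >> "Weekly report"
p body
# >> "All systems normal.\n"
```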

Another interesting thing to know about heredocs is that they are double-quoted Strings by default. You can use any escapes allowed there as well as interpolation:

p <<DOUBLE_QUOTED
Tricks:
\tindented
\t#{100 + 23}
DOUBLE_QUOTED
# >> "Tricks:\n\tindented\n\t123\n"

However, if you would prefer single-quoted behavior, you can just surround the heredoc name with single quotes:

p <<'SINGLE_QUOTED'
Tricks:
\tindented
\t#{100 + 23}
SINGLE_QUOTED
# >> "Tricks:\n\\tindented\n\\t\#{100 + 23}\n"
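
To see the two behaviors side by side, here is a minimal sketch with made-up names. It also uses the explicit double-quoted form, <<"NAME", which behaves the same as a bare <<NAME:

```ruby
name = "world"

# Double quotes around the marker: escapes and interpolation are processed.
interpolated = <<"EXPLICIT_DOUBLE"
Hello, #{name}!
EXPLICIT_DOUBLE

# Single quotes around the marker: the content is taken literally.
raw = <<'RAW'
Hello, #{name}!
RAW

puts interpolated
# >> Hello, world!
puts raw
# >> Hello, #{name}!
```

The single-quoted form is handy when the content is full of backslashes or #{...} sequences that should pass through untouched.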

There's one more trick with regard to heredoc syntax. If you begin with <<-NAME, the end marker can be indented on the line it appears on. This is mostly useful when you want to inline some code, for example:

module HateMacro
  def self.generate_hate(target)
    module_eval <<-END_RUBY
      def self.hate_#{target}
        puts "#{target.to_s.capitalize} sucks!"
      end
    END_RUBY
  end
end

HateMacro.generate_hate(:emacs)  # Just for Jim Weirich!
HateMacro.hate_emacs
# >> Emacs sucks!

Every line of that String of code keeps its leading whitespace. The space before the end marker is not counted though. This doesn't affect the code of course, but you need to keep it in mind for content where indentation matters.
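
Here is a rough sketch of what that leading whitespace looks like and two ways to deal with it. The method names and usage text are invented for illustration. The gsub() call assumes the content is indented by exactly four spaces, and the <<~ form is a later addition to the language (Ruby 2.3 and up) that strips the common indentation for you:

```ruby
def usage_indented
  <<-END_USAGE
    Usage: tool [options]
      -h  Show help
  END_USAGE
end

# Ruby 2.3+ only: <<~ removes the indentation of the least-indented line.
def usage_squiggly
  <<~END_USAGE
    Usage: tool [options]
      -h  Show help
  END_USAGE
end

p usage_indented
# >> "    Usage: tool [options]\n      -h  Show help\n"
p usage_indented.gsub(/^ {4}/, "")
# >> "Usage: tool [options]\n  -h  Show help\n"
p usage_squiggly
# >> "Usage: tool [options]\n  -h  Show help\n"
```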

A final point of interest that may be of value to TextMate users: TextMate will properly syntax highlight the contents of <<-SQL, <<-HTML, and <<-CODE_FOR_EVAL (as Ruby) heredocs. You do need to use the indented form, though, even if you don't indent the end marker.

Hopefully all of this gives you some new ideas for ways you might handle multiline Strings with Ruby. You don't need heredocs everywhere, but they can clean things up with the right usage.

Comments (5)
  1. George Anderson, October 2nd, 2008

    I thought I knew heredocs, but I never knew about the <<END_SQL.gsub(/\s+/, " ").strip trick. Thanks for the solid post.

  2. Dr Nic, October 3rd, 2008

    I never knew about <<'SINGLE_QUOTED' nor the difference between <<STR and <<-STR. Thanks!

  3. darin, October 3rd, 2008

    Just great - I really enjoy posts that look at one small feature and go in deep. Like the other commenters, there are some neat aspects to heredocs that I didn't know or forgotten about. Thanks!

  4. Charlie Flowers, May 21st, 2009

    This is good stuff. Thank you.

  5. T. Pospíšek, December 13th, 2010

    Two years later and still your post (the gsub hack) was very useful to me - just what I was looking for, thanks!

Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.