2 OCT 2008
Working With Multiline Strings
I imagine most Rubyists are aware that Ruby has "heredocs," but do you really know all they can do? Let's find out.
A "here document" is a literal syntax for a multiline String. In its most basic form, it looks like this:
p <<END_HEREDOC
This is a
multiline,
as is String!
END_HEREDOC
# >> "This is a\nmultiline,\nas is String!\n"
The <<NAME syntax introduces the heredoc, but the content actually begins at the start of the following line. It continues until NAME occurs again on a line by itself, starting at the beginning of that line. Note the trailing newline in the example above. All of the data between start and finish is packaged up into a String and dropped in where the original <<NAME designator appeared.
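As a small sketch of that rule, a heredoc can be assigned to a variable like any other String expression. The names `text` and `END_TEXT` are made up for this illustration and do not come from the examples above.

```ruby
# Hypothetical names: `text` and END_TEXT are illustrative only.
text = <<END_TEXT
first line
second line
END_TEXT

# The result is an ordinary String, and it keeps the newline
# that precedes the terminator line.
p text.class
# >> String
p text.end_with?("\n")
# >> true
p text.lines.size
# >> 2
```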
There are some important details in that description, namely that the String begins on the next line and that it's inserted where the heredoc was started. This means that the rest of the line where the heredoc is started can have normal Ruby code (though your editor may syntax highlight it badly):
p <<END_SQL.gsub(/\s+/, " ").strip
SELECT * FROM users
ORDER BY users.id DESC
END_SQL
# >> "SELECT * FROM users ORDER BY users.id DESC"
The spacing in the above example was to make it easier for a human to understand, but I use gsub() and strip() to normalize the actual String. You can do that since the heredoc doesn't begin until the next line.
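The same idea applies when the heredoc is one argument among several: the other arguments stay on the opening line, and the body still starts on the next line. The sketch below uses a hypothetical helper named `shout`, which is not part of the original example.

```ruby
# Hypothetical helper, for illustration only.
def shout(text, suffix)
  text.strip.upcase + suffix
end

# The second argument and the closing parenthesis sit on the opening line;
# the heredoc body does not start until the line after it.
result = shout(<<END_TEXT, "!")
hello from a heredoc
END_TEXT

p result
# >> "HELLO FROM A HEREDOC!"
```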
Taking it one step further, the content of the rest of the line can include another heredoc. The second one will begin on the line after the first ends. This continues on down for however many you care to make:
def send_messages(*messages)
  messages.each { |m| p m }
end
send_messages(<<END_MESSAGE_ONE, <<END_MESSAGE_TWO)
This is message one.
...
END_MESSAGE_ONE
Another message.
...
END_MESSAGE_TWO
# >> "This is message one.\n...\n"
# >> "Another message.\n...\n"
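Stacked heredocs are not limited to method arguments. As a hedged sketch, the same ordering rule should hold on the right-hand side of a multiple assignment; the variable names and terminators here are invented for the example.

```ruby
# Hypothetical names, for illustration only.
# Each heredoc body appears in the order its <<NAME was written.
subject, body = <<END_SUBJECT.chomp, <<END_BODY
Heredocs
END_SUBJECT
Two heredocs can share one line.
Their bodies follow in the same order.
END_BODY

p subject
# >> "Heredocs"
p body
# >> "Two heredocs can share one line.\nTheir bodies follow in the same order.\n"
```

Note that the call to chomp applies only to the first heredoc, because it is attached to that designator.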
Another interesting thing to know about heredocs is that they are double-quoted Strings by default. You can use any escapes allowed there as well as interpolation:
p <<DOUBLE_QUOTED
Tricks:
\tindented
\t#{100 + 23}
DOUBLE_QUOTED
# >> "Tricks:\n\tindented\n\t123\n"
However, if you would prefer single-quoted behavior, you can just surround the heredoc name with single quotes:
p <<'SINGLE_QUOTED'
Tricks:
\tindented
\t#{100 + 23}
SINGLE_QUOTED
# >> "Tricks:\n\\tindented\n\\t\#{100 + 23}\n"
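One way to check the two behaviors is to compare each heredoc against an ordinary quoted literal. This is only a sketch, and the variable names and terminators are made up for the comparison.

```ruby
# Hypothetical names, for illustration only.
double = <<DOUBLE
sum: #{1 + 2}\tdone
DOUBLE

single = <<'SINGLE'
sum: #{1 + 2}\tdone
SINGLE

# The default heredoc interpolates and processes the \t escape.
p double == "sum: 3\tdone\n"
# >> true

# The single-quoted heredoc keeps both the #{...} and the \t verbatim.
p single == 'sum: #{1 + 2}\tdone' + "\n"
# >> true
```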
There's one more trick with regard to heredoc syntax. If you begin with <<-NAME, the end marker can be indented on the line it appears on. This is mostly useful when you want to inline some code, for example:
module HateMacro
  def self.generate_hate(target)
    module_eval <<-END_RUBY
      def self.hate_#{target}
        puts "#{target.to_s.capitalize} sucks!"
      end
    END_RUBY
  end
end
HateMacro.generate_hate(:emacs) # Just for Jim Weirich!
HateMacro.hate_emacs
# >> Emacs sucks!
That String of code will have a bunch of whitespace at the beginning of each line. The space before the end marker is not counted, though. This doesn't affect the code, of course, but you need to keep it in mind for other content.
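To make the whitespace caveat concrete, here is a small sketch with hypothetical method names. The second method assumes Ruby 2.3 or newer, which added the <<~NAME form well after this post was written; that form removes the common leading indentation from the content.

```ruby
# Hypothetical method names, for illustration only.

# With <<-, only the terminator may be indented; the content keeps
# every leading space it was written with.
def indented_list
  <<-END_LIST
    one
    two
  END_LIST
end

p indented_list
# >> "    one\n    two\n"

# Assumption: Ruby 2.3 or newer. The <<~ form strips the common
# leading indentation from the content lines.
def squiggly_list
  <<~END_LIST
    one
    two
  END_LIST
end

p squiggly_list
# >> "one\ntwo\n"
```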
A final point of interest that may be of value to TextMate users: TextMate will properly syntax highlight the contents of <<-SQL, <<-HTML, and <<-CODE_FOR_EVAL (as Ruby) heredocs. You do need to use the indented form, though, even if you don't indent the content.
Hopefully all of this gives you some new ideas for ways you might handle multiline Strings with Ruby. You don't need heredocs everywhere, but they can clean things up with the right usage.
Comments (5)
- George Anderson, October 2nd, 2008: I thought I knew heredocs, but I never knew about the <<END_SQL.gsub(/\s+/, " ").strip trick. Thanks for the solid post.
- I never knew about <<'SINGLE_QUOTED' nor the difference between <<STR and <<-STR. Thanks!
- Just great - I really enjoy posts that look at one small feature and go in deep. Like the other commenters, there are some neat aspects to heredocs that I didn't know or forgotten about. Thanks!
- This is good stuff. Thank you.
- Two years later and still your post (the gsub hack) was very useful to me - just what I was looking for, thanks!