Ruby Voodoo

Deep dives into random corners of my favorite programming language.

10 OCT 2008

All About Struct

I build small little data classes all the time and there's a reason for that: Ruby makes it trivial to do so. That's a big win because we all know that what is a trivial data class today will be tomorrow's super object, right? If I start out using a simple Array or Hash, I'll probably end up redoing most of the logic at both ends eventually. Or I can start with the trivial class and grow it naturally.

The key to all this though is that I don't write those classes myself! That's what Ruby is for. More specifically, you need to learn to love Struct. Allow me to show you what I mean.

Imagine I need a basic class to represent a Contact. Ruby gives us so many shortcuts that the class could be very small even without Struct:

class Contact
  def initialize(first, last, email)
    @first = first
    @last  = last
    @email = email
  end

  attr_accessor :first, :last, :email
end

You could shorten that up more with some multiple assignment if you like, but that's the basics. Now using Struct is even easier:

Contact = Struct.new(:first, :last, :email)

To be fair, that's not 100% the same as the previous code. My original class required that three arguments get passed to the constructor whereas Struct is more lenient:

p Contact.new(*%w[James Gray james@grayproductions.net])
# >> #<struct Contact first="James",
#                     last="Gray",
#                     email="james@grayproductions.net">
p Contact.new(*%w[James Gray])
# >> #<struct Contact first="James", last="Gray", email=nil>
p Contact.new("James")
# >> #<struct Contact first="James", last=nil, email=nil>
p Contact.new
# >> #<struct Contact first=nil, last=nil, email=nil>

As you can see, all arguments to the constructor Struct builds for you are optional. It will just fill in the passed values, from left to right. This may or may not be an advantage for your needs.
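If that leniency is a problem for you, one possible workaround is to override initialize using the block form of Struct.new that I cover later in this post. Treat this as a sketch only: StrictContact is a made-up name and the arity check is an illustration, not something Struct provides for you.

```ruby
# StrictContact is a hypothetical name used only for this sketch.
StrictContact = Struct.new(:first, :last, :email) do
  # Insist on exactly one argument per member before handing off to Struct.
  def initialize(*args)
    expected = self.class.members.size
    unless args.size == expected
      raise ArgumentError,
            "wrong number of arguments (#{args.size} for #{expected})"
    end
    super  # a bare super forwards the same arguments to Struct's constructor
  end
end

full = StrictContact.new("James", "Gray", "james@grayproductions.net")
puts full.email
# >> james@grayproductions.net

begin
  StrictContact.new("James")
rescue ArgumentError => error
  puts error.message
end
# >> wrong number of arguments (1 for 3)
```

The trade-off is that you give up the fill-from-the-left behavior entirely, so only reach for this when missing values really are a bug.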

Now let's look once more at how I built that Struct:

ClassName = Struct.new(...)

Struct::new() builds and returns Class objects. If you then assign the result to a constant, you can pretty much treat it like any other class you build. With a normal class definition, Ruby just handles that constant assignment for you.
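A quick way to convince yourself of this is to hold on to the result before naming it. Point is a hypothetical name I'm using just for this sketch:

```ruby
# Point is a hypothetical name used only for this sketch.
anonymous = Struct.new(:x, :y)

p anonymous.class
# >> Class
p anonymous.superclass
# >> Struct
p anonymous.name          # no name until it is assigned to a constant
# >> nil

Point = anonymous
p Point.name
# >> "Point"
p Point.new(1, 2)
# >> #<struct Point x=1, y=2>
```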

You can forgo the assignment to a constant if you pass the first argument as a constant name in String form:

Struct.new("Contact", :first, :last, :email)

This would not define a top-level Contact, but instead a Struct::Contact. Given that, your name must be unique among all Structs defined when using this approach.
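Here is a small sketch of where the class ends up, assuming it runs in a fresh script where nothing else has defined a Contact yet:

```ruby
klass = Struct.new("Contact", :first, :last, :email)

p klass
# >> Struct::Contact
p klass.equal?(Struct::Contact)
# >> true

# No top-level constant was created (assuming a fresh script).
p Object.const_defined?(:Contact)
```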

Getting back to our fledgling Contact, it's important to note that it does have all the getter and setter methods for the attributes:

c = Contact.new
p c
# >> #<struct Contact first=nil, last=nil, email=nil>
c.first = "James"
c.last  = "Gray"
p c
# >> #<struct Contact first="James", last="Gray", email=nil>
p c.last
# >> "Gray"

Again, Struct always defines both getter and setter for all attributes. That may or may not work for you.

You can also set values using a Hash-like syntax:

c[:email] = "james@grayproductions.net"
p c["email"]
# >> "james@grayproductions.net"

String and Symbol keys are interchangeable.
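Integer positions work too, and asking for something that isn't there raises an error instead of quietly returning nil the way a Hash would. A minimal sketch (the :phone key is just an example of a member that doesn't exist):

```ruby
Contact = Struct.new(:first, :last, :email)
c = Contact.new("James", "Gray", "james@grayproductions.net")

p c[0]
# >> "James"
p c[-1]
# >> "james@grayproductions.net"

begin
  c[:phone]          # not a member of this Struct
rescue NameError => error
  puts "#{error.class}: #{error.message}"
end

begin
  c[3]               # only positions 0 through 2 exist
rescue IndexError => error
  puts "#{error.class}: #{error.message}"
end
```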

Another awesome feature is that Struct gives you other ways to go through this data. Here are three of my personal favorites:

p c.members
# >> [:first, :last, :email]   (Ruby 1.8 returned Strings here)
p c.values
# >> ["James", "Gray", "james@grayproductions.net"]
c.each_pair do |name, value|
  puts "#{name}: #{value}"
end
# >> first: James
# >> last: Gray
# >> email: james@grayproductions.net
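Those aren't the only options. Struct mixes in Enumerable, can be splatted into variables, and compares by value. A short sketch of a few of those:

```ruby
Contact = Struct.new(:first, :last, :email)
c = Contact.new("James", "Gray", "james@grayproductions.net")

# Enumerable-style iteration runs over the values.
p c.select { |value| value.include?("@") }
# >> ["james@grayproductions.net"]

# A Struct splats into its values, in member order.
first, last, email = *c
puts "#{last}, #{first}"
# >> Gray, James

# Two instances of the same Struct with equal values are ==.
same = Contact.new(*c.to_a)
p c == same
# >> true
p c.equal?(same)   # but they are still separate objects
# >> false
```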

We get a lot of functionality for free. That's obvious. But eventually you are always going to want to add your own. In times past, the following was a common idiom for that:

class Contact < Struct.new(:first, :last, :email)
  # ...
end

This is a neat example because it shows how flexible Ruby is. The parent Class for a Class definition doesn't have to be just a constant name. It can actually be any code that results in a Class object. Then we can just inherit from it and add to it as we like.

However, projects like Rails have shown the error of this approach. Because Rails is often dynamically reloading code, Class definitions will be rerun. That means the Struct call will happen again, resulting in a fresh parent Class object (which happens to have the exact same behavior). Ruby will see the new parent for an existing definition and choke with an error:

superclass mismatch for class Contact (TypeError)
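You don't need Rails to see this. Running the same definition twice in one script is enough to simulate a reload, as this sketch shows:

```ruby
class Contact < Struct.new(:first, :last, :email)
end

mismatch = nil
begin
  # Simulate a reload: the same definition runs a second time,
  # so Struct.new hands back a brand new parent class.
  class Contact < Struct.new(:first, :last, :email)
  end
rescue TypeError => error
  mismatch = error.message
end

puts mismatch
# >> superclass mismatch for class Contact
```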

The good news is that this isn't much of an issue because Struct plans for it. It's prepared to accept a block during definition and the contents of that block will be evaluated within your new Class. Thus it's trivial to add methods:

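Contact = Struct.new(:first, :last, :email) do
  # Pair each member name with its value and build a Hash from the pairs.
  # (Skipping flatten keeps Array values intact; needs Ruby 1.8.7 or later.)
  def to_hash
    Hash[members.zip(values)]
  end
end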

# ...

p c.to_hash
# >> {:first=>"James",
#     :last=>"Gray",
#     :email=>"james@grayproductions.net"}
# (Ruby 1.8 printed String keys in arbitrary order; newer Rubies may format the Hash differently.)
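As far as I know, Ruby 2.0 and later ship a built-in to_h on Struct, so a hand-rolled conversion like this is mostly needed on older interpreters. A quick sketch, assuming a reasonably modern Ruby:

```ruby
Contact = Struct.new(:first, :last, :email)
c = Contact.new("James", "Gray", "james@grayproductions.net")

h = c.to_h           # built in on Ruby 2.0+
p h.keys
# >> [:first, :last, :email]
p h[:last]
# >> "Gray"
```

The block form is still the way to add any other methods you need.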

There is a gotcha when defining methods this way though. Struct cheats on the internal implementation and doesn't actually place values in real instance variables. Thus, you will need to stick to accessing your data through the method interface:

Contact = Struct.new(:first, :last, :email) do
  def full
    "#{first} #{last}".strip
  end
end

# ...

p c.full
# >> "James Gray"
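To see the gotcha in action, compare a method that reaches for instance variables with one that uses the accessors. The broken_full method is a deliberately wrong example I made up for this sketch:

```ruby
Contact = Struct.new(:first, :last, :email) do
  # Deliberately wrong: Struct does not store values in @first or @last.
  def broken_full
    "#{@first} #{@last}".strip
  end

  # Correct: go through the generated accessor methods.
  def full
    "#{first} #{last}".strip
  end
end

c = Contact.new("James", "Gray")

p c.instance_variables
# >> []
p c.broken_full
# >> ""
p c.full
# >> "James Gray"
```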

If Struct isn't dynamic enough for you, you may want to examine the standard OpenStruct library. It's essentially a Hash with a method interface, allowing you to change attributes as needed. You can also initialize it with a Hash, if needed:

require "ostruct"

name = OpenStruct.new(:first => "James", :last => "Gray")
p name.last
# >> "Gray"

name.suffix = "II"  # add an attribute
p name.suffix
# >> "II"

Sadly, OpenStruct is missing most of the niceties of Struct. Because of that, I don't feel it buys you much over a Hash.
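One concrete difference is how the two react to a typo. This sketch uses a hypothetical Name Struct next to an equivalent OpenStruct and misspells first on purpose:

```ruby
require "ostruct"

# Name is a hypothetical Struct used only for this comparison.
Name   = Struct.new(:first, :last)
strict = Name.new("James", "Gray")
loose  = OpenStruct.new(:first => "James", :last => "Gray")

# OpenStruct treats the misspelled reader as an attribute that was never set.
p loose.frist
# >> nil

# Struct only defines readers for its declared members.
begin
  strict.frist
rescue NoMethodError => error
  puts "Struct caught the typo: #{error.class}"
end
# >> Struct caught the typo: NoMethodError
```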

Keep Struct in mind next time you need a simple data object. It's hardly any effort to set up, it comes fully loaded with options, and it can grow as your needs do.

Comments (8)
  1. Ryan
    Ryan July 29th, 2010 Reply Link

    Wow, this was a huge help. I have been beating my head against my desk trying to figure out how to properly encode a string. Kept getting ASCII 8-Bit to UTF-8 errors, but using encode! with undef and replace in the options hash worked like a charm. Thanks!


      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

  2. damphyr
    damphyr October 24th, 2008 Reply Link

    Nice! One more trick up the sleeve.

    Essentially though the Constant=Struct.new{} gives us another idiom for defining classes - which will prove confusing for a few people. I love it!

    Once upon a time (we're talking the long ago of internet time here) I used OpenStruct a lot instead of mocks. It's still lounging deep in some of my configuration code.
    Actually with the use of mocks I further reduced OpenStruct to the cases where I want to retrieve something from the mocked instance (just a contrived example):

    m=OpenStruct.new
    m.expects(:something).returns("something_else")
    testable_target.does_something_with(m)
    assert_equals("expected",m.foo)
    
  3. John Conti
    John Conti December 15th, 2009 Reply Link

    James,

    Thanks for the post. I like the bit about creating a Struct with a block to define methods. In looking at the generated documentation at ruby-doc.org, I don't see this functionality.

    Is this an omission in the docs? How did you come across such a nice juicy tidbit?

    Thank you.

    3. James Edward Gray II

      Ruby has another interesting exception when it comes to Encoding incompatibility that may be worth mentioning. You can't typically add Strings with incompatible Encodings as this shows:

      $ cat incompatible.rb 
      # encoding: UTF-8
      utf8 = "一"
      sjis = "二".encode("Shift_JIS")
      puts "#{utf8.encoding} + #{sjis.encoding} ="
      utf8 + sjis
      $ ruby incompatible.rb 
      UTF-8 + Shift_JIS =
      incompatible.rb:5:in `<main>': incompatible character encodings:
      UTF-8 and Shift_JIS (Encoding::CompatibilityError)
      

      However, Ruby does keep an eye on String content (mainly for optimization purposes) and when both Strings contain only 7-bit ASCII, an exception will be made:

      $ cat ascii.rb 
      # encoding: UTF-8
      
      utf8 = "abc"
      sjis = "def".encode("Shift_JIS")
      
      print "Given all ASCII data:  " if [utf8, sjis].all?(&:ascii_only?)
      print "#{utf8.encoding} + #{sjis.encoding} = "
      
      result = utf8 + sjis
      $ ruby ascii.rb 
      Given all ASCII data:  UTF-8 + Shift_JIS = UTF-8
      

      There are a few points of interest in this little example. First, note the ascii_only?() method to check for these special cased Strings. Next, notice that Ruby did do the concatenation, even though these are not compatible Encodings. Finally, the result had an Encoding of UTF-8, simply because that was the Encoding of the first (leftmost) String. It would have been Shift_JIS had I reversed them.

      I still don't really recommend relying on these special behaviors though. I believe you will encounter fewer problems if you stick to my advice of normalizing String Encodings before working with mixed data.

      2. Joe Marty
        Joe Marty September 7th, 2011 Reply Link

        Thank you for this amazingly helpful series!
        I was curious about your comment regarding an "interesting exception when it comes to Encoding incompatibility" when using only 7-bit ASCII. Is this actually an exception? The impression I got was that the .compatible? method would compare string content and find out if one of the encodings could (or if there is an encoding that could) potentially contain all of the characters in both strings. Or does it simply compare the encodings against a table of 100% compatible encodings, and return the result?

        In the former case, the fact that both strings contain 7-bit ASCII is just a coincidence, not a special exception, and what really matters is the fact that all the characters in both strings can be encoded in UTF-8, so compatible? returns UTF-8, and adding strings, therefore, uses UTF-8... however I have not tried this or figured out how to set up an experiment to see if other cases work the same way.

        Do you know which is the case?

        2. James Edward Gray II
          James Edward Gray II September 7th, 2011 Reply Link

          I'm pretty sure Ruby does not compare contents. That could get pretty inefficient on large Strings.

          I believe the case is that it compares encodings, with the special exception for 7-bit ASCII Strings.

  4. Chris Patuzzo
    Chris Patuzzo September 13th, 2012 Reply Link

    I find myself doing this quite a bit:

    class Foo < Struct.new(:a, :b, :c)
      def initialize(*args)
        super
        extra_initialization
      end
    end
    

    It's not the prettiest, but it gives you all that Struct goodness without the limitation of no initializer and without duplicated arguments.

  5. James Edward Gray II
    James Edward Gray II January 11th, 2010 Reply Link

    Without knowing where the error comes from, the only thing I can see that might be an issue is if params[:id] contained non-UTF-8 bytes. This question is probably better asked on the Rails mailing list though.
