5 JAN 2006
Code as a Data Type
Introduction
This is the first of a series of articles where I will try to demystify some Ruby idioms for the people who come to Ruby through Rails and find themselves wanting to learn a little more about the language under the hood.
Strings, Arrays, ... and Code?
You don't have to code for long in any language before you get intimately familiar with some standard data types.  We all have a fair grasp of Ruby's String and Array, because every language has something similar.  Ruby has an unusual data type though, which can trip up newcomers.  That type is Ruby code itself.
Allow me to explain what I mean, through an example. First, let's create a little in-memory database to work with:
class ClientDB
  Record = Struct.new(:client_name, :location, :projects)
  def initialize
    @records = [ Record.new( "Gray Soft", "Oklahoma",
                             ["Ruby Quiz", "Rails Extensions"] ),
                 Record.new( "Serenity Crew", "Deep Space",
                             ["Ship Enhancements"] ),
                 Record.new( "Neo", "Hollywood", 
                             ["Rails interface for the Matrix"] ) ]
  end
end
Of course we need to be able to query this data. Let's add a very basic query routine:
class ClientDB
  def select(query)
    # parse query String
    field, value = query.split(/\s*=\s*/)
    value.sub!(/\A['"](.+)['"]\z/, '\1')
    # match records
    @records.select { |record| record.send(field) == value }
  end
end
Finally, we should be able to make some queries:
require "pp"
db = ClientDB.new 
pp db.select("client_name = 'Gray Soft'")
pp db.select("location    = 'Hollywood'")
We have the beginnings of a system here, but my queries aren't very powerful yet. Let's add support for a single boolean operator:
class ClientDB
  def select(query)
    # parse query String
    rules = Hash[ *query.split(/\s*AND\s*/).map do |rule|
      rule.split(/\s*=\s*/).map { |value| value.sub(/\A['"](.+)['"]\z/, '\1') }
    end.flatten ]
    # match records
    @records.select do |record|
      rules.all? { |field, value| record.send(field) == value }
    end
  end
end
The good news? We can now enter queries like:
pp db.select("client_name = 'Gray Soft' AND location = 'Oklahoma'")
But there's a lot of bad news too:
- My query language hack is weak and easily broken
- We don't support many great operators like OR and !=
- We still have no access to the projects field
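To show just how fragile the hack is, here is a minimal sketch that pulls the parsing step out into a standalone method. The name parse_rules() is my own invention for this illustration and is not part of ClientDB. A value that happens to contain the word AND is enough to break it:

```ruby
# Hypothetical helper: the same parsing logic as select(), extracted so
# we can look at the rules it builds (parse_rules is an assumed name).
def parse_rules(query)
  pairs = query.split(/\s*AND\s*/).map do |rule|
    rule.split(/\s*=\s*/).map { |value| value.sub(/\A['"](.+)['"]\z/, '\1') }
  end
  Hash[*pairs.flatten]
end

# A well-behaved query parses into field/value pairs.
good = parse_rules("client_name = 'Gray Soft' AND location = 'Oklahoma'")
p good

# A value containing " AND " is split in the wrong place, which leaves
# an odd number of items and makes Hash[] raise an ArgumentError.
failure = begin
  parse_rules("client_name = 'Tom AND Jerry'")
  nil
rescue ArgumentError => error
  error
end
puts "query failed: #{failure.class}" if failure
```

Quoting rules, escaping, and case-insensitive keywords would all need the same kind of patching, which is where the trouble with a homegrown query language starts.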
It's clear that what we really need here is a complete language. But wait, aren't we already using a language? What about Ruby?
This is the line of thinking that leads to Ruby's code blocks.  Any method in Ruby can have a bit of code included with the call.  That method can then yield to that bit of code, pass it parameters, and even get a return value from it.  Let's rework select() again, but this time to use Ruby's code blocks:
class ClientDB
  def select
    @records.select { |record| block_given? && yield(record) }
  end
end
In my experience, you know you're doing Ruby right when you are dropping code and gaining functionality. So how did we do? Check out these queries:
pp db.select { |record| record.client_name != "Gray Soft" }
pp db.select { |record| record.client_name =~ /crew/i }
pp db.select { |record| record.projects.size == 1 }
pp db.select { |record| record.projects.include?("Ruby Quiz") }
# and much, much more...
As you can see, we now have a full language at our command.  We have a complete range of operators, access to goodies like Regular Expressions, and we can easily query the Array sub field in any way that we need.
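If the yield() mechanics still feel a little magical, this small sketch spells them out one step at a time. The pick() method is a hypothetical helper I made up for illustration, and it does roughly what the new select() does, just more verbosely:

```ruby
# Hypothetical helper (not part of ClientDB): keeps the items for which
# the caller's block returns a true value.
def pick(items)
  return [] unless block_given?  # without a block, nothing can match

  picked = []
  items.each do |item|
    verdict = yield(item)        # run the caller's code and capture its result
    picked << item if verdict
  end
  picked
end

short_names = pick(["Gray Soft", "Neo", "Serenity Crew"]) { |name| name.size < 5 }
p short_names       # => ["Neo"]

no_block = pick([1, 2, 3])
p no_block          # => []
```

The method passes each item into the block as a parameter and uses whatever the block returns, which is all there is to it.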
Lambda
All this block stuff is very handy, as you can see. Soon after you start using blocks, you're going to think, "Now if I could just stick this block in a variable..."
Enter lambda().
Tools like block_given?() and yield() can only be used inside the method we passed the block to.  Sometimes we want to hold onto a block object though and pass it around a little:
class ClientDB
  def count(counter)
    counter.call(@records.size)  # call some passed in code block object
  end
end
count = 0
counter = lambda { |items| count += items }
db.count(counter)
pp count  # => 3
In this example, lambda() will just wrap up the provided block in an object you can pass around and use as needed. I've only used it to count a single database here, but I could have easily counted more if we had them.
This example shows off one more interesting aspect of Ruby's blocks: They are closures. Uh oh, ugly Computer Science term alert! It's not as complicated as it sounds. Relax.
See how the lambda() makes use of the local variable count?  That variable gets updated, even though the lambda() object may get passed along and run who knows where.  That's what it means to be a closure.  Ruby's blocks "close over" the binding where they are created and take it with them.  That means they can use the local variables and the value of self from the place where they were defined.
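Here is one more small sketch of the closure idea, assuming a hypothetical make_counter() method that is not part of the database example. The local variable keeps living inside the lambda() even after the method that created it has returned:

```ruby
# Hypothetical helper: returns a lambda that remembers a running total.
def make_counter
  total = 0                          # local variable captured by the lambda
  lambda { |items| total += items }  # each call returns the updated total
end

counter = make_counter               # make_counter has returned, yet total lives on
counter.call(3)
counter.call(2)
running = counter.call(0)
p running           # => 5

other = make_counter                 # every call captures a fresh total
fresh = other.call(1)
p fresh             # => 1
```

Notice that the two counters do not interfere with each other, since each lambda() carries its own binding.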
Putting it all Together
I bet you're wondering what all this has to do with our database example. Let me tell you...
Ruby has one more shortcut for blocks. You can automatically have them wrapped up into an object (or even unwrapped!) at the time of the method call. In other words, I could have written select() like this:
class ClientDB
  def select(&query)
    @records.select { |record| query && query.call(record) }
  end
end
That funny & symbol on the last parameter of a method's parameter list tells Ruby, "Just pretend I had wrapped that block with lambda(), okay?"  Ruby will wrap it up and stick it in the variable for you.
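As a quick sketch of why that wrapping is useful, here is a hypothetical Callback class (my own example, not from the database code) that captures a block at construction time and runs it later. One caveat: the object Ruby builds is a plain Proc, which is a little more lenient than a true lambda() about things like argument counts, so "pretend" is the right word.

```ruby
# Hypothetical class: stores a block now and runs it later.
class Callback
  def initialize(&action)  # & turns the attached block into a Proc object
    @action = action       # this is nil when no block was given
  end

  def fire(value)
    @action ? @action.call(value) : nil
  end
end

doubler = Callback.new { |n| n * 2 }
result = doubler.fire(21)
p result                   # => 42

silent = Callback.new.fire(21)
p silent                   # => nil
```

Neither yield() nor block_given?() could do this, because the block is used in fire(), long after initialize() has finished.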
Now that didn't change anything for our select() routine, but let's try using the same knowledge in a different way:
class ClientDB
  include Enumerable  # use a mix-in to get select(), map(), to_a(), ...
  def each(&block)    # Enumerable requires each()
    @records.each(&block)
  end
end
I'm only using one new trick here:  In each(), I delegate to the each() method of @records by unwrapping the block.  The effect of that is the same as if @records.each() had been called directly and passed the block.
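The unwrapping direction works with any method that takes a block, not just each(). In this minimal sketch, the shout lambda() and the each_name() method are hypothetical names of my own:

```ruby
shout = lambda { |name| name.upcase }
names = ["Gray Soft", "Neo"]

# & at the call site hands the lambda over as if it were a literal block.
unwrapped = names.map(&shout)
literal   = names.map { |name| name.upcase }
p unwrapped         # => ["GRAY SOFT", "NEO"]
p literal           # => ["GRAY SOFT", "NEO"]

# Hypothetical delegating method: wrap the block on the way in, then
# unwrap it again for the Array that does the real work.
def each_name(&block)
  ["Gray Soft", "Neo"].each(&block)
end

seen = []
each_name { |name| seen << name.size }
p seen              # => [9, 3]
```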
Since we defined an each(), we can mix in Enumerable (a future article topic?) to get a whole slew of new options!  Watch this:
pp db.inject(0) { |sum, record| sum + record.projects.size }
pp db.map { |record| record.client_name }
pp db.find { |record| record.client_name == "Gray Soft" }
# and much, much more...
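For anyone who wants to run those queries, here is a sketch that gathers the pieces of ClientDB into a single file, with the results I would expect noted in comments. It is meant to be run on its own rather than pasted after the earlier listings:

```ruby
class ClientDB
  include Enumerable  # supplies select(), map(), inject(), find(), ...

  Record = Struct.new(:client_name, :location, :projects)

  def initialize
    @records = [ Record.new( "Gray Soft", "Oklahoma",
                             ["Ruby Quiz", "Rails Extensions"] ),
                 Record.new( "Serenity Crew", "Deep Space",
                             ["Ship Enhancements"] ),
                 Record.new( "Neo", "Hollywood",
                             ["Rails interface for the Matrix"] ) ]
  end

  def each(&block)    # the one method Enumerable needs from us
    @records.each(&block)
  end
end

db = ClientDB.new

project_total = db.inject(0) { |sum, record| sum + record.projects.size }
p project_total     # => 4

client_names = db.map { |record| record.client_name }
p client_names      # => ["Gray Soft", "Serenity Crew", "Neo"]

gray_soft = db.find { |record| record.client_name == "Gray Soft" }
p gray_soft.location                                   # => "Oklahoma"

# select() now comes from Enumerable, so we no longer define it ourselves.
crews = db.select { |record| record.client_name =~ /crew/i }
p crews.size        # => 1
```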
That ends my tour of Ruby's blocks. Remember, the easiest way to let all this sink in is to tell yourself that code is just another data type in Ruby. Don't be too surprised if that gets you solving problems in a whole new way...
Comments (4)
- Gregory Brown, January 8th, 2006: "In my experience, you know you're doing Ruby right when you are dropping code and gaining functionality." This is almost exactly the words I used to describe the refactoring process in Ruby to my employer. Great article, James! :)
- Looks like you have the beginnings of a mock database system here. You know, we could really use that for the DBI refactoring effort. HINT HINT
- Daniel: See KirbyBase for a much more complete example.
- Hey, I am here 5 years late... It looks like I have a lot of reading to do. Great explanation of the blocks btw.