Language Comparisons

Comparisons of programming languages, usually Ruby and something else.

21

MAR
2006

The Why and How of Iterators

A friend of mine has been asking some general questions about iterators in private emails we have traded. I wanted to put some of my answers here, in case they appeal to a wider audience.

Why do we have iterators?

First, let's invent a little data to play with:

>> Name = Struct.new(:first, :last)
=> Name
>> names = [ Name.new("James", "Gray"),
?>           Name.new("Dana", "Gray"),
?>           Name.new("Caleb", "Nordloh"),
?>           Name.new("Tina", "Nordloh") ]
=> [#<struct Name first="James", last="Gray">,
    #<struct Name first="Dana", last="Gray">,
    #<struct Name first="Caleb", last="Nordloh">,
    #<struct Name first="Tina", last="Nordloh">]

Now let's assume we want to print some names. We can use the each() iterator for that, no index:

>> names.each { |name| puts "#{name.last}, #{name.first}" }
Gray, James
Gray, Dana
Nordloh, Caleb
Nordloh, Tina
=> [#<struct Name first="James", last="Gray">,
    #<struct Name first="Dana", last="Gray">,
    #<struct Name first="Caleb", last="Nordloh">,
    #<struct Name first="Tina", last="Nordloh">]

That's not too different from a loop, but let's say we want to find() a specific name:

>> names.find { |name| name.first == "Caleb" }
=> #<struct Name first="Caleb", last="Nordloh">

Maybe we would just like to see which last names are in our records. We can do that by map()ping (another iterator) the names to just the last name and using a simple helper:

>> names.map { |name| name.last }.uniq
=> ["Gray", "Nordloh"]

Perhaps we just want to work with some members of the set. We can select() those:

>> names.select { |name| name.first =~ /^(?:J|C)/ }
=> [#<struct Name first="James", last="Gray">,
    #<struct Name first="Caleb", last="Nordloh">]

Or we can sort_by() certain criteria:

>> names = names.sort_by { |name| [name.last, name.first] }
=> [#<struct Name first="Dana", last="Gray">,
    #<struct Name first="James", last="Gray">,
    #<struct Name first="Caleb", last="Nordloh">,
    #<struct Name first="Tina", last="Nordloh">]

Now remember, in loop centric languages you have one tool for most of this and it looks something like:

for (int i = 0; i < ...; i++) {
  ...
}

You have to provide all the other details each and every time you want to find one object in a list, or toss some objects out. You track the indices, you manage the new Array/Hash/whatever you are putting things in, you break out of the loop when you are finished, etc. Notice how in all the Ruby example above I am only ever focused on the individual object and what I need to do to it. The iterators are handling all of the repetitive and boring work for me, leaving me to focus on the task at hand.

I guess you could say the downside is that you have to learn all the iterator names and what they do instead of just one loop. Ruby tries to lessen the blow on this though, by sticking all the iterators in one mix-in and using that same set in all the standard objects, like Array and Hash. Even better, all you have to do is define the easy each() iterator, mix-in Enumerable just like those standard classes do, and you get all the other iterators for free. So basically, you just learn them once and use them everywhere.

How do we build iterators?

Let's say we want to make a LinkedList in Ruby. Something like:

>> class LinkedList
>>   def initialize(head)
>>     @node = head
>>     @next = nil
>>   end
>>   def value
>>     @node
>>   end
>>   def next(value = nil)
>>     unless value.nil?
>>       @next = self.class.new(value)
>>     end
>>     @next
>>   end
>> end
=> nil

Let's build a quick routine to populate it with some data:

>> def fib_seq
>>   start = LinkedList.new(0)
>>   first = start
>>   sec   = first.next(1)
>>   100.times do
?>     new_node = sec.next(first.value + sec.value)
>>     first    = sec
>>     sec      = new_node
>>   end
>>   start
>> end
=> nil
>> fib = fib_seq
=> ...

Now, let's write the each() iterator for this class, to allow users to walk the values. We will use an optional limit too, since the lists can get quite long:

>> class LinkedList
>>   def each(limit = nil)
>>     current = self
>>     until current.nil? || (!limit.nil? && limit == 0)
>>       yield(current.value)
>>       current = current.next
>>       limit  -= 1 unless limit.nil?
>>     end
>>   end
>> end
=> nil

Notice how I just use yield to hand the values to the block as I get to each one.

Let's see how that works in action:

>> fib.each(3) { |n| puts n }
0
1
1
=> nil
>> fib.each(10) { |n| puts n }
0
1
1
2
3
5
8
13
21
34
=> nil

Let's write one more, the find() iterator (technically we could use a mix-in to get it for free, but seeing it is instructive):

>> class LinkedList
>>   def find(limit = nil)
>>     results = Array.new
>>     each(limit) do |value|
?>       results << value if yield(value)
>>     end
>>     results
>>   end
>> end
=> nil

Here I am using yield to see if the user is interested in this value. I pass the value into the block and expect it to return a true/false answer.

We can use that to find out which of the first 100 Fibonacci numbers are divisible by three:

>> fib.find { |n| n % 3 == 0 }
=> [0, 3, 21, 144, 987, 6765, 46368, 317811, 2178309, 14930352, 102334155,
    701408733, 4807526976, 32951280099, 225851433717, 1548008755920,
    10610209857723, 72723460248141, 498454011879264, 3416454622906707,
    23416728348467685, 160500643816367088, 1100087778366101931,
    7540113804746346429, 51680708854858323072, 354224848179261915075]
Comments (3)
  1. Felix McAllister
    Felix McAllister March 22nd, 2006 Reply Link

    Good post - well explained.

    One thing - if you're putting in code fragments could you leave out the irb output (e.g. >> and =>)? It would make it easier for me to paste into an editor when trying stuff out.

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
    2. Pat
      Pat March 22nd, 2006 Reply Link

      Very useful post.

      Felix: It's more helpful to readers to see what is actually happening, I'm sure. To get rid of the >> stuff you can just paste into an editor and use column editing to remove it.

      1. Reply (using GitHub Flavored Markdown)

        Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

        Ajax loader
    3. James Edward Gray II
      James Edward Gray II March 22nd, 2006 Reply Link

      Thanks for the feedback Felix, but Pat pretty much nailed my thinking. I want readers to see both the cause and effect of the code which makes >> and => critical for telling them apart.

      Maybe we should do a Ruby Quiz to strip them out though... ;)

      1. Reply (using GitHub Flavored Markdown)

        Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

        Ajax loader
Leave a Comment (using GitHub Flavored Markdown)

Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

Ajax loader