Early Steps

My content targeted at the newer Rubyists among us.

7

JAN
2009

The Evils of the For Loop

I've never liked the for…in loop in Ruby. I cringe every time I see it in examples (Rails seems to put it in views a lot) and I tend to switch it to an each() call. It really bugs me.

That's mostly just my gut reaction, but if I had to put it into words it's that I fell in love with Ruby's iterators early on and for just doesn't seem to fit in well with them. I don't think that's just my emotions talking either, it really doesn't fit in. I'll try to show you why I say that.

First, let's see what I'm talking about. We are all pretty comfortable with each(), right?

(1..3).each { |i| p i }
# >> 1
# >> 2
# >> 3

I doubt that surprises anyone. Many of you probably also know that Ruby allows you to write that as:

for i in 1..3
  p i
end
# >> 1
# >> 2
# >> 3

That's almost the same thing. It really does use each() under the hood, for example:

class MyEachThing
  def each
    yield 1
    yield 42
    yield 2
    yield 42
    yield 3
  end
end

for i in MyEachThing.new
  p i
end
# >> 1
# >> 42
# >> 2
# >> 42
# >> 3

However, there's one tiny difference that ends up being pretty critical. The for version really translates closer to this Ruby 1.8 code:

for i in whatever
  # ...
end
# translates to:
i = nil  # assuming i didn't already exist
whatever.each do |i|
  # ...
end

It turns out that insignificant looking assignment makes all the difference. For those who aren't familiar with why it matters, have a look at this surprising Ruby 1.8 code:

a = 1
(3..5).each { |a|  }
p a
# >> 5

Blocks in Ruby 1.8, like the one used by the each() iterator here, will reuse a local variable if it exists before the block is entered. You can see the affect of that here in that the value of my variable changed, even though my iterator really didn't do anything. It just reassigned my variable a few times.

Don't worry if the above behavior seems surprising and/or evil. It is. As a result, it was removed in Ruby 1.9. However, for is unchanged in Ruby 1.9 and still best avoided, in my opinion.

Using this knowledge, we can now see one often harmless side effect of using for over each():

for i in 1..3
  # ...
end
p i
# >> 3

Notice that i still exists after the loop. That's not the case with a simple each() (as long as it didn't exist before the call):

(1..3).each { |i|  }
p i
# ~> -:4: undefined local variable or method `i' for main:Object (NameError)

This is important so let me say it one more way. Calling (1..3).each { |i| } without an existing i variable works like this:

  1. Create a local i variable
  2. Assign 1 to i
  3. Run block
  4. Create a local i variable
  5. Assign 2 to i
  6. Run block
  7. Create a local i variable
  8. Assign 3 to i
  9. Run block

However, running the same code with an existing i variable, or using for, changes the steps to:

  1. Create an i variable unless it exists (only for does this)
  2. Assign 1 to i
  3. Run block
  4. Assign 2 to i
  5. Run block
  6. Assign 3 to i
  7. Run block

Notice how we are missing a bunch of variable creation steps there? That's because there's only one variable now and it's not local to the block.

I bet you are wondering why I've made such a big deal out of this tiny change by now. Why does this matter? Let me show you one more set of examples to answer that. First, this code shouldn't surprise you:

results = [ ]
(1..3).each do |i|
  results << lambda { i }
end
puts results.map { |l| l.call }
# >> 1
# >> 2
# >> 3

But does the following example surprise you? It sure surprised me the first time I saw it:

results = [ ]
for i in 1..3
  results << lambda { i }
end
puts results.map { |l| l.call }
# >> 3
# >> 3
# >> 3

You should be able to reason it out now though. Remember, there's only one i variable in this example. The lambda() blocks all refer to the same variable, because they are closures. By the time we run any of those blocks, the variable holds its final value.

This example may seem a little contrived, but the truth is that it's not. I just simplified an actual bug report down to the core problem so I could explain it. Because we use blocks so much in Ruby, it's actually pretty easy to run into issues like this.

The moral (in my opinion, of course): for is evil because it's surprising and hard to think through. Avoid it.

Comments (9)
  1. Tim Morgan
    Tim Morgan January 7th, 2009 Reply Link

    Thanks for this post! I wonder how many times that bit me and I didn't even know it!!?

    I don't use the for..in thinger, but even the fact that each() steps on my existing variables is probably cause for some of my hair-pulling. I'm ready for 1.9.

    I notice you waited so very patiently for 1.9 to be right around the corner before you started bad-mouthing 1.8. :-)

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
    2. James Edward Gray II
      James Edward Gray II January 8th, 2009 Reply Link

      I don't use the for..in thinger, but even the fact that each() steps on my existing variables is probably cause for some of my hair-pulling. I'm ready for 1.9.

      You can actually use 1.9 to find these bugs in 1.8 code. Dave Thomas did exactly that recently while upgrading some code that runs in TextMate to be 1.9 friendly. He ran our scripts through 1.9, just to make sure they were working there. Because 1.9 has some new and improved warnings, it found bugs for us as an added bonus:

      $ ruby_dev -vwe 'a = 1; (3..5).each { |a| }'
      ruby 1.9.1 (2008-12-30 patchlevel-0 revision 21203) [i386-darwin9.6.0]
      -e:1: warning: shadowing outer local variable - a
      

      I notice you waited so very patiently for 1.9 to be right around the corner before you started bad-mouthing 1.8.

      I really didn't mean it to be too negative. Consider it a sign of good things to come.

      I adore Ruby and that includes 1.8, but I hope we all know it's not perfect in everything it does. ;)

      1. Reply (using GitHub Flavored Markdown)

        Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

        Ajax loader
  2. Ryan Bates
    Ryan Bates January 8th, 2009 Reply Link

    I agree for loops are ugly, I hate to see them in plain Ruby code but have grown used to them in the Rails views.

    Out of curiosity, if this problem with for..in stepping on local variables was fixed in 1.9, would you still have issues with using it?

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
    2. James Edward Gray II
      James Edward Gray II January 8th, 2009 Reply Link

      Out of curiosity, if this problem with for..in stepping on local variables was fixed in 1.9, would you still have issues with using it?

      I would still prefer each().

      I feel it is better to stick with the uniform iterator approach throughout. For example, it's easier to turn each() into each_with_index() than for. I also feel it's easier for us to just teach the iterator approach to new Rubyists if we don't bother with unneeded exceptions like the for loop.

      That's all just one guy's opinion though.

      1. Reply (using GitHub Flavored Markdown)

        Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

        Ajax loader
  3. Adam Keys
    Adam Keys January 9th, 2009 Reply Link

    While my take is nowhere near as nuanced as yours, you can always refer to my version when you're feeling sassy: http://therealadam.com/archive/2008/01/06/the-barbarism-of-the-for-loop/

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
  4. texec
    texec April 16th, 2009 Reply Link

    I think it's not a bug, it's feature. Consider the same code in other languages like C(++), PHP or Python - it works like the same.

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
  5. Chris Mear
    Chris Mear April 17th, 2009 Reply Link

    A recent post by Andrej Bauer about a similar thing in Python:

    http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/

    The discussion there, and the follow-up post:

    http://math.andrej.com/2009/04/11/on-programming-language-design/

    are interesting from the point of view of comparing how a language does work (from a comprehending-the-current-implementatino perspective) to how a language should work (from a general language design perspective).

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
  6. Mark Wilden
    Mark Wilden April 18th, 2009 Reply Link

    for...in is easier to type and easier to read. You would only have problems with existing variables if your methods are too long.

    That said, I use each() because I don't want the cool kids to sneer at me.

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
  7. Arlen
    Arlen November 4th, 2010 Reply Link

    JavaScript has the same problem!

    var items = ["apple", "banana", "cherry"]
    var results = []
    
    for (var i in items) {
        // i is the index, i.e. items[i] will be "apple", ...
        results.push(function() { return i; })
    }
    
    results.forEach(function(i) {
        document.write(i() + " ")
    })
    

    You get 2 2 2 instead of 0 1 2, as the i in for (var ... in ... ) is actually a single variable, rather than instantiated afresh (and thus different in each closure) each time.

    (note that I'm ignoring the actual values of items above, just using the array to get indices. In a real life situation, you might be indexing the array with i in the closure, expecting to do something to a different member each time, whereas they'd all be operating on "cherry" here.)

    1. Reply (using GitHub Flavored Markdown)

      Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

      Ajax loader
Leave a Comment (using GitHub Flavored Markdown)

Comments on this blog are moderated. Spam is removed, formatting is fixed, and there's a zero tolerance policy on intolerance.

Ajax loader