1
APR2012
A Stylish Critique
Before getting started, I feel compelled to point out that my dictionary defines a critique as "a detailed analysis and assessment of something." It seems like we often assume the worst of that word, but that's not how I intend it here.
The Ruby community seems to be talking about style guides lately. So let's talk about them.
The fact is that you will have many choices if you go looking for style guides for our favorite language. You can pick from:
- The old stuff
- Google's way
- GitHub's popular guide
- Guides from respected Rubyists, like Dan Kubb
- One of the many forks of the popular guides on GitHub
- And many more
It's obvious that Rubyists care about this topic. Let's see what's out there and consider what really is and is not useful from these guides.
What is a style guide really, and why do we even have them?
Defining style guides is surprisingly tough. I suspect they started out as formatting rules for code, but they have evolved pretty far beyond that now.
Most guides include general conventions that the author feels are important when writing the language in question. This can go all the way down to how to use certain constructs, opinions on what the author considers idiomatic, and syntax to outright avoid.
To sum it up, a style guide is often a "How I Write Language X" opinion piece from the author.
Style guides do exist for pretty good reasons. First, most programmers put some effort into making sure that their code looks a certain way. Generally this is to make it easier to read. However, you may not see the code as intended if you are using a different indent size or tabs instead of the spaces they intended.
With teams, these conventions become even more important. Teams are trying to work in concert, so they need to be able to understand code not written by them. A set of conventions the team follows can really help this along.
And when you think about it, who isn't on a team these days? Aren't open source developers a team? The entire Ruby community might even be a team with the way we interact online.
It seems like this is a topic we should all care about, at least a little.
The Rules for the Rules
Even though we have decided this topic is important, we must never forget how careful we have to be when imposing rules on others. The goal is to make communication between us easier, not to restrict creativity.
In my experience, one rule seems to hold sway over all other rules in life whether those rules are for parenting, laws, or programming style guides:
Use the fewest amount of rules you can possibly get away with.
If there are more rules than I can easily internalize, the entire point is defeated. Odds are that the authors don't even consistently follow their own long lists.
Now, it is worth noting that some list size variation make sense. For example, GitHub needs to keep their rules minimal for the reasons I have just mentioned. However, Dan Kubb talked to us on the Ruby Rogues about how he uses his style guide to try and restrict himself into being more aware as a programmer. It makes sense that he would have more rules for that.
That said, Dan's list is still probably a little long, in my opinion. It's just too much. Some of his rules actually encourage me to tune out what he's saying, like:
- Avoid bugs.
- Use common sense.
Again, if you lose your audience, the whole point is lost.
The classic book that describes itself as a style guide, Smalltalk Best Practice Patterns, includes ten "Formatting" patterns. That strikes me as a pretty common sense upper limit.
Another rule that's key is one that most parents know:
Have a good reason for the rule.
If your kid hits you with, "But why?," and you have to counter with, "Because I said so!," you lose. Rules need to be important. Important things have a reason for their existence. The reason needs to be a good one too.
If you are just cargo culting some rules, I would recommend actively breaking them for a time. You'll figure out pretty quick if they are important or not. If they are, you'll have the reason. If they aren't, you'll be free of a useless constraint.
For example, Ryan Davis's old rules say, "Use parens to disambiguate, otherwise avoid them." Ultimately, that leads to the "Seattle Style" of method definitions. GitHub's newer rules say, "Use def
with parentheses when there are arguments." Both sides will tell you that it reads better.
I personally find that my eye seems to drift more without the parentheses, so I prefer GitHub's style in particular this case. However, the Seattle group does have a significant following.
To me, this implies that it isn't likely worth a rule. The reason doesn't seem to hold in enough cases to be considered good. Both of these groups seem to be able to coexist in our community without creating two versions of every gem, just to rewrite the method definitions. I say let it go.
Now, both sides do agree to leave parentheses off of argument-less definitions and calls. Perhaps that means that one is rule worthy. It's definitely gray area though.
Finally, with style guides, I think it's critical that we remember the most important thing I learned from Smalltalk Best Practice Patterns:
Code's primary purpose is to communicate with the reader.
If you make a rule that violates that, it's instantly very suspect. For example, GitHub's guide has this rule:
- The
and
andor
keywords are banned. It's just not worth it. Always use&&
and||
instead.
Of course, not everyone agrees. The problem with this rule is that is that it makes Ruby less expressive. Those operators actually have a purpose. Though it's not popular, I even use them in conditionals, because I think they read better. [Update: I was eventually swayed by Avdi's argument so I no longer do this.]
GitHub is likely worried about how these operators interact with assignment statements. It can trip up new Rubyists. If you read Avdi's post though, you'll see that the precedence of these operators is actually why they exist. Even my usage of them in conditionals is rarely a problem, since assignments are not common in conditionals. Plus, if you do assign in a conditional, GitHub has a rule that fixes the precedence even with an and
/or
:
- Using the return value of
=
(an assignment) is ok, but surround the assignment with parenthesis.
That's a neat rule, if you ask me, because it allows for what I consider to be a natural programming construct in Ruby: run this code if I can assign this variable. For example:
if (message = inbox.find { |m| m.unread? })
# ... work with message here...
end
But, the parentheses flag the reader that you haven't made the common bug of confusing =
with ==
.
Getting back to the and
/or
case, I think education about these operators beats forbidding them.
What's Our Style?
Enough blathering on about how to make rules and why they exist. Let's take a look at what's actually out there for us Rubyists.
We Agree About
I think it's safe to say that the majority of Ruby style guides agree about at least these points:
- An indent should be two spaces.
- Use spaces, not tabs.
- Keep lines at 80 characters in length or less.
A lot of communities fight pretty heavily over these points, but the Ruby community seems to pretty universally agree. The reasoning is also sound: two spaces is enough to notice, but it leaves you plenty of room on the line. Spaces make sure I see what you see, even if our editors have different settings.
The last one surprises me a little. I strongly believe in it, but I see a lot of Ruby code that doesn't. Still almost all of the guides I checked had it in there.
There were many more rules about spacing in the guides. While I agreed with a lot of them, I'm not sure everyone would. Are we certain sure it's select { |arg| arg }
instead of select {|arg| arg }
? I use the former, but I see some of the latter and it doesn't throw me into a rage. I'm just not sure I'm willing to burn a rule on stuff like that.
We also seem to agree on how to name things:
- Use
snake_case
for variable and method names,CamelCase
for classes and modules, andALL_CAPS_SNAKE_CASE
for other constants.
The reasoning is pretty obvious. This helps you tell what you are looking at right away. It makes it much easier to read code.
What else do most of the style guides care about? How you indent your case
statements. They definitely prefer this style:
case age
when 0..10
puts "Too young."
when 11..65
puts "Access granted."
when 66..110
puts "Too old."
else
puts "Malformed age."
end
Or even:
response = case age
when 0..10 then "Too young."
when 11..65 then "Access granted."
when 66..110 then "Too old."
else "Malformed age."
end
But never:
case age
when 0..10
puts "Too young."
when 11..65
puts "Access granted."
when 66..110
puts "Too old."
else
puts "Malformed age."
end
Or:
response = case age
when 0..10 then "Too young."
when 11..65 then "Access granted."
when 66..110 then "Too old."
else "Malformed age."
end
The when
statements need to line up with the case
. The reasoning? I have no idea, but it looks right to me!
The Parts I Found Interesting
The Google style guide mentioned using Ruby's block comments to hide headers and footers in your code. For example:
=begin
* Name:
* Description:
* Author:
* Date:
* License:
=end
I'm not sure how I feel about that. I worry about programming constructs that can go off-screen. To give an example, I don't like this method of declaring class methods:
class Whatever
class << self
def class_method
# ...
end
end
end
The reason is that you could end up facing something line this:
class Whatever
class << self
# a page of definitions here...
def what_am_i
# ...
end
# another page of definitions here...
end
end
If you are centered on the definition of what_am_i
, the only way you can know it's defined as a class method is if you pick up on the extra indentation. I don't think that communicates well, so I prefer the following technique, even if it does introduce some mild duplication:
class Whatever
def self.class_method_1
# ...
end
def self.class_method_2
# ...
end
end
Block comments are subject to the same off-screen issue, but a smart editor is going to save you with them. It should syntax highlight the entire chunk as a comment, so you would still know what you are in the middle of.
I doubt this is rule worthy in either case, but it's interesting to think about.
I found the following rule from GitHub's guide interesting mainly because it nails my bad habit:
- Prefer single-quoted
String
s when you don't need string interpolation or special symbols such as\t
,\n
,'
, etc.
I don't do this. Instead I use double-quoted String
s most of the time. I switch to single-quoting only in rare cases where it reduces some escaping. Even then, I'm just as likely to switch to a special String
form, like %Q{...}
.
My reasons are pretty weak. It's mainly that I too often found myself having to switch a single-quoted String
to double-quotes so I could introduce some interpolation. It also helps in my editor, since I can have it always pair double-quotes while never pairing singles. I'll be typing the former to make String
s but the latter for things like apostrophes, so this is generally perfect.
The guide's rule is probably superior though, because it communicates with the reader that they are looking at a simple String
.
Dan Kubb's guide led me to read about a couple of interesting concepts that I hadn't thought about enough:
-
Try to have methods either return the state of the object and have no side effects, or return
self
and have side effects. This is otherwise known as Command-query separation (CQS): Do not program defensively. (See http://www.erlang.se/doc/programming_rules.shtml#HDR11.)
GitHub has a rule I doubt is common:
- Indent the
public
,protected
, andprivate
methods as much the method definitions they apply to. Leave one blank line above them.
Here's the sample code they give for that:
class SomeClass
def public_method
# ...
end
private
def private_method
# ...
end
end
I think it's more common to see a blank line below it as well:
class SomeClass
def public_method
# ...
end
private
def private_method
# ...
end
end
I have also seen it "outdented" like this:
class SomeClass
def public_method
# ...
end
private
def private_method
# ...
end
end
That doesn't feel right to me though, since we indent other inline class method calls. I do see how it kind of delimits a section though, much like rescue
would inside a method definition. I think you can make the argument either way.
There was another variant that you use to see a lot, but even Rails has decided it's wrong now:
class SomeClass
def public_method
# ...
end
private
def private_method
# ...
end
end
Aaron Patterson says this is wrong because it's not how Emacs indents it. I think he was joking at the time, but it turns out that he may be more right about that than we could have guessed.
The first rule in Bozhidar Batsov's "Comments" section was:
- Write self-documenting code and ignore the rest of this section. Seriously!
I really like that. The more programming experience I get the less respect I have for comments. I do use annotations, like TODO
and FIXME
(and Bozhidar has rules for those too), but most other comments are suspect, in my opinion.
The guides definitely disagreed heavily about which documentation scheme we should be using or if we should even be using one at all.
I also learned a neat trick from Bozhidar's guide that I hadn't encountered before:
-
When using
class_eval
(or othereval
) withString
interpolation, add a comment block showing its appearance if interpolated (a practice I learned from the Rails code):# from activesupport/lib/active_support/core_ext/string/output_safety.rb UNSAFE_STRING_METHODS.each do |unsafe_method| if 'String'.respond_to?(unsafe_method) class_eval <<-EOT, __FILE__, __LINE__ + 1 # ... def #{unsafe_method}!(*args) # def capitalize!(*args) @dirty = true # @dirty = true super # super end # end EOT end end
But the most entertaining rule award definitely goes to Ryan Davis for this gem:
- Don’t
rescue Exception
. EVER. Or I will stab you.
What Are They Just Wrong About?
I hate to say it, but some rules were just bad. Let's look at a handful that bugged me. First, I submit:
- Do not use single letter variable names. Avoid uncommunicative names.
I don't even think the first half of that rule agrees with the second half of that rule. Consider this code:
ary.each_index do |i|
# ...
end
There's nothing wrong with that code. i
is kind of the universal variable name for an index and as such it's even more communicative than a longer name, like index
.
- Avoid multi-line
?:
(the ternary operator), useif
/unless
instead.
While I understand where this rule is coming from, it's just a bad idea to draw hard lines in the sand like this. You're sure this is always wrong? I'm not.
Ruby 1.9 even supports a nice new syntax for this:
result = o.test? ? o.calc_true_result
: o.calc_false_result
Now that Ruby allows those operators to line up, this can be a very readable construct, in my opinion.
- Avoid using
$1-9
as it can be hard to track what they contain. Named groups can be used instead.
Almost all of the guides did contain a rule like this, but I see a lot of Ruby code and variables like $1
are extremely common. The issue is that the MatchData
methods aren't all that more expressive and it can be awkward to get a hold of the object to use them on.
I do like the named variables that this rule recommends (added in Ruby 1.9), but they can make a simple regex quite a bit wordier and introduce local variables in the recommended usage. I don't think I'm comfortable saying that's always OK.
I'm not really sure what the perceived problem is with $1
and friends either. Programmers seem to understand them just fine. Some may worry that they are evil globals, but the fact is that they are magic Thread
local variables that just do the right thing.
Until I see a good reason for this rule, I'm not buying it.
- Do not use
inject
in library code. In app code it is ok.
I saw this rule on Dan Kubb's guide earlier this week, but it was already removed before I published this article. I'm glad to see he's over this one.
I think this stems from a fear of inject()
being a little slower. It's true that it is. Here's a benchmark that shows the problem:
require "benchmark"
def each_sum(numbers)
sum = 0
numbers.each do |n|
sum += n
end
sum
end
def inject_sum(numbers)
numbers.inject(0) { |sum, n| sum + n }
end
TESTS = 10_000_000
NUMBERS = 1..10
Benchmark.bmbm do |bench|
bench.report("each_sum") do
TESTS.times do
each_sum(NUMBERS)
end
end
bench.report("inject_sum") do
TESTS.times do
inject_sum(NUMBERS)
end
end
end
That shows the following on Ruby 1.9.3:
Rehearsal ----------------------------------------------
each_sum 7.440000 0.000000 7.440000 ( 7.442971)
inject_sum 10.170000 0.010000 10.180000 ( 10.177593)
------------------------------------ total: 17.620000sec
user system total real
each_sum 7.540000 0.000000 7.540000 ( 7.546324)
inject_sum 10.220000 0.000000 10.220000 ( 10.217871)
As you can see, it's not that slow. Many iterators have a similar penalty. I had to use ten million iterations to get it to show up well.
The situation use to be worse though. It's more than twice as slow as each
on Ruby 1.8.7:
Rehearsal ----------------------------------------------
each_sum 23.340000 0.010000 23.350000 ( 23.354953)
inject_sum 51.260000 0.020000 51.280000 ( 51.279145)
------------------------------------ total: 74.630000sec
user system total real
each_sum 23.730000 0.010000 23.740000 ( 23.734525)
inject_sum 52.320000 0.010000 52.330000 ( 52.323466)
So perhaps that's where this fear came from.
In any event, don't avoid the iterators. Use them for what they are for. Remember, expressive code is always the first goal.
If you find something to be too slow, you can always refactor for performance. That should only be needed in extremely rare edge cases though. Go for readability first.
Hopefully this section reminds us all that we always need to treat these style guides as suggestions, not laws. You're the programmer, so you're in charge. No exceptions.
Worth a Read
I've dropped a ton of links in this article, but it really is worth taking a little time to read through at least some of these style guides. They cover a ton of ideas, good and bad, that I didn't choose to rehash here. If nothing else, I promise that they will get you thinking.
If you are pressed for time, start with the GitHub guide. It seems to reflect a lot of current thinking. Some of it is controversial though, as I've noted. If you have time for a second source, check out Dan Kubb's guide. It is a good representation of the various forks and it contains some neat ideas, even if it does probably go a little too far in some areas.
Comments (1)
-
Adam Weller April 2nd, 2012 Reply Link
Great article, really like the wisdom in this:
If you are just cargo culting some rules, I would recommend actively breaking them for a time. You'll figure out pretty quick if they are important or not. If they are, you'll have the reason. If they aren't, you'll be free of a useless constraint.