21 DEC 2011
Refactoring: rcat
I use these Rubies in the Rough articles to teach how I think about code. Well, I have a scary admission to make: I didn't really understand refactoring until I was many years into being a programmer. Sure, I knew what it meant, but I just didn't get it. I hope to save you from the same mistake.
Refactoring is important. Very important. It may be one of the most important things we do as programmers. If I learned one thing from reading Smalltalk Best Practice Patterns, it's that code's primary purpose is to communicate with the reader. Let's face it though, when we are trying to get something working, it's often like stumbling around in the dark. We are running into all kinds of things, breaking stuff, and just trying to reach that "Holy cow it works!" moment. We're probably not thinking too long and hard about how well this mess we are making communicates and that's perfectly fine.
Refactoring is where you get to fix that. It's about taking working code and making it sexy. Note that I said it starts with working code. Until you have that, there's nothing worth communicating to a potential reader. Make it work, then make it sexy. (I believe that saying really involves speed, but that's a very different conversation we can have at a later date.)
Given the importance I place on this topic, refactoring will be a recurring theme in these articles. Expect to regularly see articles where we try to clean some code up. It's important that we practice this process and get good at it.
rcat
In Practicing Ruby 2.9, Gregory Brown builds a clone of the Unix cat program in pure Ruby. He calls this clone rcat. It's a great article and I recommend that you take a moment to read it before you finish reading this.
It turns out that the code for rcat is a great refactoring exercise for several reasons:
- Most of us are probably familiar with what it does
- It's working code
- It has tests, but they are a little unusual
- It's pretty darn good code
Did the last point surprise you? Yeah, I said it's good code. That's a win for us.
Hideous code is hard. You have to start by learning a lot about it. You just can't start editing it until you figure out which strings are safe to pull on and which are too risky. Then you have to attack it with the utmost care, baby stepping your way from safe foothold to safe foothold and verifying the heck out of what you are doing.
I'm not saying those skills aren't useful. Of course they are. It's a complex process though and I don't want to take us there on our first outing. We will get to that down the road a bit.
For now, let's look at some code that does a lot of things right. Most code could be better in some areas and rcat is no different. It makes a couple of minor missteps and we should be able to safely work on improving those.
You First
That's the task. We're going to try making some minor improvements to rcat.
I want you to work on a specific version of rcat: the one initially released with the article. That's the way I first saw it. Greg has made a couple of minor improvements to it since then and I would rather we had the chance to spot those issues too. So, here's how you can get a copy of the code locally and have it be on the version I want you to see:
$ git clone https://github.com/elm-city-craftworks/rcat.git
Cloning into rcat...
remote: Counting objects: 85, done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 85 (delta 26), reused 82 (delta 23)
Unpacking objects: 100% (85/85), done.
$ cd rcat
$ git branch as_released e28f2c
$ git checkout as_released
Switched to branch 'as_released'
Now, when you see these refactoring articles, you can do what you always do and just read along. I encourage you not to do that. You may learn something that way, but you just won't learn as much as you could. Instead, take a swing at the problem yourself. You probably won't do what I do and that will be just great. That will help you see the differences between what you focus on and what I focus on. If we do attack the same areas, then you will get to see the differences in how we go about it.
In this particular example, because it is good code, it will just be great to see if you can find the problems I have with it. They are fairly minor points, so consider this a test of your observation skills. Even if you don't do the edits before you see mine, just trying to spot what I'm going to fix could be helpful.
So, run the commands above to get the code and take a look through it before you read on. If you are feeling really adventurous, try making some fixes yourself.
That One Bump in the Testing Road
We're in luck, rcat has tests!
These tests are a little unusual in that they don't use a testing framework. It's just a file of tests that Greg wrote, comparing rcat to cat in several different scenarios. That makes me want to try the obvious:
$ ruby tests.rb
tests.rb:20:in ``': No such file or directory - rcat … (Errno::ENOENT)
from tests.rb:20:in `<main>'
Doh! We were doing so well too.
I must admit, this is totally normal for my running-other-people's-test track record. There always seems to be one non-obvious requirement to run them. For example, they may run but all error out while trying to connect to Redis. Then I fire up Redis and they work fine. It's sad that this is the norm, because we really want it to be trivial for everyone to run our tests!
I guess I have found the first thing I want to fix.
If you read the Practicing Ruby article that described rcat, you'll see that it talked about Greg setting an environment variable to get the rcat executable in his PATH. That tells us what the problem probably is, but we can't use Greg's exact suggestion since it's an absolute path on his machine. I tried translating it to a relative path that would work for me to see if I could get the tests working:
$ PATH="./bin:$PATH" ruby tests.rb
You passed the tests, yay!
Bingo. All I did was update my PATH variable for the command that followed by prepending the ./bin directory that holds the rcat executable. Now we know how to get the tests to work.
Really, the tests should do that for us. They know where they are and where rcat is. They should just handle this. Let's make that change.
At this point, it's time to actually peek at the tests.rb file. Skipping some comments at the top, this is how the code begins:
require "open3"
working_dir = File.dirname(__FILE__)
gettysburg_file = "#{working_dir}/data/gettysburg.txt"
spaced_file = "#{working_dir}/data/spaced_out.txt"
############################################################################
cat_output = `cat #{gettysburg_file}`
rcat_output = `rcat #{gettysburg_file}`
fail "Failed 'cat == rcat'" unless cat_output == rcat_output
############################################################################
cat_output = `cat #{gettysburg_file} #{spaced_file}`
rcat_output = `rcat #{gettysburg_file} #{spaced_file}`
fail "Failed 'cat [f1 f2] == rcat [f1 f2]'" unless cat_output == rcat_output
############################################################################
…
The first chunk shows that Greg was already aware of these path issues. He solved them for the data files by building local paths to them.
The second and third chunks show a couple of the tests. They are just shelling out to both programs and comparing the output. That's easy enough to understand.
We could solve the rcat problem the same way Greg handles the data files, but that would require us to edit every test. It would also create a strange asymmetry:
cat_output = `cat …`
rcat_output = `#{rcat} …`
Neither issue is a super big deal, but they make things just a touch harder to understand and I would rather not do that. Given that, let's just twiddle the path as I did externally. I changed the top code to:
require "open3"
working_dir = File.dirname(__FILE__)
gettysburg_file = "#{working_dir}/data/gettysburg.txt"
spaced_file = "#{working_dir}/data/spaced_out.txt"
# make sure rcat is in the PATH
ENV["PATH"] = [File.join(working_dir, "bin"), ENV["PATH"]].compact.join(":")
Now we can just run the Ruby file normally:
$ ruby tests.rb
You passed the tests, yay!
I like that. Keep it as easy as humanly possible for users to run your tests. This helps to get them into your code and that's always a plus.
What About the Duplication?
As you look through these tests, you'll see plenty of duplication. In just the two I posted above, you can already see it.
- The same arguments are always sent to cat and rcat
- We capture the output from both
- Output is compared and an error thrown if they don't match
Couldn't we wrap that in a method that handles those details for us? We definitely could, but it's not quite as straightforward as I might have made it seem.
First, read through the error messages. They differ a bit, so we probably couldn't copy them exactly as they are. We would likely standardize them based on the arguments. I think that's fine. It may actually be an improvement since the current messages seem a little inconsistent.
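To make the idea concrete, here's a rough sketch of what such a helper might look like. Nothing below is in the actual repository; the compare_to_cat name and the standardized failure message are my own inventions, and the calls assume the gettysburg_file and spaced_file variables already defined at the top of tests.rb:
# Hypothetical helper: run cat and rcat with the same arguments,
# then fail with a standardized message if their output differs.
def compare_to_cat(args)
  cat_output  = `cat #{args}`
  rcat_output = `rcat #{args}`
  fail "Failed 'cat #{args} == rcat #{args}'" unless cat_output == rcat_output
end
compare_to_cat(gettysburg_file)
compare_to_cat("#{gettysburg_file} #{spaced_file}")
With something like that in place, each of the simple comparison tests would collapse to a single line.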
But, if you keep reading down, you will see that the tests start to vary in form. Have a peek at this later test, for example:
`cat #{gettysburg_file}`
cat_success = $?
`rcat #{gettysburg_file}`
rcat_success = $?
unless cat_success.exitstatus == 0 && rcat_success.exitstatus == 0
fail "Failed 'cat and rcat success exit codes match"
end
Also, consider this one:
cat_out, cat_err, cat_process = Open3.capture3("cat some_invalid_file")
rcat_out, rcat_err, rcat_process = Open3.capture3("rcat some_invalid_file")
unless cat_process.exitstatus == 1 && rcat_process.exitstatus == 1
fail "Failed 'cat and rcat exit codes match on bad file"
end
unless rcat_err == "rcat: No such file or directory - some_invalid_file\n"
fail "Failed 'cat and rcat error messages match on bad file'"
end
We could handle these different cases, but we're going to need to introduce some infrastructure to do it. In turn, that infrastructure is going to make the tests less easy to browse.
It seems like there are some pros and cons to making that change. Because of that, I'm inclined not to go there. I can't tell if I'm going to make the tests communicate better or worse with a change like that. When I find myself at such a crossroads, I generally opt for a "Do No Harm" strategy.
Perhaps, as more tests are added down the road, the picture will crystallize more and I'll be able to make a stronger argument for the change. If that came to pass, I would make the change.
That's why I'm electing to leave the tests as they are. It's definitely a judgment call though. If you replaced them, I wouldn't blame you.
Reading the Code
With the tests running smoothly, I'm ready to actually start reading the code. For a small executable like this, I generally just crack open the executable file itself and follow where it leads me.
Using that strategy, this is some of the first code I ran into:
begin
RCat::Application.new(ARGV).run
rescue Errno::ENOENT => err
STDERR.puts "rcat: #{err.message}"
exit(1)
rescue OptionParser::InvalidOption => err
STDERR.puts "rcat: #{err.message}"
STDERR.puts "usage: rcat [-bns] [file ...]"
exit(1)
end
That has my Spidey Sense tingling. What are these errors doing up here? This looks like parts of the code leaking into places they don't belong. When I'm here, I have no idea what Errno::ENOENT or OptionParser::InvalidOption relate to. Also, I know I link to this book in almost every freaking article, but doesn't Exceptional Ruby say that begin is a code smell?
I'm also wondering why we are using STDERR here instead of $stderr. Plus, those puts() and exit() calls could be collapsed into calls to Ruby's abort(). (If you read the article, you may have noticed these issues were fixed. Greg updated them after I made the same mentions to him.)
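To show what I mean about abort(), here's roughly how that collapse could look while we keep the top-level rescue in place for a moment. This is just a sketch of the idea, not necessarily the change Greg shipped; abort() writes its argument to STDERR and then exits with status 1:
begin
  RCat::Application.new(ARGV).run
rescue Errno::ENOENT => err
  abort "rcat: #{err.message}"   # one call instead of the puts + exit(1) pair
rescue OptionParser::InvalidOption => err
  abort "rcat: #{err.message}\nusage: rcat [-bns] [file ...]"
end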
Going back to the article, does Greg give a reasoning for why these rescues are at the top level? It turns out that he does. Here's the best part:
Regardless of how these exceptions are labeled, it's important to note that I intentionally let them bubble all the way up to the outermost layer and only then rescue them and call Kernel#exit. Intermingling exit calls within control flow or modeling logic makes debugging nearly impossible and also makes automated testing a whole lot harder.
That's an interesting argument. I certainly don't want to make testing the code harder. However, we're hurting readability significantly by moving these pieces so far from the code they relate to. We have to find a way to balance those needs.
Can I Test exit()?
What is Greg saying about testing an exit() call? He's saying that if we call it in the middle of our test run, it's just going to halt the running test suite without giving us a chance to check anything. That would definitely be bad, but the real question is, "Is that true?"
Let's find out. I created a new file called exit_test.rb and entered this code:
require "minitest/autorun"
class ExitTest < MiniTest::Unit::TestCase
def test_exit
exit
end
end
If we run that, we will see that Greg is right:
$ ruby exit_test.rb
Run options: --seed 5481
# Running tests:
That's not good. It killed things before we could see some output.
I'll be honest and tell you that's not what I expected to see. I know that exit() just raises an Exception to end the program. I also know that testing frameworks catch any Exception raised during the run so that they can report them in the results. I expected it to catch the Exception and keep it from ending the program. I had to look at the source code to see why it wasn't doing that.
The answer is that while MiniTest does catch any Exception raised, it reraises some of them. It just so happens that the list of those that get reraised includes SystemExit, which is what exit() raises. That makes a lot of sense. If MiniTest didn't do that, exit() wouldn't work like exit() while it was running and that might make us mad if we were trying to use it to debug a problem.
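As an aside, you can watch those mechanics outside of any test framework. This throwaway snippet just demonstrates that exit() raises a SystemExit that ordinary code can rescue:
begin
  exit
rescue SystemExit => error
  # exit signals its intent with an exception, so we can intercept it
  puts "Caught a SystemExit with status #{error.status}."
end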
Greg is a smart guy. Thanks for the lesson, buddy.
However, just because MiniTest doesn't automatically catch the error for us doesn't mean we can't handle it. In fact, MiniTest will catch it for us, if we ask nicely. Watch this:
require "minitest/autorun"
class ExitTest < MiniTest::Unit::TestCase
def test_exit
assert_raises(SystemExit) do
exit
end
end
end
If we run that, we can see that our test works fine:
$ ruby exit_test.rb
Run options: --seed 50521
# Running tests:
.
Finished tests in 0.000427s, 2341.9204 tests/s, 2341.9204 assertions/s.
1 tests, 1 assertions, 0 failures, 0 errors, 0 skips
Now we're getting somewhere. Could we use this to test a method that sometimes calls exit()? Sure:
def good_and_bad(result)
result == :good ? 42 : exit
end
require "minitest/autorun"
class GoodAndBadTest < MiniTest::Unit::TestCase
def test_good
assert_equal(42, good_and_bad(:good))
end
def test_bad
assert_raises(SystemExit) do
good_and_bad(:bad)
end
end
end
That works just as you would expect:
$ ruby good_and_bad_test.rb
Run options: --seed 50819
# Running tests:
..
Finished tests in 0.000468s, 4273.5043 tests/s, 4273.5043 assertions/s.
2 tests, 2 assertions, 0 failures, 0 errors, 0 skips
So, while Greg is right that working with exit() in a test suite can have nasty consequences, that does not mean we can't test those methods. I think that tips the scales for me. I just feel it's too important to get that error handling into the right place. I make a note that it needs moving and keep reading the code.
Everything in its Place
If you can remember back to before I got off on that testing tangent, we saw the executable hand off to the library code like this: RCat::Application.new(ARGV).run. That makes me want to get a look at the Application object. I'm not going to dump that code in here, but I noticed two things when reading it:
- It's gorgeous. I have no complaints.
- It's also the exact place I was looking for to hang those wayward errors!
Let me show you the code as I saw it:
module RCat
class Application
def initialize(argv)
# setup code, safe to ignore...
end
def run
# open files that could toss Errno::ENOENT here...
end
def parse_options(argv)
# parse options with OptionParser here...
end
end
end
Now watch how it looks with the errors moved into here:
module RCat
class Application
def initialize(argv)
# setup code, safe to ignore...
end
def run
# open files that could toss Errno::ENOENT here...
rescue Errno::ENOENT => error
# handle Errno::ENOENT here...
end
def parse_options(argv)
# parse options with OptionParser here...
rescue OptionParser::InvalidOption => error
# handle OptionParser::InvalidOption here...
end
end
end
Burn that pattern into your brain, because it's what great error handling looks like. This code already had perfect compartmentalization into methods that just do one thing. All we did was tack the error handling for those things onto those methods. You have the "happy path" first, which is the way code should be. Then, if trouble arises, you just take the low road. It's also worth noting that there's not a begin in sight.
I can understand Greg's desire not to work with ordinary methods that occasionally try to halt everything. However, these aren't ordinary methods. These are the tasks handled by the Application object. Tasks like run(). Being able to stop everything is totally in the scope of what we are doing here.
Furthermore, look at the new and improved executable:
#!/usr/bin/env ruby
require_relative "../lib/rcat"
RCat::Application.new(ARGV).run
That's, in a word, ideal. This file is now just an interface to our library code. It just kicks off a process. It doesn't do any work. That allows us to safely forget about it. If we write proper tests, we will know this code is just fine.
I made the real move of the code, not just the skeleton comments shown above, and ran the tests. They passed, so we're looking great with our second change.
Thrown for a Loop
There are three code files left that I haven't gone inside of yet. However, one just requires some files and another just sets a constant. So really, we are down to one interesting stop left: lib/rcat/display.
The first thing I notice in here is that we're in a Display object, but doing things like render(). It might be worthwhile to rename this thing to a Renderer for consistent terminology, but this is a minor point and I'm not going to worry about it now.
There is a method here that concerns me, render_line(). Here's the code:
def render_line(lines)
current_line = lines.next
current_line_is_blank = current_line.chomp.empty?
case line_numbering_style
when :all_lines
print_labeled_line(current_line)
increment_line_number
when :significant_lines
if current_line_is_blank
print_unlabeled_line(current_line)
else
print_labeled_line(current_line)
increment_line_number
end
else
print_unlabeled_line(current_line)
increment_line_number
end
if squeeze_extra_newlines && current_line_is_blank
lines.next while lines.peek.chomp.empty?
end
end
I see some warning signs in this:
- The method is getting a bit long
- It's called render_line() but it operates on a set of lines (printing one and possibly discarding others)
- There's some duplication, like the two places where it prints a labeled line and then increments the line number
- The nested conditionals make it hard for me to reason about
- It's part of some flow control you can't even see by reading it
To explain that last point, this code works by calling next() on the passed Enumerator. That method will raise StopIteration when the list of items is exhausted. So that raise is happening here, either on the first line or the next-to-last line. But it's being handled elsewhere; in fact, the render() method rescues it:
def render(data)
@line_number = 1
lines = data.lines
loop { render_line(lines) }
end
Do you see a rescue clause there? It's a bit hidden. In Ruby 1.9, loop() was enhanced to quietly catch StopIteration and use it as a sign to halt the iteration. I'm fine with that in practice, but once you break it across two methods, like it is in this case, it seems to hide some pretty deep magic.
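If you haven't run into that behavior before, here's a tiny example, completely separate from rcat, showing loop() quietly ending when next() runs out of items:
letters = %w[a b c].each   # an Enumerator
loop do
  puts letters.next        # raises StopIteration after "c" is printed
end                        # loop() rescues StopIteration and simply stops
puts "All done."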
I want to try to fix this section, but I need to do it in pieces. First, I want to address the confusing flow control. Greg talks about the reason for selecting an Enumerator in his article and it definitely makes sense, but I'm hoping that reverting to a normal each() can help me clean up the flow. I'm going to try that.
The hard part about switching to each() is this chunk of code towards the end of render_line():
if squeeze_extra_newlines && current_line_is_blank
lines.next while lines.peek.chomp.empty?
end
We have to think about what that is doing. It's looking forward at upcoming lines and discarding them as long as they don't need to be shown. The trick is to reverse that. If we remove the need to see into the future, each() will work just fine. Without peek() though, it's going to hand us some lines we don't want. We will have to skip over those before we process them. Put another way, if we move this code to the head of the loop, we can drop the peek() and switch to Ruby's normal next keyword.
This should allow me to get the process into one normal iterator. Given that, I plan to drop render_line() altogether. I'm hoping to get render() clean enough that it won't be needed. Here's my first pass at switching to a normal iterator:
def render(data)
@line_number = 1
data.each do |current_line|
current_line_is_blank = current_line.chomp.empty?
if squeeze_extra_newlines && current_line_is_blank
next
end
case line_numbering_style
when :all_lines
print_labeled_line(current_line)
increment_line_number
when :significant_lines
if current_line_is_blank
print_unlabeled_line(current_line)
else
print_labeled_line(current_line)
increment_line_number
end
else
print_unlabeled_line(current_line)
increment_line_number
end
end
end
I'm liking that flow better, but don't get too excited. Greg's tests are giving me the thumbs down for this version:
$ ruby tests.rb
tests.rb:62:in `<main>': Failed 'cat -s == rcat -s' (RuntimeError)
This proves I didn't understand the peek() and next() combo as well as I thought I did. Things like this are why it's so critical to have working tests while you are refactoring.
OK, so what's the problem here? Well, it's because this code used to be at the end of the loop, meaning that it was checked after a line was processed. That means the first blank line made it through and then the check stopped the rest. Under my new setup though, I stop all of them. No good.
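Just to spell the desired behavior out, with input of my own invention rather than the test data: a run of blank lines should squeeze down to a single blank line.
input    = "one\n\n\n\ntwo\n"
expected = "one\n\ntwo\n"   # the first blank line survives, the rest are dropped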
I need to track another state variable to fix that. This code will do the trick:
def render(data)
@line_number = 1
in_blank_run = false
data.each do |current_line|
current_line_is_blank = current_line.chomp.empty?
if squeeze_extra_newlines && in_blank_run && current_line_is_blank
next
end
in_blank_run = current_line_is_blank
case line_numbering_style
when :all_lines
print_labeled_line(current_line)
increment_line_number
when :significant_lines
if current_line_is_blank
print_unlabeled_line(current_line)
else
print_labeled_line(current_line)
increment_line_number
end
else
print_unlabeled_line(current_line)
increment_line_number
end
end
end
That does indeed get the tests passing again.
Now, at first glance, it probably looks like I have made things worse. That's a little true. However, I mostly have the code in one place now and that makes it easier to do extractions. I'm betting that will help me clean things up even more.
I said "mostly have the code in one place" because there's an instance variable involved: @line_number
. That means other methods can modify it on me. It turns out they do. In fact, that's increment_line_number()
's job. I'm going to inline that method and switch to using a local variable to make sure I have this code under my control:
def render(data)
line_number = 1
in_blank_run = false
data.each do |current_line|
current_line_is_blank = current_line.chomp.empty?
if squeeze_extra_newlines && in_blank_run && current_line_is_blank
next
end
in_blank_run = current_line_is_blank
case line_numbering_style
when :all_lines
print_labeled_line(line_number, current_line)
line_number += 1
when :significant_lines
if current_line_is_blank
print_unlabeled_line(current_line)
else
print_labeled_line(line_number, current_line)
line_number += 1
end
else
print_unlabeled_line(current_line)
line_number += 1
end
end
end
I also had to update print_labeled_line(), so I could pass it the label:
def print_labeled_line(label, line)
print "%6d\t%s" % [label, line]
end
The tests still pass and I now have all the code in one place. I'm ready to start making this code pretty.
Those two state variables are tipping me off that a new object is called for here. That's what objects do, track state, so that's a way to predict that they are needed. I'll start by building an object that can track the line numbers. I was lazy and just put this new object in the same file for now. I can graduate it to its own file when the time comes.
class RendererStatus
def initialize
@lineno = 1
end
attr_reader :lineno
def increment
@lineno += 1
end
end
def render(data)
in_blank_run = false
data.each_with_object(RendererStatus.new) do |current_line, status|
current_line_is_blank = current_line.chomp.empty?
if squeeze_extra_newlines && in_blank_run && current_line_is_blank
next
end
in_blank_run = current_line_is_blank
case line_numbering_style
when :all_lines
print_labeled_line(status.lineno, current_line)
status.increment
when :significant_lines
if current_line_is_blank
print_unlabeled_line(current_line)
else
print_labeled_line(status.lineno, current_line)
status.increment
end
else
print_unlabeled_line(current_line)
status.increment
end
end
end
The RendererStatus object is trivial at this point. It just tracks a counter. But this gives me a place to hang status questions in the future and that should help me clean things up. Note that I push it through my loop using the new each_with_object() iterator in Ruby 1.9.
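If each_with_object() is new to you, it yields each element along with one extra object that you hand it, then returns that object when the iteration finishes. A quick standalone example:
upcased = %w[a b c].each_with_object([]) do |letter, list|
  list << letter.upcase   # the same array rides along through every iteration
end
upcased   # => ["A", "B", "C"]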
Now, it's time to start busting up that central case statement. The main problem with it is that it's doing two separate things: printing content and bumping the counter. I want to handle those separately, to cut out the repetition. Here's the split:
def render(data)
in_blank_run = false
data.each_with_object(RendererStatus.new) do |current_line, status|
current_line_is_blank = current_line.chomp.empty?
if squeeze_extra_newlines && in_blank_run && current_line_is_blank
next
end
in_blank_run = current_line_is_blank
case line_numbering_style
when :all_lines
print_labeled_line(status.lineno, current_line)
when :significant_lines
if current_line_is_blank
print_unlabeled_line(current_line)
else
print_labeled_line(status.lineno, current_line)
end
else
print_unlabeled_line(current_line)
end
unless line_numbering_style == :significant_lines &&
current_line_is_blank
status.increment
end
end
end
Now that they are separate, it's also easy to collapse the nested case/if logic:
def render(data)
in_blank_run = false
data.each_with_object(RendererStatus.new) do |current_line, status|
current_line_is_blank = current_line.chomp.empty?
if squeeze_extra_newlines && in_blank_run && current_line_is_blank
next
end
in_blank_run = current_line_is_blank
if line_numbering_style == :all_lines ||
( line_numbering_style == :significant_lines &&
!current_line_is_blank )
print_labeled_line(status.lineno, current_line)
else
print_unlabeled_line(current_line)
end
unless line_numbering_style == :significant_lines &&
current_line_is_blank
status.increment
end
end
end
That's better, but these conditional checks aren't very sexy. Let's add a couple of query methods and clean those up:
def squeeze_extra_newlines?
@squeeze_extra_newlines
end
def number_all_lines?
@line_numbering_style == :all_lines
end
def number_significant_lines?
@line_numbering_style == :significant_lines
end
def render(data)
in_blank_run = false
data.each_with_object(RendererStatus.new) do |current_line, status|
current_line_is_blank = current_line.chomp.empty?
next if squeeze_extra_newlines? &&
in_blank_run &&
current_line_is_blank
in_blank_run = current_line_is_blank
if number_all_lines? || ( number_significant_lines? &&
!current_line_is_blank )
print_labeled_line(status.lineno, current_line)
else
print_unlabeled_line(current_line)
end
unless number_significant_lines? && current_line_is_blank
status.increment
end
end
end
The tests are still passing and the code is getting a lot easier to understand. In fact, if you look at what's still not gorgeous, it all has to do with the blank line tracking. We already know where that state belongs too. We built an object just for that purpose. Let's fold that last variable into our status tracking:
class RendererStatus
def initialize
@lineno = 0
@in_blank_run = false
@significant = true
end
attr_reader :lineno
def in_blank_run?
@in_blank_run
end
def significant?
@significant
end
def insignificant?
!significant?
end
def extra_newline?
in_blank_run? && insignificant?
end
def update(line, squeeze, only_count_significant)
@in_blank_run = !significant?
@significant = !line.chomp.empty?
@lineno += 1 unless (squeeze && extra_newline?) ||
(only_count_significant && insignificant?)
end
end
That's becoming a fair bit of code, but note that it's all trivial. This just answers very basic questions and the method names are such that you can probably understand the checks without even reading the implementations.
The proof is in the pudding though. Look what that does to our iterator:
def render(data)
data.each_with_object(RendererStatus.new) do |line, status|
status.update( line, squeeze_extra_newlines?,
number_significant_lines? )
next if squeeze_extra_newlines? && status.extra_newline?
if number_all_lines? || ( number_significant_lines? &&
status.significant? )
print_labeled_line(status.lineno, line)
else
print_unlabeled_line(line)
end
end
end
I shortened current_line to line, added a call to RendererStatus#update(), and switched the remaining references to use the status object. I'll let you be the judge, but I find this version easier to digest. I can almost read what it does in plain English and I like that. When I do need to peek at a method, it's very easy to understand. It would also be easier to cover in unit tests.
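To back up that last claim, here's a sketch of what a unit test for the new object could look like. It assumes RendererStatus ends up reachable after requiring the library; the require path is my guess and these aren't tests from the repository:
require "minitest/autorun"
require_relative "lib/rcat"   # assumed entry point that also loads RendererStatus

class RendererStatusTest < MiniTest::Unit::TestCase
  def test_blank_lines_are_not_counted_when_numbering_significant_lines
    status = RendererStatus.new
    status.update("Four score\n", false, true)
    assert_equal 1, status.lineno
    status.update("\n", false, true)   # blank line should not bump the count
    assert_equal 1, status.lineno
  end
end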
Having added a RendererStatus object makes the choice of the Display name stick out even more. That made me decide that it was worth it to rename it to a Renderer. There are no surprises in that process, so I won't bore you by showing more code.
Try, Try Again
This is the second time I have tried this refactoring. I did it once, right after Greg released the code. You can find the results of that in the master branch of my fork on GitHub.
I started over for this article and I think I did an even better job this time. I've pushed these changes into a ritr_refactoring branch in that same repository. You can take a look at that to see the final product or to see how I organized the commits.
The point is, I got better by retrying this exercise. You will too. Always be refactoring code, even code you have refactored before. You'll learn more as you go and hopefully make the code easier on the eyes with every pass. That's our goal!
Comments (5)
-
Gregory Brown January 13th, 2012
Thanks for writing this one James. It's great to see a Practicing Ruby article cited elsewhere, with the code from it improved. Agreed with many of the suggestions you made here, with the possible exception of capturing SystemExit in tests.
The point I was trying to make about testing being difficult when you have exit calls buried in your call stack is that it makes manual testing hard too. I've scratched my head a hundred times trying to figure out why some code wasn't doing what I expected, only to find out that an exit call was to blame. An interesting thing I learned from this article is that attempting to rescue SystemExit might shed light on that sort of heisenbug, but I'd still prefer to never create it for the user in the first place :)
-
James,
(I sent you this in an email and you encouraged me to share it on the site, so here goes :))
I was a bit surprised by the way you refactored rcat. Not that I didn't like it, and I definitely learned from it. But I had expected you to go for a more object oriented design, splitting the responsibilities into separate classes.
In case you're interested, I had a go at using decorators for the refactoring. Here is my revised Display class:
class Display
  def initialize(params)
    @line_numbering_style   = params[:line_numbering_style]
    @squeeze_extra_newlines = params[:squeeze_extra_newlines]
  end

  def create_printer
    printer = LinePrinter.new
    printer = LineNumberer.new(printer, :mode => @line_numbering_style) if @line_numbering_style
    printer = ExtraNewlineSqueezer.new(printer) if @squeeze_extra_newlines
    printer
  end

  def render(data)
    printer = create_printer
    data.lines.each do |line|
      printer.print_line(line)
    end
  end
end
The complete file including decorators is at https://gist.github.com/1565178 and the repo is at https://github.com/chopmo/rcat.
-
Everyone should take a careful look at Jacob's work. It's a very clean refactoring that shows off the beauty of a good object oriented design.
-
Jacob:
Nice work. I think the general idea of using decorators would definitely clean up that code. However, the extensive use of super() makes me wonder if a composition based approach would have been better than inheritance here.
In any case, thanks for working on this! It's fun to see all the different approaches to this problem.
-
Right, now I see what you mean. Would be interesting to try using composition instead, maybe I'll revisit this some day. And thanks to you for the original code!