<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Gray Soft / My Projects / No Longer the Fastest Game in Town</title>
  <id>tag:graysoftinc.com,2014-03-20:/posts/32</id>
  <updated>2014-04-04T21:54:13Z</updated>
  <link rel="self" href="http://graysoftinc.com/my-projects/no-longer-the-fastest-game-in-town/feed.xml"/>
  <link rel="alternate" href="http://graysoftinc.com/my-projects/no-longer-the-fastest-game-in-town"/>
  <author>
    <name>James Edward Gray II</name>
  </author>
  <entry>
    <title>The 2nd Comment on "No Longer the Fastest Game in Town"</title>
    <link rel="alternate" href="http://graysoftinc.com/my-projects/no-longer-the-fastest-game-in-town#comment_106"/>
    <id>tag:graysoftinc.com,2007-04-18:/comments/106</id>
    <updated>2014-04-04T21:54:13Z</updated>
    <summary>Oops.  Good catch.  I have corrected the article.</summary>
    <content type="html">&lt;p&gt;Oops.  Good catch.  I have corrected the article.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
  <entry>
    <title>The 1st Comment on "No Longer the Fastest Game in Town"</title>
    <link rel="alternate" href="http://graysoftinc.com/my-projects/no-longer-the-fastest-game-in-town#comment_105"/>
    <id>tag:graysoftinc.com,2007-04-18:/comments/105</id>
    <updated>2019-09-10T09:53:35Z</updated>
    <summary>`LightCsv` does not use `StringIO`.
It uses `StringScanner`.</summary>
    <content type="html">&lt;p&gt;&lt;code&gt;LightCsv&lt;/code&gt; does not use &lt;code&gt;StringIO&lt;/code&gt;.&lt;br&gt;
It uses &lt;code&gt;StringScanner&lt;/code&gt;.&lt;/p&gt;</content>
    <author>
      <name>tommy</name>
    </author>
  </entry>
  <entry>
    <title>No Longer the Fastest Game in Town</title>
    <link rel="alternate" href="http://graysoftinc.com/my-projects/no-longer-the-fastest-game-in-town"/>
    <id>tag:graysoftinc.com,2007-04-16:/posts/32</id>
    <updated>2019-09-10T09:53:35Z</updated>
    <summary>Some speed demons are faster than me.  Let's take a look at the new efforts in CSV parsing.</summary>
    <content type="html">&lt;p&gt;If your number one concern when working with CSV data in Ruby is raw speed, you might want to know that FasterCSV is no longer the fastest option.&lt;/p&gt;

&lt;p&gt;There are a couple of new contenders for Ruby CSV processing, including a C extension called &lt;a href="https://rubygems.org/gems/simplecsv"&gt;SimpleCSV&lt;/a&gt; and a pure Ruby library called &lt;a href="https://github.com/smulube/lightcsv"&gt;LightCsv&lt;/a&gt;.  I haven't been able to test &lt;code&gt;SimpleCSV&lt;/code&gt; locally, because I can't get it to build on my box, but users do tell me it's faster.  I have run some trivial benchmarks for &lt;code&gt;LightCsv&lt;/code&gt;, though, and it too is pretty quick:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ rake benchmark
(in /Users/james/Documents/faster_csv)
time ruby -r csv -e '6.times { CSV.foreach("test/test_data.csv") { |row| } }'

real    0m5.481s
user    0m5.468s
sys     0m0.010s
time ruby -r lightcsv -e \
'6.times { LightCsv.foreach("test/test_data.csv") { |row| } }'

real    0m0.358s
user    0m0.349s
sys     0m0.008s
time ruby -r lib/faster_csv -e \
'6.times { FasterCSV.foreach("test/test_data.csv") { |row| } }'

real    0m0.742s
user    0m0.732s
sys     0m0.009s
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It's important to note that &lt;code&gt;LightCsv&lt;/code&gt; is indeed very "light."  &lt;code&gt;FasterCSV&lt;/code&gt; has grown up into a feature-rich library that provides many different ways to look at your data.  In contrast, &lt;code&gt;LightCsv&lt;/code&gt; doesn't yet allow you to set column or row separators.  Given that, it's only an option for vanilla CSV that you just need to iterate over.  If that's what you have, though, and speed counts, it might just be the right choice.&lt;/p&gt;

&lt;p&gt;For the curious, &lt;code&gt;LightCsv&lt;/code&gt; achieves its speed advantage in two ways.  First, it uses &lt;code&gt;StringScanner&lt;/code&gt; to manage the parsing.  &lt;code&gt;StringScanner&lt;/code&gt; is a C extension, though it is a standard library installed with Ruby.&lt;/p&gt;
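
&lt;p&gt;To give a feel for that style of parsing, here is a minimal sketch of splitting one line of simple CSV with &lt;code&gt;StringScanner&lt;/code&gt;.  It is not &lt;code&gt;LightCsv&lt;/code&gt;'s actual code:  the &lt;code&gt;scan_fields&lt;/code&gt; method and its regular expressions are made up for illustration, and it assumes comma separators, double-quoted fields with doubled quotes as escapes, and no line breaks inside fields.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;require "strscan"

# Hypothetical helper for illustration only, not LightCsv's real parser:
# split one line of simple CSV into an Array of field Strings.
def scan_fields(line)
  scanner = StringScanner.new(line.chomp)
  fields  = []
  loop do
    if scanner.scan(/"((?:[^"]|"")*)"/)
      # quoted field:  a doubled quote inside it stands for one literal quote
      fields.push(scanner[1].gsub('""', '"'))
    else
      # unquoted field:  everything up to the next comma (may be empty)
      fields.push(scanner.scan(/[^,]*/))
    end
    # a comma means another field follows, anything else ends the line
    break unless scanner.scan(/,/)
  end
  fields
end

p scan_fields(%Q{a,"b,c","say ""hi""",,d\n})
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each call to &lt;code&gt;scan&lt;/code&gt; only tries to match at the scanner's current position, so the line is walked left to right in a single pass.&lt;/p&gt;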

&lt;p&gt;More importantly, &lt;code&gt;LightCsv&lt;/code&gt; uses an input buffer for reading while &lt;code&gt;FasterCSV&lt;/code&gt; works line by line.  I suspect this second difference accounts for the majority of the speed increase, since the buffered code will hit the hard drive quite a bit less for the average CSV file.  This does require more memory though, of course.&lt;/p&gt;
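
&lt;p&gt;The difference in reading strategy can be sketched roughly as follows.  This is an illustration only, not code from either library:  the method names and the 64 KB &lt;code&gt;BUFFER_SIZE&lt;/code&gt; are assumptions, and the buffered version ignores quoted fields that contain line breaks.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Hypothetical sketch, not FasterCSV's or LightCsv's actual code.
BUFFER_SIZE = 64 * 1024  # assumed chunk size, chosen arbitrarily

# Line by line:  one gets call per row.
def each_line_unbuffered(path)
  File.open(path) do |io|
    while (line = io.gets)
      yield line
    end
  end
end

# Buffered:  read a large chunk, then cut rows out of it in memory.
def each_line_buffered(path, size = BUFFER_SIZE)
  File.open(path) do |io|
    buffer = String.new
    while (chunk = io.read(size))   # read returns nil at end of file
      buffer.concat(chunk)
      # remove and yield every complete row currently in the buffer
      while (line = buffer.slice!(/\A.*\n/))
        yield line
      end
    end
    # whatever is left is a final row without a trailing newline
    yield buffer unless buffer.empty?
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Both methods yield the same rows.  The buffered one makes fewer read calls, but it holds up to a full chunk plus a partial row in memory.&lt;/p&gt;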

&lt;p&gt;Aside from these differences, &lt;code&gt;FasterCSV&lt;/code&gt; and &lt;code&gt;LightCsv&lt;/code&gt; have very similar parsers.&lt;/p&gt;</content>
    <author>
      <name>James Edward Gray II</name>
    </author>
  </entry>
</feed>
