<p>Tokyo Cabinet's Key-Value Database Types, by James Edward Gray II (updated 2014-04-19)</p>
<p>This article digs deeper into the capabilities of Tokyo Cabinet. B+Tree and Fixed-length Databases are discussed, as well as how to tune the database types to your specific needs.</p>
<p>We've taken a good look at Tokyo Cabinet's Hash Database, but there's a lot more to the library than just that. Tokyo Cabinet supports three other kinds of databases. In addition, each database type accepts various tuning parameters that can be used to change its behavior. Each database type and setting involves different tradeoffs so you really have a lot of options for turning Tokyo Cabinet into exactly what you need. Let's look into some of those options now.</p>
<h4>The B+Tree Database</h4>
<p>Tokyo Cabinet's B+Tree Database is a little slower than the Hash Database we looked at before. That's its downside. However, giving up a little speed gains you several extra features that may just allow you to work smarter instead of faster.</p>
<p>The B+Tree Database is a more advanced form of the Hash Database. What that means is that all of the stuff I showed you in the last article still applies. You can set, read, and remove values by keys, iteration is supported, and you still have access to the neat options like adding to counters. With a B+Tree Database you get all of that and more.</p>
<p>The first major addition is that a B+Tree Database is ordered. You don't need to do anything to turn this on; it's just the way it is. As you add pairs to the database, they will be ordered by the keys you use. The default ordering is lexical:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"ordered.tcb"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:c</span><span class="o">]</span> <span class="o">=</span> <span class="mi">3</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:a</span><span class="o">]</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:b</span><span class="o">]</span> <span class="o">=</span> <span class="mi">2</span>
<span class="n">db</span><span class="o">.</span><span class="n">to_a</span> <span class="c1"># => [["a", "1"], ["b", "2"], ["c", "3"]]</span>
<span class="k">end</span>
</pre></div>
<p>This simple example shows us a couple of things. First, creating a B+Tree Database is as simple as changing the file extension. Remember when I said the <code>.tch</code> stood for <b>T</b>okyo <b>C</b>abinet <b>H</b>ash Database? Well, it shouldn't be too surprising that <code>.tcb</code> stands for <b>T</b>okyo <b>C</b>abinet <b>B</b>+Tree Database. Oklahoma Mixer will notice which extension you use and load the right features for that database type.</p>
<p>The other thing to notice here is the ordering. I purposely added the keys out of order, but you can see that <code>to_a()</code> shows them all lined up correctly. Now <code>to_a()</code> is really just an iterator the database object inherits from <code>Enumerable</code>, so we now know that iteration will be in database order. Methods like <code>keys()</code> and even <code>values()</code> will return their listings in order as well.</p>
<p>As I said, the default ordering is lexical, so number keys are a little strange:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"lexical.tcb"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="mi">1</span><span class="o">]</span> <span class="o">=</span> <span class="ss">:first</span>
<span class="n">db</span><span class="o">[</span><span class="mi">2</span><span class="o">]</span> <span class="o">=</span> <span class="ss">:middle</span>
<span class="n">db</span><span class="o">[</span><span class="mi">11</span><span class="o">]</span> <span class="o">=</span> <span class="ss">:last</span>
<span class="n">db</span><span class="o">.</span><span class="n">to_a</span> <span class="c1"># => [["1", "first"], ["11", "last"], ["2", "middle"]]</span>
<span class="k">end</span>
</pre></div>
<p>Notice they don't come out in the order we would probably think is most natural, as I described in the values. To fix that we need to change the default ordering and you can do that using a tuning parameter of the B+Tree Database. We are allowed to set a <em>comparison function</em> when we <code>open()</code> the database that will order the keys however we desire. This function is just like a block you would pass to <code>sort()</code> in Ruby: it will be handed two keys at a time to compare and it is expected to return negative, zero, or positive for the first argument being less than, equal to, or greater than the second. The good news is, you can usually cheat your way out of remembering these comparison rules by leaning on Ruby's <em>spaceship operator</em>:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span> <span class="s2">"numerical.tcb"</span><span class="p">,</span>
<span class="ss">:cmpfunc</span> <span class="o">=></span> <span class="nb">lambda</span> <span class="p">{</span> <span class="o">|</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="o">|</span> <span class="n">a</span><span class="o">.</span><span class="n">to_i</span> <span class="o"><=></span> <span class="n">b</span><span class="o">.</span><span class="n">to_i</span> <span class="p">}</span> <span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="mi">1</span><span class="o">]</span> <span class="o">=</span> <span class="ss">:first</span>
<span class="n">db</span><span class="o">[</span><span class="mi">2</span><span class="o">]</span> <span class="o">=</span> <span class="ss">:middle</span>
<span class="n">db</span><span class="o">[</span><span class="mi">11</span><span class="o">]</span> <span class="o">=</span> <span class="ss">:last</span>
<span class="n">db</span><span class="o">.</span><span class="n">to_a</span> <span class="c1"># => [["1", "first"], ["2", "middle"], ["11", "last"]]</span>
<span class="k">end</span>
</pre></div>
<p>This example shows how tuning parameters get set with Oklahoma Mixer. Just pass some keyword arguments to <code>open()</code> for each parameter you need to adjust. This allows Oklahoma Mixer to perform the needed setup before connecting to your database. That's critical for things like a B+Tree comparison function that have to be set before the database is accepting data.</p>
<p>It's worth noting that the comparison function is not stored in the database file and needs to be reset (to the same function if you want to avoid unpredictable results) each time you <code>open()</code> that database.</p>
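<p>If you want to get a feel for a comparison function before wiring it into a database, the same logic can be handed to plain Ruby's <code>sort()</code>. This is only a sketch of the ordering rules using an in-memory <code>Array</code>. It doesn't touch Tokyo Cabinet, and the constant names are ones I made up for the example:</p>

```ruby
# The examples above show keys coming back as Strings, so use Strings here.
KEYS = %w[1 2 11]

# Hypothetical name: the same kind of lambda you could pass as :cmpfunc.
NUMERICALLY = lambda { |a, b| a.to_i <=> b.to_i }

LEXICAL_ORDER   = KEYS.sort
NUMERICAL_ORDER = KEYS.sort { |a, b| NUMERICALLY.call(a, b) }

p LEXICAL_ORDER    # => ["1", "11", "2"]
p NUMERICAL_ORDER  # => ["1", "2", "11"]
```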
<p>OK, enough about ordering. What else do we get with the B+Tree Database?</p>
<p>You also get key ranges. Since the database has an inherent order, we're no longer limited to <code>:prefix</code> searches of the <code>keys()</code> and we can now ask for all of the keys between two endpoints:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"ranges.tcb"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="ss">:a</span> <span class="o">=></span> <span class="mi">1</span><span class="p">,</span> <span class="ss">:b</span> <span class="o">=></span> <span class="mi">2</span><span class="p">,</span> <span class="ss">:c</span> <span class="o">=></span> <span class="mi">3</span><span class="p">,</span> <span class="ss">:d</span> <span class="o">=></span> <span class="mi">4</span><span class="p">,</span> <span class="ss">:e</span> <span class="o">=></span> <span class="mi">5</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">keys</span><span class="p">(</span><span class="ss">:range</span> <span class="o">=></span> <span class="s2">"ab"</span><span class="o">.</span><span class="n">.</span><span class="s2">"d"</span><span class="p">)</span> <span class="c1"># => ["b", "c", "d"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">keys</span><span class="p">(</span><span class="ss">:range</span> <span class="o">=></span> <span class="s2">"ab"</span><span class="o">.</span><span class="n">.</span><span class="o">.</span><span class="s2">"d"</span><span class="p">)</span> <span class="c1"># => ["b", "c"]</span>
<span class="k">end</span>
</pre></div>
<p>Note that I used <code>"ab"</code> in my <code>Range</code> queries which is really between the actual <code>"a"</code> and <code>"b"</code> keys in the database. That works just fine.</p>
<p>You can also pass the <code>:limit</code> option I've shown before with <code>:range</code>, but you can't pass <code>:prefix</code>. It's one or the other: <code>:prefix</code> or <code>:range</code>.</p>
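<p>The reason an endpoint like <code>"ab"</code> works even though it isn't a key can be seen with a plain Ruby <code>Range</code>. The following is just an in-memory sketch of the idea, not how Oklahoma Mixer performs the query, and <code>keys_in()</code> is a hypothetical helper:</p>

```ruby
# A stand-in for the ordered keys of the database above.
SORTED_KEYS = %w[a b c d e]

# Hypothetical helper: keep the keys that fall inside the Range.
# Range#cover? only compares against the endpoints, so "ab" doesn't
# need to be a real key.
def keys_in(range, limit = nil)
  matches = SORTED_KEYS.select { |key| range.cover?(key) }
  limit ? matches.first(limit) : matches
end

p keys_in("ab".."d")     # => ["b", "c", "d"]
p keys_in("ab"..."d")    # => ["b", "c"]
p keys_in("ab".."d", 2)  # => ["b", "c"]
```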
<p>This ability to work with a <code>Range</code> of keys is even extended to the iterators. You've always had the ability to stop iterating whenever you like using Ruby's <code>break</code> keyword, but now you can tell the iterators where to start, making it possible to iterate over a subset of the pairs in the database:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"ranges.tcb"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="ss">:a</span> <span class="o">=></span> <span class="mi">1</span><span class="p">,</span> <span class="ss">:b</span> <span class="o">=></span> <span class="mi">2</span><span class="p">,</span> <span class="ss">:c</span> <span class="o">=></span> <span class="mi">3</span><span class="p">,</span> <span class="ss">:d</span> <span class="o">=></span> <span class="mi">4</span><span class="p">,</span> <span class="ss">:e</span> <span class="o">=></span> <span class="mi">5</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">each</span><span class="p">(</span><span class="s2">"ab"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="o">|</span>
<span class="nb">puts</span> <span class="s2">"%p => %p"</span> <span class="o">%</span> <span class="o">[</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="o">]</span>
<span class="k">break</span> <span class="k">if</span> <span class="n">key</span> <span class="o">>=</span> <span class="s2">"d"</span>
<span class="k">end</span>
<span class="c1"># >> "b" => "2"</span>
<span class="c1"># >> "c" => "3"</span>
<span class="c1"># >> "d" => "4"</span>
<span class="k">end</span>
</pre></div>
<p>Again, I used <code>"ab"</code> and it jumped to the first key after that. The only place that might get a little confusing is if you try that same trick with the (also added to B+Tree Databases) <code>reverse_each()</code> iterator:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"ranges.tcb"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="ss">:a</span> <span class="o">=></span> <span class="mi">1</span><span class="p">,</span> <span class="ss">:b</span> <span class="o">=></span> <span class="mi">2</span><span class="p">,</span> <span class="ss">:c</span> <span class="o">=></span> <span class="mi">3</span><span class="p">,</span> <span class="ss">:d</span> <span class="o">=></span> <span class="mi">4</span><span class="p">,</span> <span class="ss">:e</span> <span class="o">=></span> <span class="mi">5</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">reverse_each</span><span class="p">(</span><span class="s2">"ddd"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="o">|</span>
<span class="nb">puts</span> <span class="s2">"%p => %p"</span> <span class="o">%</span> <span class="o">[</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="o">]</span>
<span class="k">break</span> <span class="k">if</span> <span class="n">key</span> <span class="o"><=</span> <span class="s2">"b"</span>
<span class="k">end</span>
<span class="c1"># >> "e" => "5"</span>
<span class="c1"># >> "d" => "4"</span>
<span class="c1"># >> "c" => "3"</span>
<span class="c1"># >> "b" => "2"</span>
<span class="k">end</span>
</pre></div>
<p>See how it started with <code>"e"</code>? It always jumps to the first key equal to or <em>after</em> the one you provide, even if you are planning to iterate backwards. Since <code>"ddd"</code> is between <code>"d"</code> and <code>"e"</code>, that means we start on the key after <code>"ddd"</code> (<code>"e"</code>).</p>
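<p>Here's a rough model of that jump using a sorted <code>Array</code>, in case it helps to see the rule spelled out. The <code>reverse_from()</code> helper is hypothetical, and what it does when the starting key is past the end of the keys is purely an assumption of this sketch:</p>

```ruby
SORTED_KEYS = %w[a b c d e]

# Hypothetical model of the jump: land on the first key that is equal to
# or after start, then walk backwards from there.
def reverse_from(start)
  index = SORTED_KEYS.bsearch_index { |key| key >= start }
  return [] unless index  # assumption: nothing to visit past the last key
  SORTED_KEYS[0..index].reverse
end

p reverse_from("ddd")  # => ["e", "d", "c", "b", "a"]
p reverse_from("d")    # => ["d", "c", "b", "a"]
```

<p>If you really want to begin at <code>"d"</code> when walking backwards, the safest bet in this model is to pass a key that actually exists.</p>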
<p>B+Tree Databases have one more feature and it's a wild one. These databases support an additional storage mode that allows duplicate values to be stored under the same key:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"dupes.tcb"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="sx">%w[James Dana Baby]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="nb">name</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"Gray"</span><span class="p">,</span> <span class="nb">name</span><span class="p">,</span> <span class="ss">:dup</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">.</span><span class="n">to_a</span> <span class="c1"># => [["Gray", "James"], ["Gray", "Dana"], ["Gray", "Baby"]]</span>
<span class="k">end</span>
</pre></div>
<p>As you can see, the <code>:dup</code> storage mode shuts off the normal value replacing behavior and instead inserts the duplicate value after what was already stored for that key.</p>
<p>Several methods in Oklahoma Mixer have been expanded to support these duplicate values. For example, with a B+Tree Database you can scope <code>values()</code> or <code>size()</code> to a specific key, retrieving just the <code>values()</code> stored under that key or getting a count of how many values there are for that key:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"names.tcb"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"Matsumoto"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"Yukihiro"</span>
<span class="sx">%w[James Dana]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="nb">name</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"Gray"</span><span class="p">,</span> <span class="nb">name</span><span class="p">,</span> <span class="ss">:dup</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">.</span><span class="n">values</span> <span class="c1"># => ["James", "Dana", "Yukihiro"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">values</span><span class="p">(</span><span class="s2">"Gray"</span><span class="p">)</span> <span class="c1"># => ["James", "Dana"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">size</span> <span class="c1"># => 3</span>
<span class="n">db</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="s2">"Gray"</span><span class="p">)</span> <span class="c1"># => 2</span>
<span class="k">end</span>
</pre></div>
<p>You will need to use these methods to work with duplicate values because normal indexing, <code>fetch()</code>, and <code>delete()</code> still just work with the first value stored under a key. That behavior can be valuable too though:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"todo.tcb"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="k">if</span> <span class="n">db</span><span class="o">.</span><span class="n">size</span><span class="o">.</span><span class="n">zero?</span>
<span class="sx">%w[B+tree Fixed-length tuning]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">topic</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="ss">:blog</span><span class="p">,</span> <span class="n">topic</span><span class="p">,</span> <span class="ss">:dup</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="nb">puts</span> <span class="s2">"Write about </span><span class="si">#{</span><span class="n">db</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="ss">:blog</span><span class="p">)</span><span class="si">}</span><span class="s2">."</span>
<span class="k">end</span>
</pre></div>
<p>If I run that program three times, this is what you see:</p>
<pre><code>$ ruby tc_example.rb
Write about B+tree.
$ ruby tc_example.rb
Write about Fixed-length.
$ ruby tc_example.rb
Write about tuning.
</code></pre>
<p>See how <code>delete()</code> just kept pulling the first value that was left? That allowed us to use it as a simple queue in this case.</p>
<p>The <code>delete()</code> method can be passed the <code>:dup</code> storage mode as a second argument. When you do, all values under the passed key will be removed.</p>
<p>When working with duplicates, be aware that <code>keys()</code> and <code>each_key()</code> (or any iterator) behave differently. <code>keys()</code> returns a unique list, so keys with duplicate values under them will only be listed once. Iteration walks each pair in the database though, so a key will come up once for each value stored under it. Put another way, iteration does show duplicates while <code>keys()</code> won't.</p>
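<p>To make those rules concrete, here's a small in-memory model of duplicate storage. To be clear, <code>DupStore</code> is a hypothetical class I'm sketching for illustration. It is not how Oklahoma Mixer or Tokyo Cabinet store anything, and it simplifies a plain <code>store()</code> to replace everything under the key:</p>

```ruby
# Hypothetical in-memory model of :dup storage, for illustration only.
class DupStore
  def initialize
    @values = Hash.new { |hash, key| hash[key] = [] }
  end

  # With :dup, append after what is already there; otherwise replace
  # (a simplification made for this sketch).
  def store(key, value, mode = nil)
    if mode == :dup
      @values[key.to_s] << value.to_s
    else
      @values[key.to_s] = [value.to_s]
    end
  end

  # Unique keys, in order.
  def keys
    @values.keys.sort
  end

  # Iteration visits every pair, so a key repeats once per value.
  def each_key
    keys.each { |key| @values[key].each { yield key } }
  end

  def values(key = nil)
    key ? @values.fetch(key.to_s, []).dup : keys.flat_map { |k| @values[k] }
  end

  def size(key = nil)
    values(key).size
  end

  # Remove just the first value, or every value when passed :dup.
  def delete(key, mode = nil)
    list    = @values.fetch(key.to_s, [])
    removed = mode == :dup ? list.slice!(0..-1) : list.shift
    @values.delete(key.to_s) if list.empty?
    removed
  end
end

db = DupStore.new
db.store("Matsumoto", "Yukihiro")
%w[James Dana].each { |name| db.store("Gray", name, :dup) }

iterated = []
db.each_key { |key| iterated << key }

p db.keys            # => ["Gray", "Matsumoto"]
p iterated           # => ["Gray", "Gray", "Matsumoto"]
p db.values("Gray")  # => ["James", "Dana"]
p db.delete("Gray")  # => "James"
p db.size("Gray")    # => 1
```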
<p>Let me show one last, slightly bigger example to bring together all of the features discussed above. Here's a little more involved queuing system:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">GROUPS</span> <span class="o">=</span> <span class="p">{</span><span class="kp">nil</span> <span class="o">=></span> <span class="mi">0</span><span class="p">,</span> <span class="s2">"critical"</span> <span class="o">=></span> <span class="mi">1</span><span class="p">,</span> <span class="s2">"normal"</span> <span class="o">=></span> <span class="mi">2</span><span class="p">,</span> <span class="s2">"low"</span> <span class="o">=></span> <span class="mi">3</span><span class="p">}</span>
<span class="n">order</span> <span class="o">=</span> <span class="nb">lambda</span> <span class="p">{</span> <span class="o">|</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="o">|</span>
<span class="n">a_group</span><span class="p">,</span> <span class="n">a_priority</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s2">":"</span><span class="p">)</span>
<span class="n">b_group</span><span class="p">,</span> <span class="n">b_priority</span> <span class="o">=</span> <span class="n">b</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s2">":"</span><span class="p">)</span>
<span class="o">[</span><span class="no">GROUPS</span><span class="o">[</span><span class="n">a_group</span><span class="o">]</span><span class="p">,</span> <span class="o">-</span><span class="n">a_priority</span><span class="o">.</span><span class="n">to_i</span><span class="o">]</span> <span class="o"><=></span> <span class="o">[</span><span class="no">GROUPS</span><span class="o">[</span><span class="n">b_group</span><span class="o">]</span><span class="p">,</span> <span class="o">-</span><span class="n">b_priority</span><span class="o">.</span><span class="n">to_i</span><span class="o">]</span>
<span class="p">}</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"queue.tcb"</span><span class="p">,</span> <span class="ss">:cmpfunc</span> <span class="o">=></span> <span class="n">order</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="k">case</span> <span class="no">ARGV</span><span class="o">.</span><span class="n">shift</span>
<span class="k">when</span> <span class="s2">"add"</span>
<span class="n">group</span> <span class="o">=</span> <span class="s2">"normal"</span>
<span class="no">ARGV</span><span class="o">.</span><span class="n">delete_if</span> <span class="p">{</span> <span class="o">|</span><span class="n">o</span><span class="o">|</span> <span class="n">o</span> <span class="o">=~</span> <span class="sr">/\A--(critical|low)\z/</span> <span class="ow">and</span> <span class="n">group</span> <span class="o">=</span> <span class="vg">$1</span> <span class="p">}</span>
<span class="n">priority</span> <span class="o">=</span> <span class="mi">10</span>
<span class="no">ARGV</span><span class="o">.</span><span class="n">delete_if</span> <span class="p">{</span> <span class="o">|</span><span class="n">o</span><span class="o">|</span> <span class="n">o</span> <span class="o">=~</span> <span class="sr">/\A-(\d+)\z/</span> <span class="ow">and</span> <span class="n">priority</span> <span class="o">=</span> <span class="vg">$1</span><span class="o">.</span><span class="n">to_i</span> <span class="p">}</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"</span><span class="si">#{</span><span class="n">group</span><span class="si">}</span><span class="s2">:</span><span class="si">#{</span><span class="n">priority</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="no">ARGV</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="s2">"; "</span><span class="p">),</span> <span class="ss">:dup</span><span class="p">)</span>
<span class="k">when</span> <span class="s2">"list"</span>
<span class="n">db</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="o">|</span>
<span class="nb">puts</span> <span class="n">key</span>
<span class="nb">puts</span> <span class="s2">" </span><span class="si">#{</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">when</span> <span class="s2">"do_one"</span>
<span class="k">if</span> <span class="n">key</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">keys</span><span class="p">(</span><span class="ss">:limit</span> <span class="o">=></span> <span class="mi">1</span><span class="p">)</span><span class="o">.</span><span class="n">first</span> <span class="ow">and</span> <span class="n">job</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="n">key</span><span class="p">)</span>
<span class="nb">eval</span><span class="p">(</span><span class="n">job</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">when</span> <span class="s2">"do_all"</span>
<span class="kp">loop</span> <span class="k">do</span>
<span class="k">if</span> <span class="n">key</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">keys</span><span class="p">(</span><span class="ss">:limit</span> <span class="o">=></span> <span class="mi">1</span><span class="p">)</span><span class="o">.</span><span class="n">first</span> <span class="ow">and</span> <span class="n">job</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="n">key</span><span class="p">)</span>
<span class="nb">eval</span><span class="p">(</span><span class="n">job</span><span class="p">)</span>
<span class="k">else</span>
<span class="k">break</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">else</span>
<span class="nb">abort</span> <span class="s2">"Usage: </span><span class="si">#{</span><span class="vg">$PROGRAM_NAME</span><span class="si">}</span><span class="s2"> add|list|do_one|do_all [OPTIONS]"</span>
<span class="k">end</span>
<span class="k">end</span>
</pre></div>
<p>Most of that should be pretty straightforward code after all we've talked about, but let me point out one tricky spot. I had to add the <code>nil => 0</code> entry to my <code>GROUPS</code> because fetching a full, or in this case <code>:limit</code>ed, set of <code>keys()</code> is really just a <code>:prefix</code> query with an empty <code>:prefix</code>. Because of that, you want to make sure your ordering functions always order an empty <code>String</code> key before anything else. Calling <code>split()</code> on the empty <code>String</code> gives me a <code>nil</code> group, which is converted to a <code>0</code> so it will come out first.</p>
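<p>You can check that ordering without a database by sorting a handful of keys in plain Ruby. This sketch repeats the comparison from the program above so it runs on its own. The <code>ORDER</code>, <code>KEYS</code>, and <code>SORTED</code> names are just for this example:</p>

```ruby
GROUPS = {nil => 0, "critical" => 1, "normal" => 2, "low" => 3}

# The same comparison as the queue program, repeated here so the sketch
# runs on its own.
ORDER = lambda { |a, b|
  a_group, a_priority = a.split(":")
  b_group, b_priority = b.split(":")
  [GROUPS[a_group], -a_priority.to_i] <=> [GROUPS[b_group], -b_priority.to_i]
}

# Splitting the empty String leaves nothing to assign, so the group is nil.
p "".split(":")  # => []

KEYS   = ["low:10", "normal:10", "", "critical:10", "critical:100"]
SORTED = KEYS.sort { |a, b| ORDER.call(a, b) }
p SORTED  # => ["", "critical:100", "critical:10", "normal:10", "low:10"]
```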
<p>It's probably also worth pointing out that I could have just used <code>each()</code> with the <code>do_all</code> command. However, always fetching the first key and using that is a little better in a multiprocessing environment where other processes might be adding to the queue. If I'm iterating through the list, I won't see new <code>critical</code> jobs if they are added above where I am. Using <code>keys()</code> though, I'll always grab the most important job next. This code isn't really built for multiprocessing to tell the truth, but let's save that discussion for a later article.</p>
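<p>The difference between the two strategies can be sketched with an ordinary <code>Array</code> of jobs. Everything here is hypothetical: the sorted snapshot is only a stand-in for an iterator that has already moved past the spot where a new job lands, and a lower rank means a more important job:</p>

```ruby
# Walk the jobs once, in order. A job added "above" us is never seen.
def run_by_iterating(jobs)
  done = []
  jobs.sort.each do |_rank, name|  # sort returns a snapshot of the queue
    done << name
    jobs << [0, "late critical"] if name == "normal"  # someone adds a job
  end
  done
end

# Always go back for the most important job that is left.
def run_by_refetching(jobs)
  done = []
  while (job = jobs.min)
    jobs.delete_at(jobs.index(job))
    done << job.last
    jobs << [0, "late critical"] if job.last == "normal"  # someone adds a job
  end
  done
end

p run_by_iterating([[1, "normal"], [2, "low"]])
# => ["normal", "low"]
p run_by_refetching([[1, "normal"], [2, "low"]])
# => ["normal", "late critical", "low"]
```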
<p>Anyway, here's an example of me playing around with the program above, so you can see how it works in practice:</p>
<pre><code>$ ruby queue.rb add 'puts "An average job."'
$ ruby queue.rb add --low 'puts "This can wait..."'
$ ruby queue.rb add --critical 'puts "Very important."'
$ ruby queue.rb add --critical -100 'puts "Most important!"'
$ ruby queue.rb list
critical:100
puts "Most important!"
critical:10
puts "Very important."
normal:10
puts "An average job."
low:10
puts "This can wait..."
$ ruby queue.rb do_one
Most important!
$ ruby queue.rb do_one
Very important.
$ ruby queue.rb do_all
An average job.
This can wait...
</code></pre>
<p>To summarize, the B+Tree Database gives you ordering, key ranges, cursor-based iteration (the ability to skip to a specific key), and duplicate storage. You pay a speed penalty for these added features though. That's the tradeoff.</p>
<h4>The Fixed-length Database</h4>
<p>Another type of database supported by Tokyo Cabinet is the Fixed-length Database. It too is an extension of the Hash Database, supporting most of the features I showed you in that article. However, I'm not going to lie to you, this database type comes with three significant restrictions.</p>
<p>First, all keys are <code>Integer</code>s greater than <code>0</code>. You can't use arbitrary <code>String</code>s as you do with the Hash and B+Tree Databases. As such, you lose the ability to do <code>:prefix</code> queries on <code>keys()</code>. The database is ordered though, similar to the B+Tree Database. You can't change this ordering, but it is done numerically (instead of lexically) since all keys are just <code>Integer</code>s anyway. Given that, <code>:range</code> queries on <code>keys()</code> are supported. Methods like <code>keys()</code> and the iterators will pass you <code>Integer</code> keys in Ruby, instead of the <code>String</code> keys you get with the other database types.</p>
<p>The second downside is that all values stored have a <em>fixed-length</em>, which is what gives the database its name. This length defaults to <code>255</code>, but you can tune it to anything you like with the <code>:width</code> tuning parameter:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"four.tcf"</span><span class="p">,</span> <span class="ss">:width</span> <span class="o">=></span> <span class="mi">4</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">update</span><span class="p">(</span> <span class="mi">1</span> <span class="o">=></span> <span class="ss">:one</span><span class="p">,</span>
<span class="mi">2</span> <span class="o">=></span> <span class="ss">:two</span><span class="p">,</span>
<span class="mi">3</span> <span class="o">=></span> <span class="ss">:three</span><span class="p">,</span>
<span class="mi">4</span> <span class="o">=></span> <span class="ss">:four</span><span class="p">,</span>
<span class="mi">5</span> <span class="o">=></span> <span class="ss">:five</span><span class="p">,</span>
<span class="mi">6</span> <span class="o">=></span> <span class="ss">:six</span><span class="p">,</span>
<span class="mi">7</span> <span class="o">=></span> <span class="ss">:seven</span><span class="p">,</span>
<span class="mi">8</span> <span class="o">=></span> <span class="ss">:eight</span> <span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">each_value</span> <span class="k">do</span> <span class="o">|</span><span class="n">num</span><span class="o">|</span>
<span class="nb">puts</span> <span class="n">num</span>
<span class="k">end</span>
<span class="c1"># >> one</span>
<span class="c1"># >> two</span>
<span class="c1"># >> thre</span>
<span class="c1"># >> four</span>
<span class="c1"># >> five</span>
<span class="c1"># >> six</span>
<span class="c1"># >> seve</span>
<span class="c1"># >> eigh</span>
<span class="k">end</span>
</pre></div>
<p>Notice how everything beyond my selected <code>:width</code> of <code>4</code> was just silently discarded. That's the fixed-length at work.</p>
<p>Also note that you create a <b>T</b>okyo <b>C</b>abinet <b>F</b>ixed-length Database as you probably expect by now, with the file extension <code>.tcf</code>.</p>
<p>Finally, the Fixed-length Database has one last size limit. The overall file size of the database is limited to <code>268435456</code> bytes, by default. This too can be tuned using the <code>:limsiz</code> tuning parameter and you are free to make the limit quite large. Just remember that values are fixed length, so setting <code>:width => 1024, :limsiz => 4 * 1024</code> will mean your database only holds four keys. Trying to add data beyond this limit will raise an <code>OklahomaMixer::Error::CabinetError</code>.</p>
<p>That's a lot of minuses and you are probably wondering why anyone would be willing to accept all of these limits when we've already seen that there are more powerful options. The answer is performance. The Fixed-length Database is Tokyo Cabinet's fastest weapon. It treats the database file as a raw array of bytes and it can jump straight to any value with simple math. (Due to this, <code>defrag()</code> isn't supported on a Fixed-length Database, though Oklahoma Mixer does provide a no-op just to match the interface of the other database types.) That makes it wicked quick. If your data storage needs are simple enough to fit within these limitations, you can take advantage of this added speed boost.</p>
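<p>To make the "simple math" concrete, here is a toy model in plain Ruby. It is only a sketch under stated assumptions: the <code>FixedSlots</code> class is hypothetical, it keeps everything in an in-memory <code>String</code>, and it ignores the header and per-record bookkeeping that a real Fixed-length Database file carries. The point is just that a value's position is a function of its key and the <code>:width</code>, and that a size limit caps how many values fit.</p>

```ruby
# A toy fixed-width store (hypothetical, not Tokyo Cabinet's real file format).
class FixedSlots
  # Rough upper bound on how many values fit under a file size limit.
  # Real files carry extra overhead, so the true number may be lower.
  def self.rough_capacity(width, limsiz)
    limsiz / width
  end

  def initialize(width)
    @width = width
    @bytes = "".b  # stands in for the database file
  end

  # Keys start at 1, so key 1 lives at offset 0, key 2 at offset width, ...
  def offset(key)
    unless key.is_a?(Integer) && key > 0
      raise ArgumentError, "keys must be Integers greater than 0"
    end
    (key - 1) * @width
  end

  def []=(key, value)
    slot  = value.to_s.b[0, @width].ljust(@width, "\0")  # truncate, then pad
    start = offset(key)
    # grow the "file" with empty slots if this key lies past the current end
    missing = start + @width - @bytes.bytesize
    @bytes << "\0" * missing if missing > 0
    @bytes[start, @width] = slot
  end

  # Toy simplification: padding bytes are stripped, so values can't contain "\0".
  def [](key)
    slot = @bytes[offset(key), @width]
    return nil if slot.nil?
    value = slot.delete("\0")
    value.empty? ? nil : value
  end
end

slots = FixedSlots.new(4)
slots[3] = :three
slots.offset(3)  # => 8
slots[3]         # => "thre"
slots[2]         # => nil
FixedSlots.rough_capacity(1024, 4 * 1024)  # => 4
```

<p>No index or hashing is involved in the lookup: reading key <code>3</code> is one multiplication and one slice. That is also why the toy has nothing to defragment.</p>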
<p>The Fixed-length Database interface does have one other neat feature I should mention. It supports four special key names: <code>:min</code>, <code>:max</code>, <code>:prev</code>, and <code>:next</code>. You can use these values in many of the methods that take keys. For example:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"special.tcf"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">update</span><span class="p">(</span> <span class="mi">1</span> <span class="o">=></span> <span class="ss">:first</span><span class="p">,</span>
<span class="mi">2</span> <span class="o">=></span> <span class="ss">:middle</span><span class="p">,</span>
<span class="mi">42</span> <span class="o">=></span> <span class="ss">:last</span> <span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:min</span><span class="o">]</span> <span class="c1"># => "first"</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:max</span><span class="o">]</span> <span class="c1"># => "last"</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:next</span><span class="o">]</span> <span class="o">=</span> <span class="ss">:added</span>
<span class="n">db</span><span class="o">.</span><span class="n">keys</span> <span class="c1"># => [1, 2, 42, 43]</span>
<span class="n">db</span><span class="o">[</span><span class="mi">43</span><span class="o">]</span> <span class="c1"># => "added"</span>
<span class="k">end</span>
</pre></div>
<p>Be careful when using these. <code>:min</code> and <code>:max</code> will raise an <code>OklahomaMixer::Error::CabinetError</code> if there are no keys in the database. <code>:prev</code> (not shown above) is even pickier, requiring a <code>:min</code> key that is above <code>1</code>, so it can safely add below it without hitting <code>0</code>. I find <code>:next</code> pretty useful though, as it makes it possible to queue up values. Here's the simple queue example I showed in the B+Tree code rewritten to use a Fixed-length Database instead:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"todo.tcf"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="c1"># load the data</span>
<span class="sx">%w[B+tree Fixed-length tuning]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">topic</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:next</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"Write about </span><span class="si">#{</span><span class="n">topic</span><span class="si">}</span><span class="s2">."</span>
<span class="k">end</span>
<span class="c1"># read it back</span>
<span class="kp">loop</span> <span class="k">do</span>
<span class="k">begin</span>
<span class="nb">puts</span> <span class="n">db</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="ss">:min</span><span class="p">)</span>
<span class="k">rescue</span> <span class="no">OklahomaMixer</span><span class="o">::</span><span class="no">Error</span><span class="o">::</span><span class="no">CabinetError</span> <span class="c1"># no keys for :min</span>
<span class="k">break</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="c1"># >> Write about B+tree.</span>
<span class="c1"># >> Write about Fixed-length.</span>
<span class="c1"># >> Write about tuning.</span>
<span class="k">end</span>
</pre></div>
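<p>The rules for the special keys can be modelled in a few lines of plain Ruby. This is a hypothetical sketch, not Oklahoma Mixer's code: <code>resolve_special_key()</code> is a made-up helper, it raises <code>ArgumentError</code> where the library raises <code>OklahomaMixer::Error::CabinetError</code>, and starting an empty database at key <code>1</code> for <code>:next</code> is an assumption.</p>

```ruby
# Hypothetical resolver for the special keys, given the Integer keys in use.
def resolve_special_key(key, keys)
  case key
  when :min, :max
    raise ArgumentError, "no keys in the database" if keys.empty?
    key == :min ? keys.min : keys.max
  when :next
    keys.empty? ? 1 : keys.max + 1  # assumption: an empty database starts at 1
  when :prev
    if keys.empty? || keys.min <= 1
      raise ArgumentError, ":prev needs a :min key above 1"
    end
    keys.min - 1
  else
    key  # ordinary Integer keys pass through untouched
  end
end

resolve_special_key(:next, [1, 2, 42])  # => 43
resolve_special_key(:min,  [1, 2, 42])  # => 1
resolve_special_key(:prev, [5, 6])      # => 4
resolve_special_key(7,     [5, 6])      # => 7
```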
<p>To summarize, the B+Tree Database may slow you down a little, but the Fixed-length Database speeds you up, as long as you can accept certain restrictions. Select the database type that best fits the needs of your data.</p>
<h4>Tuning Parameters</h4>
<p>We've seen how to set tuning parameters in the examples above and already learned what some do. I'm going to save some parameters for discussions in later articles, when we talk about their specific functions. For now though, here are most of the tuning parameters available to the three databases we've covered so far.</p>
<ul>
<li>
<code>:bnum</code> (for Hash or B+Tree—can be used with <code>optimize()</code>): Specifies the <b>num</b>ber of elements to use in the <b>b</b>ucket array. The default is <code>131071</code> for Hash Databases and <code>32749</code> for B+Tree Databases. The suggested size is from 0.5 to 4 times the total number of records stored for Hash Databases or 1 to 4 times the total for B+Tree Databases.</li>
<li>
<code>:apow</code> (for Hash or B+Tree—can be used with <code>optimize()</code>): Specifies the size of record <b>a</b>lignment as a <b>pow</b>er of 2. The default is <code>4</code> for Hash Databases and <code>8</code> for B+Tree Databases, meaning <code>2 ** 4 = 16</code> and <code>2 ** 8 = 256</code> respectively.</li>
<li>
<code>:fpow</code> (for Hash or B+Tree—can be used with <code>optimize()</code>): Specifies the maximum number of elements in the <b>f</b>ree block pool as a <b>pow</b>er of 2. The default is <code>10</code>, meaning <code>2 ** 10 = 1024</code>.</li>
<li>
<code>:opts</code> (for Hash or B+Tree—can be used with <code>optimize()</code>): Specifies the <b>opt</b>ion<b>s</b> for the database in a <code>String</code> of recognized character codes. There are no options by default, but this is commonly set to <code>"ld"</code> or <code>"lb"</code> for bigger databases. The options are:
<ul>
<li>
<code>"l"</code> allows the database file to grow <b>l</b>arge (over 2 GB) by using a 64-bit bucket array.</li>
<li>
<code>"d"</code> compresses each record in a Hash Database or page in a B+Tree Database with <b>D</b>eflate compression.</li>
<li>
<code>"b"</code> compresses each record in a Hash Database or page in a B+Tree Database with <b>B</b>ZIP2 compression.</li>
<li>
<code>"t"</code> compresses each record in a Hash Database or page in a B+Tree Database with <b>T</b>CBS compression.</li>
</ul>
</li>
<li>
<code>:rcnum</code> (for Hash): Specifies the maximum <b>num</b>ber of <b>r</b>ecords to be <b>c</b>ached. It is <code>0</code> or disabled by default.</li>
<li>
<code>:xmsiz</code> (for Hash or B+Tree): Specifies the <b>siz</b>e of e<b>x</b>tra mapped <b>m</b>emory. The default is <code>67108864</code> for Hash Databases or <code>0</code> (disabled) for B+Tree Databases.</li>
<li>
<code>:dfunit</code> (for Hash or B+Tree): Specifies the auto <b>d</b>e<b>f</b>ragmentation <b>unit</b> step number. It is <code>0</code> or disabled by default.</li>
<li>
<code>:cmpfunc</code> (for B+Tree): Specifies the <b>c</b>o<b>mp</b>arison <b>func</b>tion used to order B+Tree Databases. See the detailed examples above for an explanation.</li>
<li>
<code>:lmemb</code> (for B+Tree—can be used with <code>optimize()</code>): Specifies the number of <b>memb</b>ers in each <b>l</b>eaf page. The default is <code>128</code>.</li>
<li>
<code>:nmemb</code> (for B+Tree—can be used with <code>optimize()</code>): Specifies the number of <b>memb</b>ers in each <b>n</b>on-leaf page. The default is <code>256</code>.</li>
<li>
<code>:lcnum</code> (for B+Tree): Specifies the maximum <b>num</b>ber of <b>l</b>eaf nodes to be <b>c</b>ached. The default is <code>1024</code>.</li>
<li>
<code>:ncnum</code> (for B+Tree): Specifies the maximum <b>num</b>ber of <b>n</b>on-leaf nodes to be <b>c</b>ached. The default is <code>512</code>.</li>
<li>
<code>:width</code> (for Fixed-length—can be used with <code>optimize()</code>): Specifies the <b>width</b> of values in Fixed-length Databases. See the detailed examples above for an explanation.</li>
<li>
<code>:limsiz</code> (for Fixed-length—can be used with <code>optimize()</code>): Specifies the <b>lim</b>it on database file <b>siz</b>e in Fixed-length Databases. See the detailed examples above for an explanation.</li>
</ul><p>I apologize for keeping the cryptic names in Oklahoma Mixer, but I felt it was better to stick with what Tokyo Cabinet uses so users could read about them in documentation and other resources for that library. Tokyo Tyrant also uses these names to configure a database from the command line, so you will find them in several different contexts.</p>
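<p>Several of the parameters in the list are exponents or multiples rather than plain sizes, so a little arithmetic helps when choosing values. The helpers below are hypothetical (they are not part of Oklahoma Mixer) and simply apply the multipliers quoted in the list: <code>0.5</code> to <code>4</code> times the record count for a Hash Database's <code>:bnum</code>, and <code>1</code> to <code>4</code> times for a B+Tree Database.</p>

```ruby
# Hypothetical helpers for reasoning about the tuning parameters listed above.

# :apow and :fpow are exponents, so the effective value is a power of 2.
def power_of_two(exponent)
  2**exponent
end

# Suggested :bnum range for an expected record count, using the multipliers
# quoted in the list: 0.5x to 4x for :hash, 1x to 4x for :btree.
def suggested_bnum(records, type)
  low = type == :hash ? 0.5 : 1
  (records * low).ceil..(records * 4)
end

power_of_two(4)                    # => 16   (default Hash :apow)
power_of_two(10)                   # => 1024 (default :fpow)
suggested_bnum(1_000_000, :hash)   # => 500000..4000000
suggested_bnum(1_000_000, :btree)  # => 1000000..4000000
```

<p>By this arithmetic, the default Hash <code>:bnum</code> of <code>131071</code> sits inside the suggested range for roughly 33 thousand to 262 thousand records, which gives a feel for when the setting is worth raising.</p>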
<p>Database objects have an <code>optimize()</code> method that can be used to modify the tuning parameters of an <code>open()</code> database. The parameters that can be used as such are noted above. There are sometimes additional restrictions though. For example, the <code>:limsiz</code> of a Fixed-length Database usually has to be increased when changed through <code>optimize()</code>.</p>
<p>That covers the various key-value database types in Tokyo Cabinet. The fourth type is quite a bit different from what we've looked at so far, so I'll do a full article on it next.</p>James Edward Gray IITokyo Cabinet as a Key-Value Storetag:graysoftinc.com,2010-01-01:/posts/922014-06-05T18:57:10ZThis article covers the usage of Tokyo Cabinet's Hash Database. It shows basic Hash-like storage as well as some special features provided by Tokyo Cabinet.<p>Like most key-value stores, Tokyo Cabinet has a very <code>Hash</code>-like interface from Ruby (assuming you use Oklahoma Mixer). You can almost think of a Tokyo Cabinet database as a <code>Hash</code> that just happens to be stored in a file instead of memory. The advantage of that is that your data doesn't have to fit into memory. Luckily, you don't have to pay a big speed penalty to get this disk-backed storage. Tokyo Cabinet is pretty darn fast.</p>
<h4>Getting and Setting Keys</h4>
<p>Let's have a look at the normal <code>Hash</code>-like methods as well as the file storage aspect:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="k">if</span> <span class="n">db</span><span class="o">.</span><span class="n">size</span><span class="o">.</span><span class="n">zero?</span>
<span class="nb">puts</span> <span class="s2">"Loading the database. Rerun to read back the data."</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:one</span><span class="o">]</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:two</span><span class="o">]</span> <span class="o">=</span> <span class="mi">2</span>
<span class="n">db</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="ss">:three</span> <span class="o">=></span> <span class="mi">3</span><span class="p">,</span> <span class="ss">:four</span> <span class="o">=></span> <span class="mi">4</span><span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"users:1"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"James"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"users:2"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"Ruby"</span>
<span class="k">else</span>
<span class="nb">puts</span> <span class="s2">"Reading data."</span>
<span class="sx">%w[ db[:one]</span>
<span class="sx"> db["users:2"]</span>
<span class="sx"> -</span>
<span class="sx"> db.keys</span>
<span class="sx"> db.keys(:prefix\ =>\ "users:")</span>
<span class="sx"> db.keys(:limit\ =>\ 2)</span>
<span class="sx"> db.values</span>
<span class="sx"> -</span>
<span class="sx"> db.values_at(:one,\ :two) ]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">command</span><span class="o">|</span>
<span class="nb">puts</span><span class="p">(</span><span class="n">command</span> <span class="o">==</span> <span class="s2">"-"</span> <span class="p">?</span> <span class="s2">""</span> <span class="p">:</span> <span class="s2">"</span><span class="si">#{</span><span class="n">command</span><span class="si">}</span><span class="s2"> = %p"</span> <span class="o">%</span> <span class="o">[</span><span class="nb">eval</span><span class="p">(</span><span class="n">command</span><span class="p">)</span><span class="o">]</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</pre></div>
<p>If I run that code twice, I see:</p>
<pre><code>$ ruby tc_example.rb
Loading the database. Rerun to read back the data.
$ ruby tc_example.rb
Reading data.
db[:one] = "1"
db["users:2"] = "Ruby"
db.keys = ["one", "two", "three", "four", "users:1", "users:2"]
db.keys(:prefix => "users:") = ["users:1", "users:2"]
db.keys(:limit => 2) = ["one", "two"]
db.values = ["1", "2", "3", "4", "James", "Ruby"]
db.values_at(:one, :two) = ["1", "2"]
</code></pre>
<p>The file storage should be pretty obvious here. The first run of the program populated the data file and the second run read the data back. Obviously the data exists outside the process. It's actually stored in the file I named in my call to <code>open()</code>: <code>"data.tch"</code>. We will dig a lot more into the meaning of the file extensions later, but for now it's enough to know that <code>.tch</code> stands for <b>T</b>okyo <b>C</b>abinet <b>H</b>ash database. It's also worth pointing out that you don't have to pass a block to <code>open()</code>. When not passed a block <code>open()</code> will return the database reference and expect you to call <code>close()</code> manually when you are done, just as you could with any <code>IO</code> object from Ruby. Tokyo Cabinet can buffer output just like Ruby's <code>IO</code> streams can, so know that your data isn't guaranteed to have hit the disk until after a <code>close()</code>. You can <code>flush()</code> the data to disk before that though, if needed.</p>
<p>The getting and setting methods shouldn't be much of a surprise. I started off by calling <code>size()</code> to count the pairs already in the database. I then used <code>[]=</code> to set a few keys. Note that I also used <code>update()</code> to add multiple keys at once. (The <code>merge()</code>/<code>merge!()</code> methods of <code>Hash</code> don't really make sense for the database so you do need to use the <code>update()</code> alias.) Later I read the data back with <code>[]</code>. It's all very <code>Hash</code>-like. I was even able to ask for all of the <code>keys()</code> as you can with a <code>Hash</code>, but the Oklahoma Mixer version of that method supports some extra filters like the <code>:prefix</code> and <code>:limit</code> shown above. There's also the matching <code>values()</code> call, though it doesn't have any filters. You can see that Oklahoma Mixer also allows us to fetch multiple keys at once with <code>values_at()</code>.</p>
<p>The last thing to get out of this example is the usual truth of key-value storage: keys and values are generally considered <code>String</code>s. Notice how <code>db[:one] = 1</code> actually stored a value of <code>"1"</code> under the key <code>"one"</code>. Make sure you remember to convert it back when you read it if you really need the number.</p>
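<p>Here is a small plain-Ruby sketch of that round trip. The <code>stored_form()</code> helper is hypothetical and only imitates the <code>String</code> conversion described above; it doesn't touch a database.</p>

```ruby
# Hypothetical stand-in for the String conversion applied to keys and values.
def stored_form(key, value)
  [key.to_s, value.to_s]
end

key, value = stored_form(:one, 1)  # => ["one", "1"]
value + value                      # => "11" (String concatenation, not addition)
Integer(value) + Integer(value)    # => 2

# Integer() raises ArgumentError on non-numeric text, while to_i quietly
# returns 0, so Integer() is the safer way to convert a value back.
"James".to_i                       # => 0
```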
<p>Another cool <code>Hash</code>-like feature you can make use of is defaults. You can set a static object to be used as the default value for keys not in the database or provide code to run to generate the default. Here is some code showing the possibilities in action:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="c1"># no default set</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:missing</span><span class="o">]</span> <span class="c1"># => nil</span>
<span class="c1"># an Object default</span>
<span class="n">db</span><span class="o">.</span><span class="n">default</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:missing</span><span class="o">]</span> <span class="c1"># => 0</span>
<span class="k">end</span>
<span class="c1"># another way to set an Object default</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">,</span> <span class="ss">:default</span> <span class="o">=></span> <span class="mi">42</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:missing</span><span class="o">]</span> <span class="c1"># => 42</span>
<span class="k">end</span>
<span class="c1"># a Proc default</span>
<span class="nb">proc</span> <span class="o">=</span> <span class="nb">lambda</span> <span class="p">{</span> <span class="o">|</span><span class="n">key</span><span class="o">|</span>
<span class="n">type</span><span class="p">,</span> <span class="nb">id</span> <span class="o">=</span> <span class="n">key</span><span class="o">.</span><span class="n">to_s</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s2">":"</span><span class="p">)</span>
<span class="s2">"New </span><span class="si">#{</span><span class="n">type</span><span class="si">}</span><span class="s2"> with id </span><span class="si">#{</span><span class="nb">id</span><span class="si">}</span><span class="s2">"</span>
<span class="p">}</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">,</span> <span class="ss">:default</span> <span class="o">=></span> <span class="nb">proc</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"user:3"</span><span class="o">]</span> <span class="c1"># => "New user with id 3"</span>
<span class="k">end</span>
</pre></div>
<p><code>Proc</code> defaults are always executed, so if you want a default that returns a <code>Proc</code>, just pass a <code>Proc</code> that creates the desired <code>Proc</code>. All other objects are returned when indexing a missing value.</p>
<p>The important thing to remember about the defaults is that they are not stored in the file. They are just a convenience from the Ruby interface and you will need to set them again anytime you make a new connection to the database.</p>
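<p>The rule for <code>Proc</code> versus plain object defaults can be sketched in plain Ruby. The <code>lookup()</code> helper below is hypothetical and uses an ordinary <code>Hash</code> as a stand-in for the database; it is only meant to show the dispatch, not how Oklahoma Mixer implements it.</p>

```ruby
# Hypothetical stand-in for the default rule: Procs are called with the key,
# any other object is returned as-is.
def lookup(store, key, default = nil)
  return store[key.to_s] if store.key?(key.to_s)
  default.is_a?(Proc) ? default.call(key) : default
end

store = { "one" => "1" }
lookup(store, :one, 0)      # => "1" (the key exists, so the default is ignored)
lookup(store, :missing)     # => nil
lookup(store, :missing, 0)  # => 0

describe = lambda { |key|
  type, id = key.to_s.split(":")
  "New #{type} with id #{id}"
}
lookup(store, "user:3", describe)  # => "New user with id 3"

# To get a Proc back as the default, wrap it in another Proc.
wrapped = lambda { |_key| describe }
lookup(store, :missing, wrapped).equal?(describe)  # => true
```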
<p>You can also walk the key-value pairs of a Tokyo Cabinet database using the standard iterators you expect in Ruby:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"pp"</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="c1"># a Hash-like each()</span>
<span class="n">db</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="o">|</span>
<span class="nb">puts</span> <span class="s2">"db[%p] = %p"</span> <span class="o">%</span> <span class="o">[</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="o">]</span>
<span class="k">end</span>
<span class="c1"># other iterators from Enumerable are supported</span>
<span class="nb">puts</span>
<span class="n">pp</span> <span class="n">db</span><span class="o">.</span><span class="n">select</span> <span class="p">{</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">_</span><span class="o">|</span> <span class="n">key</span> <span class="o">=~</span> <span class="sr">/\Ausers:/</span> <span class="p">}</span>
<span class="n">pp</span> <span class="n">db</span><span class="o">.</span><span class="n">find</span> <span class="p">{</span> <span class="o">|</span><span class="n">_</span><span class="p">,</span> <span class="n">value</span><span class="o">|</span> <span class="n">value</span> <span class="o">=~</span> <span class="sr">/\A\D/</span> <span class="p">}</span>
<span class="k">end</span>
</pre></div>
<p>Running that gives us:</p>
<div class="highlight highlight-ruby"><pre><span class="n">db</span><span class="o">[</span><span class="s2">"one"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"1"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"two"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"2"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"three"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"3"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"four"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"4"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"users:1"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"James"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"users:2"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"Ruby"</span>
<span class="o">[[</span><span class="s2">"users:1"</span><span class="p">,</span> <span class="s2">"James"</span><span class="o">]</span><span class="p">,</span> <span class="o">[</span><span class="s2">"users:2"</span><span class="p">,</span> <span class="s2">"Ruby"</span><span class="o">]]</span>
<span class="o">[</span><span class="s2">"users:1"</span><span class="p">,</span> <span class="s2">"James"</span><span class="o">]</span>
</pre></div>
<p>You can see that we get an <code>each()</code> that walks key-value pairs, just as a <code>Hash</code> would. We also get all of the other standard <code>Enumerable</code> iterators. This gives us several different ways to comb the data for specific keys.</p>
<p>When you are done playing around with data, you have multiple options for getting rid of it. You can just <code>clear()</code> all keys if you are sure that's safe. Of course, just deleting the file has pretty much the same effect. If you need to selectively remove data, you can <code>delete()</code> a single key-value pair or use the <code>delete_if()</code> iterator to programmatically remove pairs.</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="ss">:one</span><span class="p">)</span> <span class="c1"># => "1"</span>
<span class="n">db</span><span class="o">.</span><span class="n">delete_if</span> <span class="p">{</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">_</span><span class="o">|</span> <span class="n">key</span> <span class="o">=~</span> <span class="sr">/\Ausers:/</span> <span class="p">}</span>
<span class="n">db</span><span class="o">.</span><span class="n">keys</span> <span class="c1"># => ["two", "three", "four"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">clear</span>
<span class="n">db</span><span class="o">.</span><span class="n">keys</span> <span class="c1"># => []</span>
<span class="k">end</span>
</pre></div>
<p>The <code>delete()</code> method does return the value for the removed key, or <code>nil</code> if that key wasn't in the database. That feature isn't really safe if multiple processes are manipulating the data at once, unless you take the right precautions. We will talk a lot more about that later though.</p>
<p>That covers the basic <code>Hash</code> style interface to Tokyo Cabinet. Let's move into some other aspects of the library now.</p>
<h4>Counters and Appended Values</h4>
<p>We've already seen the standard <code>Hash</code>-like method of storing data with <code>db[:key] = :value</code>. The less common <code>store()</code> method from <code>Hash</code> is also supported (as is <code>fetch()</code> for retrieving values), so you can do things like <code>db.store(:key, :value)</code>. The advantage of using <code>store()</code> is that it supports modes. You can use these modes to manipulate values in different ways. Let's look at some of the options.</p>
<p>Most key-value stores provide an action for atomically incrementing a counter and Tokyo Cabinet is no exception. This is important because it allows you to track unique IDs. Have a look at the various ways you can use the <code>store()</code> method to manage counters with the <code>:add</code> mode:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"globals:user_id"</span><span class="o">]</span> <span class="c1"># => nil</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"globals:float"</span><span class="o">]</span> <span class="c1"># => nil</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"globals:user_id"</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="ss">:add</span><span class="p">)</span> <span class="c1"># => 1</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"globals:user_id"</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="ss">:add</span><span class="p">)</span> <span class="c1"># => 2</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"globals:float"</span><span class="p">,</span> <span class="mi">2</span><span class="o">.</span><span class="mi">1</span><span class="p">,</span> <span class="ss">:add</span><span class="p">)</span> <span class="c1"># => 2.1</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"globals:user_id"</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="ss">:add</span><span class="p">)</span> <span class="c1"># => 1</span>
<span class="k">end</span>
</pre></div>
<p>While all of that should be pretty obvious, this mode has a few gotchas you want to stay aware of. It's OK to start <code>:add</code>ing to a <code>nil</code> field as I've shown above, but don't try to use a field already set to a non-<code>:add</code>ed value or you will likely get an <code>OklahomaMixer::Error::CabinetError</code>. This is true even if you have what you think is a number in the value. Tokyo Cabinet's numbers are a C-ish chunk of bytes so it won't recognize digits in <code>String</code> form. This also means you don't generally want to read an <code>:add</code>ed value with a normal call to <code>[]</code>. It probably won't look like anything you are expecting. Tokyo Cabinet also uses different formats for <code>Integer</code> and <code>Float</code> values, so you will get the same error if you try to switch. Always add the same type of number to a given field.</p>
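<p>The difference between digits in a <code>String</code> and a C-style number is easy to see with Ruby's own <code>Array#pack</code>. This is an illustration only: the helper names are made up, and the 32-bit little-endian integer and 64-bit little-endian double used here are assumptions for the sake of a reproducible example. The exact layout Tokyo Cabinet writes depends on the platform and isn't something to rely on.</p>

```ruby
# Hypothetical helpers: pack numbers the way a C program might store them.
def pack_int(number)
  [number].pack("l<")  # 32-bit signed integer, little-endian (assumed layout)
end

def pack_float(number)
  [number].pack("E")   # 64-bit double, little-endian (assumed layout)
end

# Digits stored as text: one byte per character.
"1".bytes                        # => [49]

# The same number as a C-style integer looks nothing like the text form.
pack_int(1).bytes                # => [1, 0, 0, 0]
pack_int(1).unpack1("l<")        # => 1

# A Float packs into a different, 8-byte layout, so the two don't mix.
pack_float(2.1).bytesize         # => 8
pack_float(2.1).unpack1("E")     # => 2.1
```

<p>Read back as a <code>String</code>, the packed integer is four mostly unprintable bytes, which is why a plain <code>[]</code> on a counter field looks like garbage.</p>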
<p>Another unusual type of value management can be done in Tokyo Cabinet by appending to a value with <code>:cat</code> mode. For example:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="ss">:friend_ids</span><span class="p">,</span> <span class="s2">" 1"</span><span class="p">,</span> <span class="ss">:cat</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="ss">:friend_ids</span><span class="p">,</span> <span class="s2">" 3"</span><span class="p">,</span> <span class="ss">:cat</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="ss">:friend_ids</span><span class="p">,</span> <span class="s2">" 5"</span><span class="p">,</span> <span class="ss">:cat</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="ss">:friend_ids</span><span class="p">,</span> <span class="s2">" 3"</span><span class="p">,</span> <span class="ss">:cat</span><span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:friend_ids</span><span class="o">]</span> <span class="c1"># => " 1 3 5 3"</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:friend_ids</span><span class="o">].</span><span class="n">to_s</span><span class="o">.</span><span class="n">scan</span><span class="p">(</span><span class="sr">/\S+/</span><span class="p">)</span><span class="o">.</span><span class="n">uniq</span> <span class="c1"># => ["1", "3", "5"]</span>
<span class="k">end</span>
</pre></div>
<p>As you can see, this method will create a value if it didn't exist and then continue appending to the value after it does. If you need the opposite behavior, to avoid messing with a key that already exists, try <code>:keep</code> mode instead:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:exists</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"Can't touch this!"</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="ss">:exists</span><span class="p">,</span> <span class="s2">"Lost."</span><span class="p">,</span> <span class="ss">:keep</span><span class="p">)</span> <span class="c1"># => false</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:exists</span><span class="o">]</span> <span class="c1"># => "Can't touch this!"</span>
<span class="k">end</span>
</pre></div>
<p>Similarly, you can just pass a block to <code>store()</code> that will be called if a key already exists. That block is expected to return the value that should be saved to the database:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">adder</span> <span class="o">=</span> <span class="nb">lambda</span> <span class="p">{</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">old_value</span><span class="p">,</span> <span class="n">new_value</span><span class="o">|</span> <span class="n">old_value</span><span class="o">.</span><span class="n">to_i</span> <span class="o">+</span> <span class="n">new_value</span> <span class="p">}</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:num</span><span class="o">]</span> <span class="c1"># => nil</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="ss">:num</span><span class="p">,</span> <span class="mi">41</span><span class="p">,</span> <span class="o">&</span><span class="n">adder</span><span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:num</span><span class="o">]</span> <span class="c1"># => "41"</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="ss">:num</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="o">&</span><span class="n">adder</span><span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:num</span><span class="o">]</span> <span class="c1"># => "42"</span>
<span class="k">end</span>
</pre></div>
<p>These modes give you some powerful ways to build up values over time, even with different processes working on the same data. Their effects are atomic and that's important in any multiprocessing environment.</p>
<h4>Transactions</h4>
<p>Transactions are a big part of what makes Tokyo Cabinet great to work with. With them you can define a set of actions that must succeed or fail as a whole. Let's start with the classic example of transferring money between accounts:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:1:balance"</span><span class="o">]</span> <span class="o">=</span> <span class="mi">100</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:2:balance"</span><span class="o">]</span> <span class="o">=</span> <span class="mi">100</span>
<span class="n">db</span><span class="o">.</span><span class="n">transaction</span> <span class="k">do</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:1:balance"</span><span class="o">]</span> <span class="o">=</span> <span class="n">db</span><span class="o">[</span><span class="s2">"accounts:1:balance"</span><span class="o">].</span><span class="n">to_i</span> <span class="o">-</span> <span class="mi">10</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:2:balance"</span><span class="o">]</span> <span class="o">=</span> <span class="n">db</span><span class="o">[</span><span class="s2">"accounts:2:balance"</span><span class="o">].</span><span class="n">to_i</span> <span class="o">+</span> <span class="mi">10</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:1:balance"</span><span class="o">]</span> <span class="c1"># => "90"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:2:balance"</span><span class="o">]</span> <span class="c1"># => "110"</span>
<span class="k">end</span>
</pre></div>
<p>That code should be easy to understand. I just removed an amount from one account and added that same amount to the other. I've done this transfer inside of a <code>transaction()</code>, but it doesn't really have any effect when things go right as they did here. Let's break something and see what happens:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:1:balance"</span><span class="o">]</span> <span class="o">=</span> <span class="mi">100</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:2:balance"</span><span class="o">]</span> <span class="o">=</span> <span class="mi">100</span>
<span class="k">begin</span>
<span class="n">db</span><span class="o">.</span><span class="n">transaction</span> <span class="k">do</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:1:balance"</span><span class="o">]</span> <span class="o">=</span> <span class="n">db</span><span class="o">[</span><span class="s2">"accounts:1:balance"</span><span class="o">].</span><span class="n">to_i</span> <span class="o">-</span> <span class="mi">10</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:2:balance"</span><span class="o">]</span> <span class="o">=</span> <span class="n">db</span><span class="o">[</span><span class="s2">"accounts:2:balance"</span><span class="o">].</span><span class="n">to_i</span> <span class="o">+</span> <span class="mi">10</span>
<span class="nb">fail</span> <span class="s2">"Oops!"</span>
<span class="k">end</span>
<span class="k">rescue</span>
<span class="c1"># do nothing: just continue on to checking the balances</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:1:balance"</span><span class="o">]</span> <span class="c1"># => "100"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"accounts:2:balance"</span><span class="o">]</span> <span class="c1"># => "100"</span>
<span class="k">end</span>
</pre></div>
<p>This time we see the difference. Both of my actions against the database had already been processed. However, my <code>fail()</code> call was part of the same <code>transaction()</code> and the <code>Exception</code> meant everything had to be undone. Notice that the account balances were restored to their previous values.</p>
<p>It is possible for you to cancel a <code>transaction()</code> without triggering an <code>Exception</code>. That's what the <code>abort()</code> method is for:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -KU</span>
<span class="nb">require</span> <span class="s2">"oklahoma_mixer"</span>
<span class="no">OklahomaMixer</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"data.tch"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">db</span><span class="o">|</span>
<span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"globals:user_id"</span><span class="p">,</span> <span class="mi">41</span><span class="p">,</span> <span class="ss">:add</span><span class="p">)</span> <span class="c1"># pretend we have a few users</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"users:42:last_name"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"Nobody"</span> <span class="c1"># and some bad data</span>
<span class="n">user</span> <span class="o">=</span> <span class="p">{</span><span class="ss">:first_name</span> <span class="o">=></span> <span class="s2">"James"</span><span class="p">,</span> <span class="ss">:last_name</span> <span class="o">=></span> <span class="s2">"Gray"</span><span class="p">}</span>
<span class="n">db</span><span class="o">.</span><span class="n">transaction</span> <span class="k">do</span>
<span class="n">user</span><span class="o">[</span><span class="ss">:id</span><span class="o">]</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"globals:user_id"</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="ss">:add</span><span class="p">)</span>
<span class="k">if</span> <span class="n">user</span><span class="o">.</span><span class="n">all?</span> <span class="p">{</span> <span class="o">|</span><span class="n">k</span><span class="p">,</span> <span class="n">v</span><span class="o">|</span> <span class="n">db</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="s2">"users:</span><span class="si">#{</span><span class="n">user</span><span class="o">[</span><span class="ss">:id</span><span class="o">]</span><span class="si">}</span><span class="s2">:</span><span class="si">#{</span><span class="n">k</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="n">v</span><span class="p">,</span> <span class="ss">:keep</span><span class="p">)</span> <span class="p">}</span>
<span class="n">user</span><span class="o">[</span><span class="ss">:saved</span><span class="o">]</span> <span class="o">=</span> <span class="kp">true</span>
<span class="k">else</span>
<span class="n">db</span><span class="o">.</span><span class="n">abort</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">unless</span> <span class="n">user</span><span class="o">[</span><span class="ss">:saved</span><span class="o">]</span>
<span class="nb">puts</span> <span class="s2">"Unable to save user. Problem field(s):"</span>
<span class="n">user</span><span class="o">.</span><span class="n">each_key</span> <span class="k">do</span> <span class="o">|</span><span class="n">key</span><span class="o">|</span>
<span class="k">if</span> <span class="n">value</span> <span class="o">=</span> <span class="n">db</span><span class="o">[</span><span class="s2">"users:</span><span class="si">#{</span><span class="n">user</span><span class="o">[</span><span class="ss">:id</span><span class="o">]</span><span class="si">}</span><span class="s2">:</span><span class="si">#{</span><span class="n">key</span><span class="si">}</span><span class="s2">"</span><span class="o">]</span>
<span class="nb">puts</span> <span class="sx">%Q{db["users:</span><span class="si">#{</span><span class="n">user</span><span class="o">[</span><span class="ss">:id</span><span class="o">]</span><span class="si">}</span><span class="sx">:</span><span class="si">#{</span><span class="n">key</span><span class="si">}</span><span class="sx">"] = </span><span class="si">#{</span><span class="n">value</span><span class="o">.</span><span class="n">inspect</span><span class="si">}</span><span class="sx">}</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="c1"># >> Unable to save user. Problem field(s):</span>
<span class="c1"># >> db["users:42:last_name"] = "Nobody"</span>
<span class="k">end</span>
<span class="k">end</span>
</pre></div>
<p>As you can see, <code>abort()</code> didn't toss an <code>Exception</code> but it rolled back my <code>transaction()</code> all the same. None of the new user fields were added to the database because they couldn't all be safely added. I knew that because one of the <code>:keep</code> mode calls to <code>store()</code> returned <code>false</code> when it tried to set an already existing key.</p>
<p>That's the magic of transactions. They are an all-or-nothing thing. Only if your block completes with no <code>Exception</code> raised and no call to <code>abort()</code> will all of the changes be made.</p>
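<p>To make the all-or-nothing rule concrete, here's a toy model built on a plain <code>Hash</code>. To be clear, <code>TinyStore</code> and its <code>Abort</code> class are hypothetical names I made up for this sketch. This is not how Tokyo Cabinet or Oklahoma Mixer implement transactions; it only mimics the behavior described above by restoring a snapshot:</p>

```ruby
# Hypothetical stand-in for a database: a Hash with snapshot-based rollback.
class TinyStore
  class Abort < StandardError; end  # made-up signal used by abort()

  def initialize
    @data = {}
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value.to_s  # values read back as Strings, as in the examples
  end

  def transaction
    snapshot = @data.dup  # remember the state before the block runs
    yield
    true
  rescue Abort
    @data = snapshot      # abort(): undo quietly
    false
  rescue StandardError
    @data = snapshot      # any other error: undo, then let it propagate
    raise
  end

  def abort
    raise Abort
  end
end

store = TinyStore.new
store["accounts:1:balance"] = 100
begin
  store.transaction do
    store["accounts:1:balance"] = store["accounts:1:balance"].to_i - 10
    fail "Oops!"
  end
rescue RuntimeError
  # do nothing: just continue on to checking the balance
end
store["accounts:1:balance"]  # => "100"
```

<p>A real database has to do this safely on disk and across processes, of course, which is exactly the work you are handing off to Tokyo Cabinet.</p>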
<h4>Database File Maintenance</h4>
<p>There are a lot of advantages that come with a database that's just one file in the file system. You can build symlinks to it, set permissions on it, and check its size with the normal tools your OS provides (though Oklahoma Mixer does have a <code>file_size()</code> method that returns the file size in bytes, if you need it). Of course, there are also tradeoffs you should stay aware of.</p>
<p>First, the file can get a little bloated over time. The reason is normal fragmentation: Tokyo Cabinet may clear a key freeing up some space and later fill it with a not-quite-as-big item. It may not find a good use for the even smaller remaining space for a long time. This creates small pockets of unused space that grow the file over time.</p>
<p>The easiest way to deal with this is to call <code>defrag()</code> periodically during a slow period. The call will lock up the database for a few seconds while Tokyo Cabinet cleans it up, reclaiming the wasted space and shrinking the file size back down (assuming it was fragmented).</p>
<p>Another issue to stay aware of is how you make backup copies of the database file. You need to be careful about using standard tools like <code>cp</code> or <code>rsync</code> on a Tokyo Cabinet file. It's fine if you know all connections to it are currently closed, but it's not safe when a connection might be changing the data inside of it mid-copy. If you try that, you will likely get a corrupt copy of the database.</p>
<p>The solution is to call <code>copy()</code> and pass in the path where you would like to create a copy of the database. It will synchronize the data, lock out changes, and then make a full duplicate. This process is quite snappy, even with bigger data sets. If desired, you can ask Oklahoma Mixer for the <code>path()</code> of the original database, edit it in some small way, and use that as the path for the duplicate database.</p>
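<p>Here's one way the "edit it in some small way" step might look. The <code>backup_path_for()</code> helper and the <code>"backup"</code> tag are my own inventions for this sketch, and the final <code>copy()</code> call is left as a comment since it needs an open database:</p>

```ruby
# Hypothetical helper: turns "data.tch" into "data.backup.tch".
def backup_path_for(path, tag = "backup")
  ext = File.extname(path)              # ".tch" (or "" if there is none)
  "#{path.chomp(ext)}.#{tag}#{ext}"     # slip the tag in before the extension
end

backup_path_for("data.tch")          # => "data.backup.tch"
backup_path_for("/var/db/data.tch")  # => "/var/db/data.backup.tch"

# With an open database, usage might then look like:
#   db.copy(backup_path_for(db.path))
```

<p>Keeping the original extension on the duplicate seems wise, since the examples in this series pick the file name (and its extension) to match the database type.</p>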
<p>Just make sure you keep these issues in mind as you plan out your storage.</p>
<p>Those are the basics of using Tokyo Cabinet as a key-value store, but there's really a lot more to what Tokyo Cabinet can do. I'll show you what all is built onto this simple foundation in upcoming articles.</p>
James Edward Gray II
Where Redis is a Good Fit
tag:graysoftinc.com,2009-09-17:/posts/91
2014-04-18T22:46:41Z
A brief discussion of the kinds of problems Redis makes easier.
<p>Like any system, Redis has strengths and weaknesses. Some of the biggest positives with Redis are:</p>
<ul>
<li>It's wicked fast. In fact, it may just be the fastest key-value store.</li>
<li>The collection types and the atomic operations that work on them allow you to model some moderately complex data scenarios. This makes Redis fit some higher order problems where a simple key-value store wouldn't quite be enough.</li>
<li>The snapshot data dumping model can be an asset. You get persistence with Redis, but you pay a minimal penalty for it.</li>
</ul><p>Of course, there are always some minuses. These are the two I consider the most important:</p>
<ul>
<li>Redis is an in-memory data store, first and foremost. That means your entire dataset must fit completely in RAM and leave enough breathing room for anything else the server must do.</li>
<li>Snapshot backups are not perfect. If Redis fails between snapshots, you can lose data. You need to make sure that's acceptable for any application you use it in.</li>
</ul><p>It may seem weird to call snapshots both a pro and a con, but it does work for you in some ways and against you in others. You have to decide where the trade-off is worth it.</p>
<p>Given the above breakdown, I will list three places where I think Redis can be the right tool for the job. This is not meant to be an exhaustive list. I'm sure there are many other places Redis could be well used. Instead, it's better to look at why I've chosen these areas and try Redis with problems that have similar needs.</p>
<ul>
<li>
<strong>Redis makes an excellent cache or, more specifically, a memcached replacement.</strong> The reasoning for this is simple: you pretty much get memcached's features, plus lists and sets. You can use those collections to answer some queries, saving you trips to more expensive databases. That means you can use your cache more and increase the benefits of having it. You also get some persistence, which isn't critical in this scenario but is a nice value add.</li>
<li>
<strong>Redis is an ideal realtime statistics tracker.</strong> If you are tracking stats in realtime, there are three things you really care about: speed, speed, and speed. Some nice atomic operations don't hurt either. From simple counters, to audit logs, to sets of unique IP addresses, and much more, Redis really rocks this kind of problem domain.</li>
<li>
<strong>Redis can be the primary database for some Web applications.</strong> This one isn't as much of a given as the other two, obviously, and you can see that I've switched to using words like <em>can</em> and <em>some</em>. However, when your database needs are simple, Redis may be enough tool for the job.<br><br>
If you are going to try Redis in this role, you need to stay very aware of its minuses. For example, will it be OK if you lose some keys here and there? As you consider that though, do remember that you can force a save when needed. Perhaps it would be enough to force saves after a new user account is created and play things a little looser with the rest of the data, for example. It's also worth noting that Redis supports master-slave replication which can help reduce this risk.<br><br>
When considering if the entire database can be in memory at once, consider how fast the data will grow as well. Are you going to have time to monitor usage and react long before you run into a nasty limit?<br><br>
That said, there are plenty of applications that fit into those criteria. Take this blog for example. I'm sure the contents of it fit into a reasonable chunk of memory, it changes so little I could afford a hard save after every single write, and the rest of the time Redis would pay me back with crazy great reading speed.<br><br>
There's even a nice object mapping library for Redis: <a href="http://ohm.keyvalue.org/">Ohm</a>. I do encourage you to play with Redis at a lower level before resorting to such shortcuts though.</li>
</ul><p>Hopefully this series has given you some ideas of how you might use Redis. It's not right for every application, but it can really be a big win where it fits. You should now know enough to watch for such opportunities and take advantage of them.</p>
James Edward Gray II
Lists and Sets in Redis
tag:graysoftinc.com,2009-09-16:/posts/90
2014-04-18T22:28:37Z
Peeking behind the curtain of the simple key-value store facade to get a good look at the two features that make Redis unique and powerful.
<p><em>[<strong>Update</strong>: though all of the techniques I show here still apply, many methods of the Redis gem have changed names to match the actual <a href="http://redis.io/commands">Redis commands</a> they call. There are also easier and more powerful ways to do some of what I show in here, thanks to additions to Redis.]</em></p>
<p>Redis adds one huge twist to traditional key-value storage: collections. Supporting both <em>lists</em> and <em>sets</em> through some very powerful atomic operations allows for advanced key-value usage.</p>
<h4>Lists</h4>
<p>Redis allows a single key to hold a list of values. This is your typical ordered list with the operations you would expect: appending, indexed access, and access to a range of values.</p>
<p>This has many potential uses. I'll cover two that I think will be very common. First, if you are going to use Redis as a full database, you can store things that are naturally a list of items, like comments, in a real list. Let's look at some code:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="no">CLEAR</span> <span class="o">=</span> <span class="sb">`clear`</span>
<span class="c1"># create an article to comment on</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">article_id</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">incr</span><span class="p">(</span><span class="s2">"global:next_article_id"</span><span class="p">)</span>
<span class="n">article</span> <span class="o">=</span> <span class="s2">"article:</span><span class="si">#{</span><span class="n">article_id</span><span class="si">}</span><span class="s2">"</span>
<span class="k">class</span> <span class="o"><<</span> <span class="n">article</span>
<span class="k">def</span> <span class="nf">method_missing</span><span class="p">(</span><span class="n">field</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">&</span><span class="n">blk</span><span class="p">)</span>
<span class="k">return</span> <span class="k">super</span> <span class="k">unless</span> <span class="n">field</span><span class="o">.</span><span class="n">to_s</span> <span class="o">!~</span> <span class="sr">/[!?=]\z/</span> <span class="o">&&</span> <span class="n">args</span><span class="o">.</span><span class="n">empty?</span> <span class="o">&&</span> <span class="n">blk</span><span class="o">.</span><span class="n">nil?</span>
<span class="s2">"</span><span class="si">#{</span><span class="nb">self</span><span class="si">}</span><span class="s2">:</span><span class="si">#{</span><span class="n">field</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">[</span><span class="n">article</span><span class="o">.</span><span class="n">title</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"My Favorite Language"</span>
<span class="n">db</span><span class="o">[</span><span class="n">article</span><span class="o">.</span><span class="n">body</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"I love Ruby!"</span>
<span class="c1"># initialize some session details</span>
<span class="n">comments_per_page</span> <span class="o">=</span> <span class="mi">2</span>
<span class="n">comment_page</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">login</span> <span class="o">=</span> <span class="no">ARGV</span><span class="o">.</span><span class="n">shift</span> <span class="o">||</span> <span class="s2">"JEG2"</span>
<span class="kp">loop</span> <span class="k">do</span>
<span class="c1"># show article</span>
<span class="nb">print</span> <span class="no">CLEAR</span>
<span class="nb">puts</span> <span class="s2">"</span><span class="si">#{</span><span class="n">db</span><span class="o">[</span><span class="n">article</span><span class="o">.</span><span class="n">title</span><span class="o">]</span><span class="si">}</span><span class="s2">:"</span>
<span class="nb">puts</span> <span class="s2">" </span><span class="si">#{</span><span class="n">db</span><span class="o">[</span><span class="n">article</span><span class="o">.</span><span class="n">body</span><span class="o">]</span><span class="si">}</span><span class="s2">"</span>
<span class="c1"># paginate comments</span>
<span class="n">start</span> <span class="o">=</span> <span class="n">comments_per_page</span> <span class="o">*</span> <span class="p">(</span><span class="n">comment_page</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">finish</span> <span class="o">=</span> <span class="n">start</span> <span class="o">+</span> <span class="n">comments_per_page</span> <span class="o">-</span> <span class="mi">1</span>
<span class="n">comments</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">list_range</span><span class="p">(</span><span class="n">article</span><span class="o">.</span><span class="n">comments</span><span class="p">,</span> <span class="n">start</span><span class="p">,</span> <span class="n">finish</span><span class="p">)</span>
<span class="n">pagination</span> <span class="o">=</span> <span class="nb">Array</span><span class="p">(</span><span class="n">start</span><span class="o">.</span><span class="n">zero?</span> <span class="p">?</span> <span class="kp">nil</span> <span class="p">:</span> <span class="s2">"(p)revious"</span><span class="p">)</span>
<span class="n">pagination</span> <span class="o"><<</span> <span class="s2">"(n)ext"</span> <span class="k">if</span> <span class="n">db</span><span class="o">.</span><span class="n">list_length</span><span class="p">(</span><span class="n">article</span><span class="o">.</span><span class="n">comments</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span> <span class="o">></span> <span class="n">finish</span>
<span class="c1"># show comments</span>
<span class="n">comments</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">comment</span><span class="o">|</span>
<span class="n">posted</span><span class="p">,</span> <span class="n">user</span><span class="p">,</span> <span class="n">body</span> <span class="o">=</span> <span class="n">comment</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s2">"|"</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="nb">puts</span> <span class="s2">"----"</span>
<span class="nb">puts</span> <span class="s2">" </span><span class="si">#{</span><span class="n">body</span><span class="si">}</span><span class="s2">"</span>
<span class="nb">puts</span> <span class="s2">" posted by </span><span class="si">#{</span><span class="n">user</span><span class="si">}</span><span class="s2"> on </span><span class="si">#{</span><span class="n">posted</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="c1"># handle commands</span>
<span class="nb">puts</span>
<span class="nb">print</span> <span class="s2">"Command? [</span><span class="si">#{</span><span class="p">(</span><span class="sx">%w[(c)omment (q)uit]</span> <span class="o">+</span> <span class="n">pagination</span><span class="p">)</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="s1">', '</span><span class="p">)</span><span class="si">}</span><span class="s2">] "</span>
<span class="k">case</span> <span class="p">(</span><span class="n">command</span> <span class="o">=</span> <span class="nb">gets</span><span class="p">)</span>
<span class="k">when</span> <span class="sr">/\Ac(?:omment)?\Z/i</span> <span class="c1"># add a comment</span>
<span class="nb">print</span> <span class="s2">"Your comment? "</span>
<span class="n">comment</span> <span class="o">=</span> <span class="nb">gets</span> <span class="ow">or</span> <span class="k">break</span>
<span class="n">posted</span> <span class="o">=</span> <span class="no">Time</span><span class="o">.</span><span class="n">now</span><span class="o">.</span><span class="n">strftime</span><span class="p">(</span><span class="s1">'%m/%d/%Y at %H:%M:%S'</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">push_tail</span><span class="p">(</span><span class="n">article</span><span class="o">.</span><span class="n">comments</span><span class="p">,</span> <span class="s2">"</span><span class="si">#{</span><span class="n">posted</span><span class="si">}</span><span class="s2">|</span><span class="si">#{</span><span class="n">login</span><span class="si">}</span><span class="s2">|</span><span class="si">#{</span><span class="n">comment</span><span class="o">.</span><span class="n">strip</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
<span class="k">when</span> <span class="sr">/\Ap(?:revious)?\Z/i</span> <span class="c1"># view previous page of comments</span>
<span class="k">if</span> <span class="n">pagination</span><span class="o">.</span><span class="n">first</span> <span class="o">=~</span> <span class="sr">/\A\(p\)/</span>
<span class="n">comment_page</span> <span class="o">-=</span> <span class="mi">1</span>
<span class="k">else</span>
<span class="nb">puts</span> <span class="s2">"You are on the first page of comments."</span>
<span class="nb">gets</span> <span class="ow">or</span> <span class="k">break</span>
<span class="k">end</span>
<span class="k">when</span> <span class="sr">/\An(?:ext)?\Z/i</span> <span class="c1"># view next page of comments</span>
<span class="k">if</span> <span class="n">pagination</span><span class="o">.</span><span class="n">last</span> <span class="o">=~</span> <span class="sr">/\A\(n\)/</span>
<span class="n">comment_page</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">else</span>
<span class="nb">puts</span> <span class="s2">"You are on the last page of comments."</span>
<span class="nb">gets</span> <span class="ow">or</span> <span class="k">break</span>
<span class="k">end</span>
<span class="k">when</span> <span class="sr">/\Aq(?:uit)?\Z/i</span><span class="p">,</span> <span class="kp">nil</span> <span class="c1"># exit program</span>
<span class="k">break</span>
<span class="k">end</span>
<span class="k">end</span>
</pre></div>
<p>I know that looks like a lot of code, but it's mostly interface. It also shows many of the common list interactions in just three methods.</p>
<p>You can see that adding to the list was a simple matter of calling <code>push_tail()</code>. Similar to how <code>incr()</code>/<code>decr()</code> initialize counters to <code>0</code>, list actions will default to an empty list if the key is undefined when they are triggered. You will get an error if you use the operations on keys that have already been set to non-list values though, so be careful with that.</p>
<p>When I was ready to read the list back, I paginated through the results using <code>list_length()</code> and <code>list_range()</code>. You pass a starting and ending index to <code>list_range()</code> and negative indices can be used to count backwards from the end just as Ruby allows. Redis doesn't allow you to read a full list with a simple key lookup (<code>[]</code> or <code>get()</code>), so use <code>list_range(…, 0, -1)</code> instead.</p>
<p>Here is what this example looks like in practice after I've entered a few comments:</p>
<pre><code>My Favorite Language:
I love Ruby!
----
First!
posted by JEG2 on 09/07/2009 at 15:35:11
----
Yeah, we know.
posted by JEG2 on 09/07/2009 at 15:35:29
Command? [(c)omment, (q)uit, (n)ext]
</code></pre>
<p>Then if I type an <code>n</code> followed by a <code>return</code>:</p>
<pre><code>My Favorite Language:
I love Ruby!
----
ruby { |love| love + 1 }
posted by JEG2 on 09/07/2009 at 15:35:52
----
Who doesn't?
posted by JEG2 on 09/07/2009 at 15:36:20
Command? [(c)omment, (q)uit, (p)revious, (n)ext]
</code></pre>
<p>You get the idea.</p>
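<p>The index arithmetic behind that kind of pagination can be sketched without a server. This is a minimal pure-Ruby stand-in, not the code from the program above: an <code>Array</code> plays the role of the Redis list, and <code>COMMENTS_PER_PAGE</code>, <code>page_bounds()</code>, and <code>page_count()</code> are hypothetical names I made up for illustration. The last sample comment is invented data.</p>

```ruby
# Pure-Ruby stand-in: an Array plays the role of the Redis list.
COMMENTS_PER_PAGE = 2 # hypothetical page size

# Inclusive start and end indices for a zero-based page number,
# the same shape of arguments list_range() is described as taking.
def page_bounds(page, per_page = COMMENTS_PER_PAGE)
  first = page * per_page
  [first, first + per_page - 1]
end

# Number of pages needed for a list of the given length,
# which is what list_length() would feed into.
def page_count(length, per_page = COMMENTS_PER_PAGE)
  (length + per_page - 1) / per_page
end

comments = [
  "First!",
  "Yeah, we know.",
  "ruby { |love| love + 1 }",
  "Who doesn't?",
  "Me too." # invented sample entry
]

first, last = page_bounds(1)
page        = comments[first..last] # stand-in for list_range(key, first, last)
everything  = comments[0..-1]       # stand-in for list_range(key, 0, -1)

puts page
puts "Pages: #{page_count(comments.length)}"
```

<p>Because both indices are inclusive, a page of two comments spans <code>first</code> through <code>first + 1</code>, and the final page may simply come back short.</p>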
<p>Let's go back to how I added items to the list for just a moment though. As the name <code>push_tail()</code> might lead you to guess, there are three related methods: <code>pop_tail()</code>, <code>push_head()</code>, and <code>pop_head()</code>. The <code>push_head()</code> method will add items to the beginning of the list, instead of the end as I did above. Then <code>pop_head()</code> and <code>pop_tail()</code> can be used to remove entries from either end. This means that lists can function as stacks and queues, just as Ruby's <code>Array</code> does.</p>
<p>I believe using Redis lists as queues will be another popular setup:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="no">WORKER_COUNT</span> <span class="o">=</span> <span class="mi">3</span>
<span class="c1"># spawn some workers</span>
<span class="no">WORKER_COUNT</span><span class="o">.</span><span class="n">times</span> <span class="k">do</span>
<span class="nb">fork</span> <span class="k">do</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="kp">loop</span> <span class="k">do</span>
<span class="k">next</span> <span class="k">unless</span> <span class="n">work</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">pop_head</span><span class="p">(</span><span class="ss">:work</span><span class="p">)</span>
<span class="k">break</span> <span class="k">if</span> <span class="n">work</span> <span class="o">==</span> <span class="s2">"QUIT"</span>
<span class="nb">puts</span> <span class="s2">"Work from </span><span class="si">#{</span><span class="no">Process</span><span class="o">.</span><span class="n">pid</span><span class="si">}</span><span class="s2">: </span><span class="si">#{</span><span class="n">work</span><span class="si">}</span><span class="s2"> = </span><span class="si">#{</span><span class="nb">eval</span> <span class="n">work</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="c1"># generate some work</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="mi">10</span><span class="o">.</span><span class="n">times</span> <span class="k">do</span>
<span class="n">db</span><span class="o">.</span><span class="n">push_tail</span><span class="p">(</span> <span class="ss">:work</span><span class="p">,</span>
<span class="s2">"</span><span class="si">#{</span><span class="nb">rand</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span><span class="si">}</span><span class="s2"> </span><span class="si">#{</span><span class="sx">%w[+ - * / %]</span><span class="o">[</span><span class="nb">rand</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span><span class="o">]</span><span class="si">}</span><span class="s2"> </span><span class="si">#{</span><span class="nb">rand</span><span class="p">(</span><span class="mi">9</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="si">}</span><span class="s2">"</span> <span class="p">)</span>
<span class="k">end</span>
<span class="c1"># finish off all processes</span>
<span class="no">WORKER_COUNT</span><span class="o">.</span><span class="n">times</span> <span class="k">do</span>
<span class="n">db</span><span class="o">.</span><span class="n">push_tail</span><span class="p">(</span><span class="ss">:work</span><span class="p">,</span> <span class="s2">"QUIT"</span><span class="p">)</span>
<span class="k">end</span>
<span class="no">Process</span><span class="o">.</span><span class="n">waitall</span>
</pre></div>
<p>When I run that, we can see the workers pulling off their assignments and taking care of the work:</p>
<pre><code>Work from 1478: 9 + 9 = 18
Work from 1479: 0 / 7 = 0
Work from 1478: 1 % 3 = 1
Work from 1479: 2 % 4 = 2
Work from 1479: 4 * 7 = 28
Work from 1478: 1 / 9 = 0
Work from 1479: 1 + 1 = 2
Work from 1479: 3 * 4 = 12
Work from 1478: 6 - 9 = -3
Work from 1480: 0 / 1 = 0
</code></pre>
<p>Adding and removing work are both atomic actions. Once a worker has a job, it's guaranteed other workers won't receive it.</p>
<p>You aren't limited to a single queue either, of course. Each key can hold a queue, so you can divide jobs up by priority, time required, or anything else that makes sense.</p>
<p>I know of at least one big site using this kind of setup to process their background jobs. They switched to this approach because just saving the jobs to a relational database was too much of a time penalty for their needs.</p>
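<p>If you don't have a server handy, the same hand-off guarantee can be seen in miniature with plain Ruby. This is only an analogy under stated assumptions: Ruby's thread-safe <code>Queue</code> stands in for the Redis list, threads stand in for the forked workers, and the jobs are just numbers to square instead of strings to <code>eval</code>.</p>

```ruby
# Pure-Ruby analogy: Queue plays the Redis list, threads play the workers.
WORKER_COUNT = 3

work    = Queue.new # jobs waiting to be handled
results = Queue.new # [job, answer] pairs reported by the workers

workers = Array.new(WORKER_COUNT) do
  Thread.new do
    loop do
      job = work.pop # blocks until a job arrives; each job reaches one thread
      break if job == "QUIT"
      results << [job, job * job]
    end
  end
end

# generate some work, then one QUIT marker per worker
10.times { |n| work << n }
WORKER_COUNT.times { work << "QUIT" }
workers.each(&:join)

finished = Array.new(results.size) { results.pop }.sort
finished.each { |job, answer| puts "#{job} squared = #{answer}" }
```

<p>Each job shows up exactly once in <code>finished</code>, no matter which thread happened to grab it.</p>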
<p>You can edit a list by index using <code>list_set()</code>, but unlike in Ruby, you will get an error if you try to set an index that doesn't already exist. Use <code>list_index()</code> to read a single value at an index.</p>
<p>If you want to use lists as a circular buffer, say for storing recent log entries or notices without allowing the data to grow without bound, the <code>list_trim()</code> method is what you need. It takes the same arguments as <code>list_range()</code>, but instead of returning the indicated entries it shaves the list down to include only those entries. Thus, if you want to keep a list at no more than 100 entries, you can use code like:</p>
<div class="highlight highlight-ruby"><pre><span class="n">db</span><span class="o">.</span><span class="n">push_tail</span><span class="p">(</span><span class="ss">:list</span><span class="p">,</span> <span class="s2">"whatever"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">list_trim</span><span class="p">(</span><span class="ss">:list</span><span class="p">,</span> <span class="o">-</span><span class="mi">100</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
</pre></div>
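<p>To see the effect of that push-then-trim pair without a server, here is a small pure-Ruby model. The <code>push_capped()</code> helper and <code>LIMIT</code> constant are hypothetical names of mine. I use <code>Array#last</code> rather than the slice <code>list[-100..-1]</code> because Ruby returns <code>nil</code> for that slice when the list is still shorter than 100 entries, whereas Redis treats out-of-range trim indices leniently.</p>

```ruby
# Pure-Ruby model of "push to the tail, then trim to the newest LIMIT entries".
LIMIT = 100

# Returns a new Array holding at most `limit` of the most recent entries.
def push_capped(list, entry, limit = LIMIT)
  list.push(entry)
  list.last(limit) # safe even while the list is shorter than the limit
end

log = []
250.times { |n| log = push_capped(log, "entry #{n}") }

puts log.size  # never grows past LIMIT
puts log.first # oldest entry still kept
puts log.last  # newest entry
```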
<p>If you need to delete items out of the middle of a list, use the <code>list_rm()</code> method. It takes the key, a count of how many matching values to remove, and the value to match against. A count of <code>0</code> will remove all matching values from within the list and negative counts will start removing at the tail of the list, moving toward the head.</p>
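<p>The count rules are easier to keep straight with a concrete model. The following is a pure-Ruby sketch of the behavior just described, not Redis code: <code>remove_matching()</code> is a hypothetical helper that works on an <code>Array</code> and returns a new one.</p>

```ruby
# Pure-Ruby model of the count rules: 0 removes every match, a positive
# count removes matches starting at the head, a negative count starts
# at the tail and moves toward the head.
def remove_matching(list, count, value)
  return list.reject { |item| item == value } if count.zero?

  remaining = count.abs
  # Walk from the tail for negative counts, from the head otherwise.
  source = count.negative? ? list.reverse : list
  kept   = source.reject do |item|
    if item == value && remaining > 0
      remaining -= 1
      true
    else
      false
    end
  end
  count.negative? ? kept.reverse : kept
end

list = %w[a b a c a]
p remove_matching(list, 0, "a")  # every "a" goes
p remove_matching(list, 2, "a")  # the first two "a"s go
p remove_matching(list, -2, "a") # the last two "a"s go
```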
<p>Finally, you may want to know about the <code>rename()</code> method, in case you need to atomically replace an entire list. You can build up the new list under a separate key and then just use <code>rename()</code> to replace the old key. Be careful with <code>rename()</code> though because it expects the old key name followed by the new key name, which is backwards from how we usually do things in Ruby. If you need a <code>rename()</code> that won't destroy old data, there's also a <code>rename_unless_exists()</code> method.</p>
<h4>Sets</h4>
<p>Redis has one more type of collection: sets. Sets are probably even simpler to work with operation-wise, but they are a large source of the power Redis has over normal key-value stores.</p>
<p>To Redis a set is an unordered collection with no duplicate members. If you add an item to a set more than once, it will still just be listed one time.</p>
<p>Let's begin by looking at some basic set operations:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_member?</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"one"</span><span class="p">)</span> <span class="c1"># => false</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"one"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_member?</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"one"</span><span class="p">)</span> <span class="c1"># => true</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_delete</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"one"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_member?</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"one"</span><span class="p">)</span> <span class="c1"># => false</span>
<span class="mi">50</span><span class="o">.</span><span class="n">times</span> <span class="k">do</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"one"</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"two"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_count</span><span class="p">(</span><span class="ss">:nums</span><span class="p">)</span> <span class="c1"># => 2</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_members</span><span class="p">(</span><span class="ss">:nums</span><span class="p">)</span> <span class="c1"># => ["two", "one"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">spop</span><span class="p">(</span><span class="ss">:nums</span><span class="p">)</span> <span class="c1"># => "two"</span>
<span class="n">db</span><span class="o">.</span><span class="n">spop</span><span class="p">(</span><span class="ss">:nums</span><span class="p">)</span> <span class="c1"># => "one"</span>
<span class="n">db</span><span class="o">.</span><span class="n">spop</span><span class="p">(</span><span class="ss">:nums</span><span class="p">)</span> <span class="c1"># => nil</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:work</span><span class="p">,</span> <span class="s2">"write about sets"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:work</span><span class="p">,</span> <span class="s2">"write about sorting"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_members</span><span class="p">(</span><span class="ss">:work</span><span class="p">)</span> <span class="c1"># => ["write about sorting", "write about sets"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_move</span><span class="p">(</span><span class="ss">:work</span><span class="p">,</span> <span class="ss">:finished</span><span class="p">,</span> <span class="s2">"write about sets"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_members</span><span class="p">(</span><span class="ss">:work</span><span class="p">)</span> <span class="c1"># => ["write about sorting"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_members</span><span class="p">(</span><span class="ss">:finished</span><span class="p">)</span> <span class="c1"># => ["write about sets"]</span>
</pre></div>
<p>I assume each of those examples is pretty easy to follow. The first chunk of code shows adding to and deleting from a set, plus testing membership. After that we see the rules I mentioned earlier: no duplicates and unordered. Next we see the <code>spop()</code> method, which removes and returns a random member. The final chunk of code shows how you can atomically move members from one set to another. Those are the basics of sets.</p>
<p>The real power of sets comes from how you can use them to build simple queries in Redis. That's possible through the support of some atomic set operations: intersect(ion), union, and diff(erence). Here are some examples:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:odd</span><span class="p">,</span> <span class="s2">"one"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="s2">"two"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:odd</span><span class="p">,</span> <span class="s2">"three"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:divisible_by_3</span><span class="p">,</span> <span class="s2">"three"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="s2">"four"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:divisible_by_4</span><span class="p">,</span> <span class="s2">"four"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:odd</span><span class="p">,</span> <span class="s2">"five"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="s2">"six"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:divisible_by_3</span><span class="p">,</span> <span class="s2">"six"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:odd</span><span class="p">,</span> <span class="s2">"seven"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="s2">"eight"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:divisible_by_4</span><span class="p">,</span> <span class="s2">"eight"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:odd</span><span class="p">,</span> <span class="s2">"nine"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:divisible_by_3</span><span class="p">,</span> <span class="s2">"nine"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="s2">"ten"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:odd</span><span class="p">,</span> <span class="s2">"eleven"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="s2">"twelve"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:divisible_by_3</span><span class="p">,</span> <span class="s2">"twelve"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:divisible_by_4</span><span class="p">,</span> <span class="s2">"twelve"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_intersect</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="ss">:divisible_by_3</span><span class="p">)</span> <span class="c1"># => ["six",</span>
<span class="c1"># "twelve"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_intersect</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="ss">:divisible_by_3</span><span class="p">,</span> <span class="ss">:divisible_by_4</span><span class="p">)</span> <span class="c1"># => ["twelve"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_diff</span><span class="p">(</span><span class="ss">:divisible_by_3</span><span class="p">,</span> <span class="ss">:even</span><span class="p">)</span> <span class="c1"># => ["three", "nine"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_diff</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="ss">:divisible_by_3</span><span class="p">)</span> <span class="c1"># => ["four", "ten",</span>
<span class="c1"># "eight", "two"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_diff</span><span class="p">(</span><span class="ss">:even</span><span class="p">,</span> <span class="ss">:divisible_by_3</span><span class="p">,</span> <span class="ss">:divisible_by_4</span><span class="p">)</span> <span class="c1"># => ["ten", "two"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_union</span><span class="p">(</span><span class="ss">:divisible_by_3</span><span class="p">,</span> <span class="ss">:divisible_by_4</span><span class="p">)</span> <span class="c1"># => ["four", "six",</span>
<span class="c1"># "twelve", "three",</span>
<span class="c1"># "eight", "nine"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_union_store</span><span class="p">(</span><span class="ss">:all</span><span class="p">,</span> <span class="ss">:even</span><span class="p">,</span> <span class="ss">:odd</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_members</span><span class="p">(</span><span class="ss">:all</span><span class="p">)</span> <span class="c1"># => ["four", "eleven", "seven", "eight", "one",</span>
<span class="c1"># "six", "ten", "twelve", "three", "five", "nine",</span>
<span class="c1"># "two"]</span>
</pre></div>
<p>Again, I assume the results of these examples are pretty straightforward. <code>set_intersect()</code> returns the members in all listed sets, <code>set_diff()</code> subtracts the members of each successive set listed from the first set, and <code>set_union()</code> returns all members present in any of the listed sets. Note that order is significant in <code>set_diff()</code> due to the way it is defined. Finally, I showed a variant that stores the results instead of returning the entire set. Though I only showed <code>set_union_store()</code>, <code>set_intersect_store()</code> and <code>set_diff_store()</code> also exist.</p>
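<p>If the multi-set forms are hard to picture, Ruby's own <code>Set</code> gives the same answers and needs no server. This is a stand-in sketch, assuming only the membership shown in the listing above; <code>intersect_all()</code>, <code>diff_all()</code>, and <code>union_all()</code> are hypothetical helper names, not part of any Redis client.</p>

```ruby
require "set"

# The same membership as the Redis sets built in the example above.
even = Set.new(%w[two four six eight ten twelve])
by_3 = Set.new(%w[three six nine twelve])
by_4 = Set.new(%w[four eight twelve])

# Fold each operation across any number of sets, left to right.
def intersect_all(*sets)
  sets.reduce(:&)
end

def diff_all(*sets)
  sets.reduce(:-) # order matters: later sets are subtracted from the first
end

def union_all(*sets)
  sets.reduce(:|)
end

p intersect_all(even, by_3, by_4).to_a
p diff_all(even, by_3, by_4).to_a
p diff_all(by_3, even).to_a
p union_all(by_3, by_4).to_a
```

<p>Swapping the arguments to <code>diff_all()</code> changes the answer, which is the order sensitivity noted above.</p>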
<p>A typical usage for these methods is to store unique identifiers of records in the system by various categorical breakdowns. You can then use the set operations to find simple query results of those present in more than one category, in one category but not others, or those present in any category. This may be able to save you a trip to a more powerful but expensive tool, like SQL, for some queries.</p>
<p>To give an example of building queries with sets, let's look at a simple program. This code will load <a href="http://www.census.gov/genealogy/names/names_files.html">some first and last names from the U.S. Census Bureau</a> into a Redis database. We will then allow the user to make queries by name position (first or last), gender, popularity, and any leading prefix.</p>
<p>To keep the code simple, we're going to adopt a pretty crude definition of popularity. The files are already in order of rank, so we will just call the first third of the file popular, the second third average, and the rest uncommon.</p>
<p>We're also going to need to be a bit clever to handle prefix searches with just sets. Redis doesn't really have text search features, so we will need to build an index we can use. The idea is simple: given the name <code>"James"</code> we will add it to the sets <code>"prefix:j"</code>, <code>"prefix:ja"</code>, <code>"prefix:jam"</code>, <code>"prefix:jame"</code>, and <code>"prefix:james"</code>. Later, when we are answering queries, we can just add the set for the input the user gave us into our set intersection.</p>
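<p>Before the Redis version, here is the prefix idea in isolation as a pure-Ruby sketch. A <code>Hash</code> of <code>Set</code> objects stands in for the database, the <code>prefixes()</code> and <code>query()</code> helpers are hypothetical names of mine, and the four names with their genders are made-up sample data rather than anything from the census files.</p>

```ruby
require "set"

# Every leading prefix of a name, lowercased: "James" gives j, ja, jam, ...
def prefixes(name)
  word = name.downcase
  (1..word.length).map { |size| word[0, size] }
end

# A Hash of Sets stands in for the Redis keys; missing keys start empty.
index = Hash.new { |sets, key| sets[key] = Set.new }

# Invented sample data for illustration only.
sample = { "James" => "male", "Jamie" => "female",
           "Jane"  => "female", "John" => "male" }

sample.each do |name, gender|
  index["gender:#{gender}"] << name
  prefixes(name).each { |prefix| index["prefix:#{prefix}"] << name }
end

# Answer a query by intersecting the sets stored under each key.
def query(index, *keys)
  keys.map { |key| index.fetch(key) { Set.new } }.reduce(:&)
end

p prefixes("James")
p query(index, "prefix:jam").to_a
p query(index, "gender:female", "prefix:ja").to_a
```

<p>The cost of this trick is storage: each name lands in one set per letter, so the index trades space for cheap lookups.</p>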
<p>Here's the code for the full program:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"abbrev"</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="c1"># prepare the database</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="k">if</span> <span class="n">db</span><span class="o">.</span><span class="n">dbsize</span><span class="o">.</span><span class="n">zero?</span>
<span class="nb">puts</span> <span class="s2">"Loading names into the database..."</span>
<span class="n">started</span><span class="p">,</span> <span class="n">count</span> <span class="o">=</span> <span class="no">Time</span><span class="o">.</span><span class="n">now</span><span class="p">,</span> <span class="mi">0</span>
<span class="c1"># load names</span>
<span class="sx">%w[all.last female.first male.first]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">group</span><span class="o">|</span>
<span class="n">gender</span><span class="p">,</span> <span class="n">position</span> <span class="o">=</span> <span class="n">group</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s2">"."</span><span class="p">)</span>
<span class="n">category</span> <span class="o">=</span> <span class="p">(</span><span class="o">[</span><span class="n">position</span><span class="p">,</span> <span class="n">gender</span><span class="o">]</span> <span class="o">-</span> <span class="sx">%w[all]</span><span class="p">)</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="s2">":"</span><span class="p">)</span>
<span class="n">top_third</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">size</span><span class="p">(</span><span class="s2">"dist.</span><span class="si">#{</span><span class="n">group</span><span class="si">}</span><span class="s2">.txt"</span><span class="p">)</span> <span class="o">/</span> <span class="mi">3</span>
<span class="no">File</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s2">"dist.</span><span class="si">#{</span><span class="n">group</span><span class="si">}</span><span class="s2">.txt"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">names</span><span class="o">|</span>
<span class="n">names</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">given</span><span class="o">|</span>
<span class="nb">name</span> <span class="o">=</span> <span class="n">given</span><span class="o">[</span><span class="sr">/\w+/</span><span class="o">]</span>
<span class="n">db</span><span class="o">.</span><span class="n">pipelined</span> <span class="k">do</span> <span class="o">|</span><span class="n">commands</span><span class="o">|</span>
<span class="n">commands</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="s2">"position:</span><span class="si">#{</span><span class="n">position</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="nb">name</span><span class="p">)</span>
<span class="n">commands</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="s2">"gender:</span><span class="si">#{</span><span class="n">gender</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="nb">name</span><span class="p">)</span> <span class="p">\</span>
<span class="k">unless</span> <span class="n">gender</span> <span class="o">==</span> <span class="s2">"all"</span>
<span class="no">Abbrev</span><span class="o">.</span><span class="n">abbrev</span><span class="p">(</span><span class="nb">name</span><span class="o">.</span><span class="n">downcase</span><span class="p">)</span><span class="o">.</span><span class="n">keys</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">prefix</span><span class="o">|</span>
<span class="n">commands</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="s2">"prefix:</span><span class="si">#{</span><span class="n">prefix</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="nb">name</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">popularity</span> <span class="o">=</span> <span class="k">if</span> <span class="n">names</span><span class="o">.</span><span class="n">pos</span> <span class="o"><</span> <span class="n">top_third</span> <span class="k">then</span> <span class="s2">"popular"</span>
<span class="k">elsif</span> <span class="n">names</span><span class="o">.</span><span class="n">pos</span> <span class="o"><</span> <span class="n">top_third</span> <span class="o">*</span> <span class="mi">2</span> <span class="k">then</span> <span class="s2">"average"</span>
<span class="k">else</span> <span class="s2">"uncommon"</span>
<span class="k">end</span>
<span class="n">commands</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="s2">"popularity:</span><span class="si">#{</span><span class="n">category</span><span class="si">}</span><span class="s2">:</span><span class="si">#{</span><span class="n">popularity</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> <span class="nb">name</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">count</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="c1"># save and report</span>
<span class="n">db</span><span class="o">.</span><span class="n">save</span>
<span class="nb">puts</span> <span class="s2">"Loaded </span><span class="si">#{</span><span class="n">count</span><span class="si">}</span><span class="s2"> names in </span><span class="si">#{</span><span class="no">Time</span><span class="o">.</span><span class="n">now</span> <span class="o">-</span> <span class="n">started</span><span class="si">}</span><span class="s2"> seconds."</span>
<span class="nb">puts</span>
<span class="k">end</span>
<span class="c1"># perform queries</span>
<span class="n">position</span> <span class="o">=</span> <span class="sx">%w[first last]</span>
<span class="n">gender</span> <span class="o">=</span> <span class="sx">%w[male female]</span>
<span class="n">popularity</span> <span class="o">=</span> <span class="sx">%w[popular average uncommon]</span>
<span class="c1"># UI for entering queries</span>
<span class="k">def</span> <span class="nf">ask</span><span class="p">(</span><span class="n">choices</span><span class="p">)</span>
<span class="nb">print</span> <span class="o">[</span> <span class="n">choices</span><span class="o">[</span><span class="mi">0</span><span class="o">.</span><span class="n">.</span><span class="o">-</span><span class="mi">2</span><span class="o">].</span><span class="n">join</span><span class="p">(</span><span class="s2">", "</span><span class="p">),</span>
<span class="n">choices</span><span class="o">[-</span><span class="mi">1</span><span class="o">]</span> <span class="o">].</span><span class="n">join</span><span class="p">(</span><span class="s2">", or "</span><span class="p">)</span><span class="o">.</span><span class="n">capitalize</span> <span class="o">+</span> <span class="s2">"? "</span>
<span class="n">choice</span> <span class="o">=</span> <span class="nb">gets</span><span class="o">.</span><span class="n">to_s</span><span class="o">.</span><span class="n">strip</span>
<span class="n">choice</span><span class="o">.</span><span class="n">empty?</span> <span class="p">?</span> <span class="kp">nil</span> <span class="p">:</span> <span class="no">Abbrev</span><span class="o">.</span><span class="n">abbrev</span><span class="p">(</span><span class="n">choices</span><span class="p">)</span><span class="o">[</span><span class="n">choice</span><span class="o">]</span>
<span class="k">end</span>
<span class="n">query</span> <span class="o">=</span> <span class="o">[</span> <span class="o">]</span>
<span class="k">if</span> <span class="n">choice</span> <span class="o">=</span> <span class="n">ask</span><span class="p">(</span><span class="n">position</span><span class="p">)</span>
<span class="n">query</span> <span class="o"><<</span> <span class="s2">"position:</span><span class="si">#{</span><span class="n">choice</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">unless</span> <span class="n">choice</span> <span class="o">==</span> <span class="s2">"last"</span>
<span class="k">if</span> <span class="n">choice</span> <span class="o">=</span> <span class="n">ask</span><span class="p">(</span><span class="n">gender</span><span class="p">)</span>
<span class="n">query</span> <span class="o"><<</span> <span class="s2">"gender:</span><span class="si">#{</span><span class="n">choice</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">if</span> <span class="n">choice</span> <span class="o">=</span> <span class="n">ask</span><span class="p">(</span><span class="n">popularity</span><span class="p">)</span>
<span class="n">query</span> <span class="o"><<</span> <span class="s2">"popularity:</span><span class="si">#{</span><span class="n">query</span><span class="o">.</span><span class="n">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">f</span><span class="o">|</span> <span class="n">f</span><span class="o">[</span><span class="sr">/\w+\z/</span><span class="o">]</span> <span class="si">}</span><span class="s2">.join(':')}:</span><span class="si">#{</span><span class="n">choice</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="nb">print</span> <span class="s2">"Prefix? "</span>
<span class="n">prefix</span> <span class="o">=</span> <span class="nb">gets</span><span class="o">.</span><span class="n">to_s</span><span class="o">.</span><span class="n">strip</span>
<span class="k">unless</span> <span class="n">prefix</span><span class="o">.</span><span class="n">empty?</span>
<span class="n">query</span> <span class="o"><<</span> <span class="s2">"prefix:</span><span class="si">#{</span><span class="n">prefix</span><span class="o">.</span><span class="n">downcase</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="nb">puts</span>
<span class="c1"># execute query and show stats</span>
<span class="nb">puts</span> <span class="s2">"Running query..."</span>
<span class="n">width</span> <span class="o">=</span> <span class="n">query</span><span class="o">.</span><span class="n">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">set</span><span class="o">|</span> <span class="n">set</span><span class="o">.</span><span class="n">size</span> <span class="p">}</span><span class="o">.</span><span class="n">max</span>
<span class="n">query</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">set</span><span class="o">|</span>
<span class="nb">puts</span> <span class="s2">" %</span><span class="si">#{</span><span class="n">width</span><span class="si">}</span><span class="s2">s: </span><span class="si">#{</span><span class="n">db</span><span class="o">.</span><span class="n">set_count</span><span class="p">(</span><span class="n">set</span><span class="p">)</span><span class="si">}</span><span class="s2"> names"</span> <span class="o">%</span> <span class="n">set</span>
<span class="k">end</span>
<span class="n">started</span> <span class="o">=</span> <span class="no">Time</span><span class="o">.</span><span class="n">now</span>
<span class="nb">puts</span>
<span class="nb">puts</span><span class="p">(</span> <span class="n">query</span><span class="o">.</span><span class="n">empty?</span> <span class="p">?</span> <span class="n">db</span><span class="o">.</span><span class="n">set_union</span><span class="p">(</span><span class="s2">"position:first"</span><span class="p">,</span> <span class="s2">"position:last"</span><span class="p">)</span> <span class="p">:</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_intersect</span><span class="p">(</span><span class="o">*</span><span class="n">query</span><span class="p">)</span> <span class="p">)</span>
<span class="nb">puts</span>
<span class="nb">puts</span> <span class="s2">"</span><span class="si">#{</span><span class="no">Time</span><span class="o">.</span><span class="n">now</span> <span class="o">-</span> <span class="n">started</span><span class="si">}</span><span class="s2"> seconds."</span>
</pre></div>
<p>Using that code, we can load the database and search for names like mine:</p>
<pre><code>$ ruby names.rb
Loading names into the database...
Loaded 94293 names in 58.25652 seconds.
First, or last? f
Male, or female? m
Popular, average, or uncommon? p
Prefix? Ja
Running query...
position:first: 5163 names
gender:male: 1219 names
popularity:first:male:popular: 406 names
prefix:ja: 501 names
JARED
JACKIE
JAVIER
JAY
JAIME
JAMES
JACK
JAMIE
JACOB
JASON
0.003345 seconds.
</code></pre>
<p>As you can see, a little setup work really pays us back. Redis can chew through those multi-set queries in no time at all.</p>
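<p>If it helps to picture what the multi-set query is doing, here is a minimal plain-Ruby sketch. It assumes each set is just an <code>Array</code> of names; the sample data and the <code>intersect()</code> helper are made up for illustration and are not part of the Redis client.</p>

```ruby
# Made-up miniature sets; the real ones hold thousands of names.
SAMPLE_SETS = {
  "position:first" => %w[JAMES JACK MARY JAY],
  "gender:male"    => %w[JAMES JACK JAY JOHN],
  "prefix:ja"      => %w[JAMES JACK JAY JACKSON]
}

# Hypothetical stand-in for db.set_intersect(*query): keep only the
# members that appear in every named set.
def intersect(sets, *names)
  names.map { |name| sets.fetch(name) }.inject(:&)
end

query = %w[position:first gender:male prefix:ja]
p intersect(SAMPLE_SETS, *query)  # => ["JAMES", "JACK", "JAY"]
```

<p>The point of the setup work is that each criterion becomes its own set, so a query is nothing more than the list of set names to intersect.</p>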
<p>Since we've now seen all of the Redis value types, it's worth a quick mention that you can ask the database for the <code>type()</code> of a given key. The response will be one of: <code>"none"</code> (key not set), <code>"string"</code>, <code>"list"</code>, or <code>"set"</code>.</p>
<h4>Sorting</h4>
<p>As soon as you have collections of data, you will want tools for controlling the order of those collections. This is especially important with things like sets that don't have an order until you define one.</p>
<p>Redis has a single method for this, called <code>sort()</code>, but it's by far the most complex and powerful method in the entire API. Let's examine what it can do piece by piece. First, here is a basic sort:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"1"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"2"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"10"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"11"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="ss">:nums</span><span class="p">)</span> <span class="c1"># => ["1", "2", "10", "11"]</span>
</pre></div>
<p>That's easy enough to understand. I built a set (note that <code>sort()</code> works with lists too) and asked Redis to sort the members. It did.</p>
<p>Here's a question though: did the order surprise you? Values are generally considered <code>String</code>s in Redis, but those numbers weren't sorted as <code>String</code>s. This is another one of those exceptions I've mentioned before, where a value is sometimes given a numeric meaning. Redis goes one step further with <code>sort()</code> though and even recognizes <code>Float</code>s:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:floats</span><span class="p">,</span> <span class="s2">"3"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:floats</span><span class="p">,</span> <span class="s2">"3.02"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:floats</span><span class="p">,</span> <span class="s2">"3.14"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:floats</span><span class="p">,</span> <span class="s2">"3.2"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:floats</span><span class="p">,</span> <span class="s2">"4"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="ss">:floats</span><span class="p">)</span> <span class="c1"># => ["3", "3.02", "3.14", "3.2", "4"]</span>
</pre></div>
<p>Now, if you want a <code>String</code> ordering, you can ask for it. You can also reverse either order. Here's how those options work in our original example:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"1"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"2"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"10"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"11"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="ss">:order</span> <span class="o">=></span> <span class="s2">"ALPHA"</span><span class="p">)</span> <span class="c1"># => ["1", "10", "11", "2"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="ss">:order</span> <span class="o">=></span> <span class="s2">"ALPHA DESC"</span><span class="p">)</span> <span class="c1"># => ["2", "11", "10", "1"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="ss">:order</span> <span class="o">=></span> <span class="s2">"DESC"</span><span class="p">)</span> <span class="c1"># => ["11", "10", "2", "1"]</span>
</pre></div>
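<p>To see why the two orderings differ, here is a rough plain-Ruby equivalent that needs no server. The helper names <code>numeric_order()</code> and <code>alpha_order()</code> are mine, not part of the Redis client, and this only mimics the results shown above.</p>

```ruby
# Compare members as numbers, which is what sort() did by default above.
def numeric_order(members)
  members.sort_by { |member| member.to_f }
end

# Compare members as plain Strings, like :order => "ALPHA".
def alpha_order(members)
  members.sort
end

nums = %w[1 2 10 11]
p numeric_order(nums)          # => ["1", "2", "10", "11"]
p alpha_order(nums)            # => ["1", "10", "11", "2"]
p alpha_order(nums).reverse    # => ["2", "11", "10", "1"]
p numeric_order(nums).reverse  # => ["11", "10", "2", "1"]
```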
<p>The <code>sort()</code> method also supports a limit with offset that can be used to fetch a subset of entries and also handle pagination:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"1"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"2"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"10"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="s2">"11"</span><span class="p">)</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="ss">:limit</span> <span class="o">=></span> <span class="o">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="o">]</span><span class="p">)</span> <span class="c1"># => ["1", "2", "10"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="ss">:limit</span> <span class="o">=></span> <span class="o">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="o">]</span><span class="p">)</span> <span class="c1"># => ["11"]</span>
</pre></div>
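<p>For pagination, the only real work is turning a page number into that offset and count pair. Here is a small sketch; <code>limit_for()</code> is a hypothetical helper of my own, and the <code>Array</code> stands in for the sorted members so the code runs without a server.</p>

```ruby
# Hypothetical helper: turn a 1-based page number into the
# [offset, count] pair that the :limit option expects.
def limit_for(page, per_page)
  [(page - 1) * per_page, per_page]
end

sorted = %w[1 2 10 11]  # the numerically sorted members from above

p limit_for(1, 3)           # => [0, 3]
p limit_for(2, 3)           # => [3, 3]
p sorted[*limit_for(1, 3)]  # => ["1", "2", "10"]
p sorted[*limit_for(2, 3)]  # => ["11"]
```

<p>With a live connection, you would pass the pair straight through as <code>:limit => limit_for(page, per_page)</code>.</p>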
<p>There are two more options you can set with <code>sort()</code>. The first option allows you to use a key lookup to find the actual value to order by. Thus, if you had a set of ID's, but actually wanted to <code>sort()</code> by an attribute associated with those ID's, you could look up the attribute. Let me show you what I mean:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="sx">%w[one two three four]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">num</span><span class="o">|</span>
<span class="nb">id</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">incr</span><span class="p">(</span><span class="s2">"global:num_id"</span><span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"num:</span><span class="si">#{</span><span class="nb">id</span><span class="si">}</span><span class="s2">:word"</span><span class="o">]</span> <span class="o">=</span> <span class="n">num</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"num:</span><span class="si">#{</span><span class="nb">id</span><span class="si">}</span><span class="s2">:word_size"</span><span class="o">]</span> <span class="o">=</span> <span class="n">num</span><span class="o">.</span><span class="n">size</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="nb">id</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="ss">:by</span> <span class="o">=></span> <span class="s2">"num:*:word_size"</span><span class="p">)</span> <span class="c1"># => ["1", "2", "4", "3"]</span>
</pre></div>
<p>I made the ID's match up with the numbers here just to make the example easy to follow, but notice how those ID's were actually ordered by their <code>"num:ID:word_size"</code> key. As you can see, <code>sort()</code> replaced the <code>*</code> in my key name with the actual member of the set, which was the ID in this case.</p>
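<p>The substitution can be sketched in plain Ruby, with a <code>Hash</code> standing in for the database. The helpers <code>key_for()</code> and <code>sort_by_pattern()</code> are hypothetical names of my own, and breaking ties by original position is a choice I made for this sketch, not a statement about how Redis resolves them.</p>

```ruby
# A Hash standing in for the keys created in the example above.
WORD_SIZES = {
  "num:1:word_size" => "3",  # one
  "num:2:word_size" => "3",  # two
  "num:3:word_size" => "5",  # three
  "num:4:word_size" => "4"   # four
}

# Replace the * in the pattern with a member to get the key to look up.
def key_for(pattern, member)
  pattern.sub("*", member)
end

# Order members by the numeric value stored under their looked-up key.
# Ties keep their original relative position (an assumption of this sketch).
def sort_by_pattern(members, store, pattern)
  members.each_with_index
         .sort_by { |member, i| [store[key_for(pattern, member)].to_f, i] }
         .map(&:first)
end

p key_for("num:*:word_size", "3")                                # => "num:3:word_size"
p sort_by_pattern(%w[1 2 3 4], WORD_SIZES, "num:*:word_size")    # => ["1", "2", "4", "3"]
```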
<p>The final feature lets us take that one step further. Just as we can fetch a key for the order, we can also fetch a key for the result. That allows us to return not just the ID, but the actual data. For example:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="sx">%w[one two three four]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">num</span><span class="o">|</span>
<span class="nb">id</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">incr</span><span class="p">(</span><span class="s2">"global:num_id"</span><span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"num:</span><span class="si">#{</span><span class="nb">id</span><span class="si">}</span><span class="s2">:word"</span><span class="o">]</span> <span class="o">=</span> <span class="n">num</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"num:</span><span class="si">#{</span><span class="nb">id</span><span class="si">}</span><span class="s2">:word_size"</span><span class="o">]</span> <span class="o">=</span> <span class="n">num</span><span class="o">.</span><span class="n">size</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="nb">id</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span> <span class="ss">:nums</span><span class="p">,</span> <span class="ss">:by</span> <span class="o">=></span> <span class="s2">"num:*:word_size"</span><span class="p">,</span>
<span class="ss">:get</span> <span class="o">=></span> <span class="s2">"num:*:word"</span> <span class="p">)</span> <span class="c1"># => ["one", "two", "four", "three"]</span>
</pre></div>
<p>So <code>sort()</code> fetched the <code>word_size</code> keys to build the ordering and then fetched the <code>word</code> keys as the actual result. If you want, you can even fetch multiple keys for each result:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="sx">%w[one two three four]</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">num</span><span class="o">|</span>
<span class="nb">id</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">incr</span><span class="p">(</span><span class="s2">"global:num_id"</span><span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"num:</span><span class="si">#{</span><span class="nb">id</span><span class="si">}</span><span class="s2">:word"</span><span class="o">]</span> <span class="o">=</span> <span class="n">num</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"num:</span><span class="si">#{</span><span class="nb">id</span><span class="si">}</span><span class="s2">:word_size"</span><span class="o">]</span> <span class="o">=</span> <span class="n">num</span><span class="o">.</span><span class="n">size</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_add</span><span class="p">(</span><span class="ss">:nums</span><span class="p">,</span> <span class="nb">id</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">db</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span> <span class="ss">:nums</span><span class="p">,</span> <span class="ss">:by</span> <span class="o">=></span> <span class="s2">"num:*:word_size"</span><span class="p">,</span>
<span class="ss">:get</span> <span class="o">=></span> <span class="sx">%w[ num:*:word</span>
<span class="sx"> num:*:word_size ]</span> <span class="p">)</span> <span class="c1"># => ["one", "3",</span>
<span class="c1"># "two", "3",</span>
<span class="c1"># "four", "4",</span>
<span class="c1"># "three", "5"]</span>
</pre></div>
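<p>Notice that the reply comes back as one flat <code>Array</code>. If you would rather have one entry per member, you can regroup it on the Ruby side. This is just a sketch: <code>group_results()</code> is a helper name I made up, and the data is copied from the output above so the code runs without a server.</p>

```ruby
# Hypothetical helper: regroup a flat reply into one Array per member,
# given how many :get patterns were used.
def group_results(flat, patterns_count)
  flat.each_slice(patterns_count).to_a
end

flat = ["one", "3", "two", "3", "four", "4", "three", "5"]

p group_results(flat, 2)
# => [["one", "3"], ["two", "3"], ["four", "4"], ["three", "5"]]

# The sizes are still Strings, so convert them if you need numbers.
p group_results(flat, 2).map { |word, size| [word, Integer(size)] }
# => [["one", 3], ["two", 3], ["four", 4], ["three", 5]]
```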
<p><code>sort()</code> is the power tool of collection ordering and fetching. I've shown most of the options separately to make things easier to understand, but you can combine them as needed to order, limit, and even look up your data.</p>
<p>The collection types add quite a bit of flexibility to simple key-value storage. You can use these tools to group keys and process them by various criteria. This makes Redis usable in a wider range of scenarios than some simple key-value stores.</p>
James Edward Gray II
Using Redis as a Key-Value Store
tag:graysoftinc.com,2009-09-15:/posts/89
2014-04-18T22:27:14Z
This article covers basic Redis usage. Most key-value stores have these features in common.
<p><em>[<strong>Update</strong>: though all of the techniques I show here still apply, many methods of the Redis gem have changed names to match the actual <a href="http://redis.io/commands">Redis commands</a> they call.]</em></p>
<p>Redis is first and foremost a server providing key-value storage. As such, the primary features of any client library are for connecting to the server and manipulating those key-value pairs.</p>
<h4>Connecting to the Server</h4>
<p>Connecting to the Redis server can be as simple as <code>Redis.new</code>, thanks to some defaults in both the server and Ezra's Ruby client library for talking to that server. I won't pass any options to the constructor calls below, but you can use any of the following as needed:</p>
<ul>
<li>
<code>:host</code> if you need to connect to an external host instead of the default 127.0.0.1</li>
<li>
<code>:port</code> if you need to use something other than the default port of 6379</li>
<li>
<code>:password</code> if you configured Redis to require a password on connection</li>
<li>
<code>:db</code> if you want to select one of the multiple configured databases, other than the default of 0 (databases are identified by a zero-based index)</li>
<li>
<code>:timeout</code> if you want a different timeout for Redis communication than the default of 5 seconds</li>
<li>
<code>:logger</code> if you want the library to log activity as it works</li>
</ul><h4>Getting and Setting Keys</h4>
<p>Once connected, Redis can be used as an in-memory key-value store, much like memcached. The client library exposes this key getting and setting functionality just like a Ruby <code>Hash</code>:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:my_key</span><span class="o">]</span> <span class="c1"># => nil</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:my_key</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"my_value"</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:my_key</span><span class="o">]</span> <span class="c1"># => "my_value"</span>
<span class="n">db</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="ss">:my_key</span><span class="p">)</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:my_key</span><span class="o">]</span> <span class="c1"># => nil</span>
</pre></div>
<p>Notice that we can read, write, and delete key-value pairs just as we could with a <code>Hash</code>, using <code>[]</code>, <code>[]=</code>, and <code>delete()</code> respectively. If we look for a key that isn't in the database, we get <code>nil</code> just as Ruby would give us for a <code>Hash</code>.</p>
<p>There are two other methods for slightly more advanced setting operations. First, <code>getset()</code> can be used to update a value while also retrieving its previous value. There's also a <code>set_unless_exists()</code> method that will not override an existing value. Here are those operations in action:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:adv</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"old"</span>
<span class="n">db</span><span class="o">.</span><span class="n">getset</span><span class="p">(</span><span class="ss">:adv</span><span class="p">,</span> <span class="s2">"new"</span><span class="p">)</span> <span class="c1"># => "old"</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:adv</span><span class="o">]</span> <span class="c1"># => "new"</span>
<span class="n">db</span><span class="o">.</span><span class="n">set_unless_exists</span><span class="p">(</span><span class="ss">:adv</span><span class="p">,</span> <span class="s2">"lost"</span><span class="p">)</span> <span class="c1"># => false</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:adv</span><span class="o">]</span> <span class="c1"># => "new"</span>
</pre></div>
<p>Other <code>Hash</code>-like operations are supported. For example, you can check for the existence of a key, count the number of keys in the database, fetch a list of keys matching a pattern, or even get random keys:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:key1</span><span class="o">]</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:key2</span><span class="o">]</span> <span class="o">=</span> <span class="mi">2</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:key3</span><span class="o">]</span> <span class="o">=</span> <span class="mi">3</span>
<span class="n">db</span><span class="o">[</span><span class="ss">:other_key</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"other"</span>
<span class="n">db</span><span class="o">.</span><span class="n">key?</span><span class="p">(</span><span class="ss">:key3</span><span class="p">)</span> <span class="c1"># => 1</span>
<span class="n">db</span><span class="o">.</span><span class="n">key?</span><span class="p">(</span><span class="ss">:key4</span><span class="p">)</span> <span class="c1"># => false</span>
<span class="n">db</span><span class="o">.</span><span class="n">dbsize</span> <span class="c1"># => 4</span>
<span class="n">db</span><span class="o">.</span><span class="n">keys</span><span class="p">(</span><span class="s2">"key*"</span><span class="p">)</span> <span class="c1"># => ["key2", "key3", "key1"]</span>
<span class="n">db</span><span class="o">.</span><span class="n">randkey</span> <span class="c1"># => "key2"</span>
<span class="n">db</span><span class="o">.</span><span class="n">randkey</span> <span class="c1"># => "other_key"</span>
</pre></div>
<p>Note that the pattern passed to <code>keys()</code> is similar to a file glob. You can use a <code>?</code> in the pattern to mean any one character and <code>*</code> to match any run of characters. You can also use <code>\\</code> to escape these special characters and match them literally.</p>
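<p>If you want to experiment with these patterns without a server, Ruby's own <code>File.fnmatch()</code> follows similar glob rules for <code>?</code>, <code>*</code>, and backslash escapes. This is only an approximation I'm using for illustration (the <code>glob_keys()</code> helper is made up, and file globs have extra rules about dots and slashes that don't apply to Redis keys):</p>

```ruby
# Hypothetical helper: select the keys a glob-style pattern would match.
def glob_keys(keys, pattern)
  keys.select { |key| File.fnmatch(pattern, key) }
end

keys = %w[key1 key2 key3 other_key]

p glob_keys(keys, "key*")  # => ["key1", "key2", "key3"]
p glob_keys(keys, "key?")  # => ["key1", "key2", "key3"]
p glob_keys(keys, "*key")  # => ["other_key"]

# An escaped * only matches a literal asterisk.
p File.fnmatch("key\\*", "key*")  # => true
p File.fnmatch("key\\*", "key1")  # => false
```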
<p>It's worth noting that all Redis keys and values are pretty much <code>String</code>s:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">[</span><span class="no">Object</span><span class="o">.</span><span class="n">new</span><span class="o">]</span> <span class="o">=</span> <span class="no">Object</span><span class="o">.</span><span class="n">new</span>
<span class="n">k</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">keys</span><span class="p">(</span><span class="s2">"#*"</span><span class="p">)</span><span class="o">.</span><span class="n">first</span> <span class="c1"># => "#<Object:0x301894>"</span>
<span class="n">db</span><span class="o">[</span><span class="n">k</span><span class="o">]</span> <span class="c1"># => "#<Object:0x301858>"</span>
</pre></div>
<p>There are some minor exceptions where values can sometimes be treated as numbers. You can also have collections in values, but the collections hold the typical <code>String</code>s, with sometimes numeric meaning. I'll talk more about these cases later.</p>
<h4>Key Expiration</h4>
<p>Redis supports setting expiration times on stored keys. When that time expires, the key will be purged. This is very useful when using Redis as a cache. You can set an expiration time by calling the <code>expire()</code> method, or you can just use a different version of the key setting that includes the timeout:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="ss">:cached</span><span class="p">,</span> <span class="s2">"short lived"</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">4</span><span class="o">.</span><span class="n">times</span> <span class="k">do</span>
<span class="nb">sleep</span> <span class="mi">1</span>
<span class="nb">puts</span> <span class="s2">"db[:cached] is </span><span class="si">#{</span><span class="n">db</span><span class="o">[</span><span class="ss">:cached</span><span class="o">].</span><span class="n">inspect</span><span class="si">}</span><span class="s2"> at </span><span class="si">#{</span><span class="no">Time</span><span class="o">.</span><span class="n">now</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="c1"># >> db[:cached] is "short lived" at Sat Sep 05 14:01:07 -0500 2009</span>
<span class="c1"># >> db[:cached] is "short lived" at Sat Sep 05 14:01:08 -0500 2009</span>
<span class="c1"># >> db[:cached] is "short lived" at Sat Sep 05 14:01:09 -0500 2009</span>
<span class="c1"># >> db[:cached] is nil at Sat Sep 05 14:01:10 -0500 2009</span>
</pre></div>
<p>Note that a write operation against a key with an expiration timeout set, a <em>volatile key</em> in Redis parlance, clears the timeout. You can use the <code>ttl()</code> method if you need to examine the <em>time to live</em> for a key. There's also a matching <code>get()</code> method to go with the <code>set()</code> I used above, though it's just an alias for <code>[]</code> and has nothing to do with timeouts.</p>
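<p>To make the volatile key rules concrete, here is a toy, in-memory model in plain Ruby. It is only a sketch of the behavior described above, not how Redis or redis-rb is implemented. The <code>ToyExpiringStore</code> class and its injectable clock are made up for illustration.</p>

```ruby
# A toy model of volatile keys. This is NOT Redis or redis-rb code, just an
# illustration of the rules described above.
class ToyExpiringStore
  def initialize(clock = -> { Time.now.to_f })
    @clock   = clock  # injectable so the example can run without sleeping
    @values  = {}
    @expires = {}     # key => absolute expiration time
  end

  # Every write replaces the value and clears any old timeout. A new timeout
  # is only set when seconds is given.
  def set(key, value, seconds = nil)
    key = key.to_s
    @values[key] = value.to_s
    @expires.delete(key)
    @expires[key] = @clock.call + seconds if seconds
    value
  end

  def get(key)
    key = key.to_s
    purge(key)
    @values[key]
  end

  # Seconds left before the key is purged, or nil when no timeout is set.
  def ttl(key)
    key = key.to_s
    purge(key)
    @expires[key] && (@expires[key] - @clock.call)
  end

  private

  def purge(key)
    deadline = @expires[key]
    return unless deadline && @clock.call >= deadline
    @values.delete(key)
    @expires.delete(key)
  end
end

now   = 0.0
store = ToyExpiringStore.new(-> { now })
store.set(:cached, "short lived", 3)
store.ttl(:cached)               # => 3.0
now = 2.0
store.get(:cached)               # => "short lived"
store.set(:cached, "rewritten")  # a plain write clears the timeout
now = 10.0
store.get(:cached)               # => "rewritten"
```

<p>The fake clock stands in for the <code>sleep</code> calls in the real example, so you can step through the timeline by hand.</p>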
<h4>Counters</h4>
<p>Redis supports some other interesting operations on simple keys. For example, you can use the atomic <code>incr()</code> operation to manage globally unique IDs:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="mi">3</span><span class="o">.</span><span class="n">times</span> <span class="k">do</span>
<span class="nb">fork</span> <span class="k">do</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">ids</span> <span class="o">=</span> <span class="nb">Array</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> <span class="p">{</span> <span class="n">db</span><span class="o">.</span><span class="n">incr</span><span class="p">(</span><span class="s2">"global:next_user_id"</span><span class="p">)</span> <span class="p">}</span>
<span class="nb">puts</span> <span class="s2">"</span><span class="si">#{</span><span class="no">Process</span><span class="o">.</span><span class="n">pid</span><span class="si">}</span><span class="s2">: </span><span class="si">#{</span><span class="n">ids</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="s1">', '</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="no">Process</span><span class="o">.</span><span class="n">waitall</span>
<span class="c1"># >> 1148: 1, 3, 6, 9, 12, 15, 17, 22, 25, 27</span>
<span class="c1"># >> 1147: 4, 8, 11, 13, 16, 18, 20, 23, 26, 29</span>
<span class="c1"># >> 1149: 2, 5, 7, 10, 14, 19, 21, 24, 28, 30</span>
</pre></div>
<p>This is one of the exceptions I mentioned earlier where Redis will try to treat a value as a number. In this case an <code>Integer</code> is expected and a <code>Float</code> will be truncated. If the key holds non-numeric content, the value is set to <code>"0"</code> and then modified as requested. That's why you can start with a key that doesn't exist, as I did above.</p>
<p>There is a matching <code>decr()</code> operation. You can also choose to pass an <code>Integer</code> as the second argument to these methods to raise or lower the count by that amount.</p>
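<p>The coercion rules above can be sketched in plain Ruby with a <code>Hash</code> standing in for the server. The <code>toy_incr()</code> and <code>toy_decr()</code> helpers are hypothetical names, and this only models the behavior as I've described it. Newer Redis releases may reject non-integer values with an error instead, so check how your server behaves.</p>

```ruby
# A toy model of the counter rules described above, using a plain Hash in
# place of a Redis server. The helper names here are hypothetical.
def toy_incr(store, key, by = 1)
  # String#to_i gives 0 for missing or non-numeric content and drops any
  # fractional part, which mirrors the coercion described above.
  store[key] = (store[key].to_s.to_i + by).to_s
  store[key].to_i
end

def toy_decr(store, key, by = 1)
  toy_incr(store, key, -by)
end

store = {"visits" => "41", "ratio" => "3.7", "name" => "JEG2"}
toy_incr(store, "visits")       # => 42
toy_incr(store, "ratio")        # => 4  ("3.7" was truncated to 3 first)
toy_incr(store, "name")         # => 1  (non-numeric content counts as 0)
toy_incr(store, "missing", 10)  # => 10 (a missing key also starts at 0)
toy_decr(store, "visits", 2)    # => 40
store["visits"]                 # => "40" (still stored as a String)
```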
<h4>Getting and Setting Multiple Keys at Once</h4>
<p>Another interesting feature is the ability to fetch more than one key at a time:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="n">db</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"user:1:username"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"JEG2"</span>
<span class="n">db</span><span class="o">[</span><span class="s2">"user:1:password"</span><span class="o">]</span> <span class="o">=</span> <span class="s2">"secret"</span>
<span class="n">db</span><span class="o">.</span><span class="n">mget</span><span class="p">(</span><span class="s2">"user:1:username"</span><span class="p">,</span> <span class="s2">"user:1:password"</span><span class="p">)</span> <span class="c1"># => ["JEG2", "secret"]</span>
</pre></div>
<p>We can tie the counter and multiple get features together to do some basic object storage inside Redis:</p>
<div class="highlight highlight-ruby"><pre><span class="c1">#!/usr/bin/env ruby -wKU</span>
<span class="nb">require</span> <span class="s2">"redis"</span>
<span class="no">DB</span> <span class="o">=</span> <span class="no">Redis</span><span class="o">.</span><span class="n">new</span>
<span class="k">class</span> <span class="nc">User</span>
<span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="nb">id</span> <span class="o">=</span> <span class="kp">nil</span><span class="p">)</span>
<span class="vi">@id</span> <span class="o">=</span> <span class="nb">id</span>
<span class="vi">@fields</span> <span class="o">=</span> <span class="no">Hash</span><span class="o">.</span><span class="n">new</span>
<span class="nb">load</span> <span class="k">if</span> <span class="vi">@id</span>
<span class="k">end</span>
<span class="kp">attr_reader</span> <span class="ss">:id</span>
<span class="k">def</span> <span class="nf">method_missing</span><span class="p">(</span><span class="n">meth</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">&</span><span class="n">blk</span><span class="p">)</span>
<span class="k">if</span> <span class="n">meth</span><span class="o">.</span><span class="n">to_s</span> <span class="o">=~</span> <span class="sr">/\A(\w+)=/</span>
<span class="vi">@fields</span><span class="o">[</span><span class="vg">$1</span><span class="o">]</span> <span class="o">=</span> <span class="n">args</span><span class="o">.</span><span class="n">first</span>
<span class="k">else</span>
<span class="vi">@fields</span><span class="o">[</span><span class="n">meth</span><span class="o">.</span><span class="n">to_s</span><span class="o">]</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">load</span>
<span class="n">keys</span> <span class="o">=</span> <span class="no">DB</span><span class="o">.</span><span class="n">keys</span><span class="p">(</span><span class="s2">"user:</span><span class="si">#{</span><span class="vi">@id</span><span class="si">}</span><span class="s2">:*"</span><span class="p">)</span>
<span class="n">values</span> <span class="o">=</span> <span class="no">DB</span><span class="o">.</span><span class="n">mget</span><span class="p">(</span><span class="o">*</span><span class="n">keys</span><span class="p">)</span>
<span class="vi">@fields</span> <span class="o">=</span> <span class="no">Hash</span><span class="o">[*</span><span class="n">keys</span><span class="o">.</span><span class="n">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">k</span><span class="o">|</span> <span class="n">k</span><span class="o">[</span><span class="sr">/\w+\z/</span><span class="o">]</span> <span class="p">}</span><span class="o">.</span><span class="n">zip</span><span class="p">(</span><span class="n">values</span><span class="p">)</span><span class="o">.</span><span class="n">flatten</span><span class="o">]</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">save</span>
<span class="vi">@id</span> <span class="o">||=</span> <span class="no">DB</span><span class="o">.</span><span class="n">incr</span><span class="p">(</span><span class="s2">"global:next_user_id"</span><span class="p">)</span>
<span class="no">DB</span><span class="o">.</span><span class="n">pipelined</span> <span class="k">do</span> <span class="o">|</span><span class="n">commands</span><span class="o">|</span>
<span class="vi">@fields</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">k</span><span class="p">,</span> <span class="n">v</span><span class="o">|</span>
<span class="n">commands</span><span class="o">[</span><span class="s2">"user:</span><span class="si">#{</span><span class="vi">@id</span><span class="si">}</span><span class="s2">:</span><span class="si">#{</span><span class="n">k</span><span class="si">}</span><span class="s2">"</span><span class="o">]</span> <span class="o">=</span> <span class="n">v</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">inspect</span>
<span class="s2">"<#User:</span><span class="si">#{</span><span class="vi">@id</span><span class="si">}</span><span class="s2"> </span><span class="si">#{</span><span class="vi">@fields</span><span class="o">.</span><span class="n">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">k</span><span class="p">,</span> <span class="n">v</span><span class="o">|</span> <span class="s2">"</span><span class="si">#{</span><span class="n">k</span><span class="si">}</span><span class="s2">:</span><span class="si">#{</span><span class="n">v</span><span class="o">.</span><span class="n">inspect</span><span class="si">}</span><span class="s2">"</span> <span class="si">}</span><span class="s2">.join(' ')}>"</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="no">User</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="c1"># => <#User:1 username:"JEG2" password:"secret"></span>
<span class="n">new_guy</span> <span class="o">=</span> <span class="no">User</span><span class="o">.</span><span class="n">new</span>
<span class="n">new_guy</span><span class="o">.</span><span class="n">username</span> <span class="o">=</span> <span class="s2">"New Guy"</span>
<span class="n">new_guy</span><span class="o">.</span><span class="n">password</span> <span class="o">=</span> <span class="s2">"123"</span>
<span class="n">new_guy</span><span class="o">.</span><span class="n">save</span>
<span class="no">User</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">new_guy</span><span class="o">.</span><span class="n">id</span><span class="p">)</span> <span class="c1"># => <#User:31 username:"New Guy" password:"123"></span>
</pre></div>
<p>I snuck another feature into my implementation of <code>save()</code> for that example: pipelined commands. If you're going to issue a bunch of commands real quick, as I did with the field saves in this case, you can <em>pipeline</em> them. This queues them up locally and then fires them all at the Redis server as your block exits. This can make those batch operations a little more efficient.</p>
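<p>If it helps to picture the queueing, here is a toy model of pipelining in plain Ruby. <code>ToyClient</code> and <code>ToyPipeline</code> are made-up names, a <code>Hash</code> stands in for the server, and the round trip counter is just my way of showing where the savings come from. It is a sketch of the idea, not how redis-rb is written.</p>

```ruby
# A toy model of pipelining. ToyClient and ToyPipeline are made-up names; a
# real client talks to a server over a socket instead of filling a Hash.
class ToyPipeline
  attr_reader :queue

  def initialize
    @queue = []
  end

  # Writes are only recorded locally. Nothing is sent yet.
  def []=(key, value)
    @queue << [key.to_s, value.to_s]
  end
end

class ToyClient
  attr_reader :data, :round_trips

  def initialize
    @data        = {}
    @round_trips = 0
  end

  # An ordinary write costs one round trip per command.
  def []=(key, value)
    send_batch([[key.to_s, value.to_s]])
  end

  # Pipelined writes are queued in the block, then sent together on exit.
  def pipelined
    pipeline = ToyPipeline.new
    yield pipeline
    send_batch(pipeline.queue)
  end

  private

  def send_batch(commands)
    return if commands.empty?
    @round_trips += 1
    commands.each { |key, value| @data[key] = value }
  end
end

db = ToyClient.new
db.pipelined do |commands|
  {"username" => "New Guy", "password" => "123"}.each do |k, v|
    commands["user:31:#{k}"] = v
  end
end
db.round_trips  # => 1 (two separate writes would have cost 2)
```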
<h4>Saving and Shutting Down</h4>
<p>Redis does send periodic snapshot data backups to disk, unlike memcached. I've already talked about how <a href="/key-value-stores/setting-up-the-redis-server">you can configure exactly when these backups happen</a> on the server side, but you can also request a snapshot from the client side. Just call <code>bgsave()</code> to trigger the usual asynchronous save or <code>save()</code> if you would prefer a synchronous backup.</p>
<p>When you are done playing around with a Redis session, you can call the <code>shutdown()</code> method to close all connections, dump the database to disk, and exit the server. If you don't wish to keep the data, you can call <code>flush_db()</code> to ditch the data in the database you are connected to. You may also wish to examine the statistics from a call to <code>info()</code> before you <code>shutdown()</code> to see what work the server has done.</p>
<p>That covers basic key-value store usage. However, Redis has some unique features that really set it apart from other key-value stores. We will look into those next.</p>
James Edward Gray II
Setting up the Redis Server
tag:graysoftinc.com,2009-09-14:/posts/88
2014-04-18T21:16:36Z
Some tips for installing, configuring, and running the Redis server.
<p>Before we can play with <a href="http://redis.io/">Redis</a>, you will need to get the server running locally. Luckily, that's very easy.</p>
<h4>Installing Redis</h4>
<p>Building Redis is a simple matter of grabbing the code and compiling it. Once built, you can place the executables in a convenient location in your <code>PATH</code>. On my box, I can do all of that with these commands:</p>
<pre><code>curl -O http://redis.googlecode.com/files/redis-1.0.tar.gz
tar xzvf redis-1.0.tar.gz
cd redis-1.0
make
sudo cp redis-server redis-cli redis-benchmark /usr/local/bin
</code></pre>
<p>Those commands build version 1.0 of the server, which is the current stable release as of this writing. You may need to adjust the version numbers down the road to get the latest releases though.</p>
<p>I also copied the executables to where I prefer to have them: <code>/usr/local/bin</code>. Feel free to change that directory in the last command to whatever you prefer.</p>
<p>If you will be talking to Redis from Ruby, as I will show in all of my examples, you are going to need a client library. I recommend <a href="https://github.com/redis/redis-rb">Ezra Zygmuntowicz's redis-rb</a>. You can install that gem with:</p>
<pre><code>gem install redis
</code></pre>
<h4>Running and Configuring Redis</h4>
<p>That's it for the install. Launching the server is even easier. The pattern is just:</p>
<pre><code>redis-server path/to/redis.conf
</code></pre>
<p>The argument is the path to the configuration file that tells Redis how you want it to behave. There's <a href="https://github.com/antirez/redis/blob/unstable/redis.conf">a sample configuration file in the Redis source code</a> that shows the options.</p>
<p>I'm not going to discuss all of the configuration options. They are already well commented in the sample file. However, I do want to mention a few things.</p>
<p>First, if you will only be connecting to a local Redis instance, uncomment the <code>bind</code> configuration in the sample file:</p>
<pre><code>bind 127.0.0.1
</code></pre>
<p>That tells Redis not to listen for external connections.</p>
<p>If you do need to accept external connections, you may want to set a limit for the number of simultaneous connections to avoid exhausting the file descriptors on your server. You can also adjust the timeout for inactive connections to reclaim those resources:</p>
<pre><code>maxclients 128
timeout 60
</code></pre>
<p>There are some other non-network limits you may wish to fiddle with as well.</p>
<p>By default, Redis supports multiple databases. It even has a <code>move()</code> command that allows you to transfer keys between databases. I find I usually only want one though. I'm more likely to launch multiple servers if I want more. This would allow me to control resources, like memory consumption, on a per database basis, at the cost of losing the atomic <code>move()</code>. Even if I didn't just want one database though, I think it would be rare to need the 16 that are configured by default. You can easily turn that down:</p>
<pre><code>databases 1
</code></pre>
<p>Another limit you may want to consider setting is the maximum memory limit. If you plan to use Redis as a <a href="http://www.danga.com/memcached/">memcached</a> replacement, you will likely wish to control how much memory it can consume. You can set the maximum number of bytes Redis can allocate, after which it will start purging volatile keys. If it cannot reclaim any more memory it will start refusing write commands. Here's a sample setting for a 100MB limit:</p>
<pre><code>maxmemory 104857600
</code></pre>
<p>Note that the above setting is really only a good idea when using Redis as a cache. If you are using it as a general database, you will need to monitor its memory consumption and take action before too many resources are consumed.</p>
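<p>The number in that sample setting is just 100 megabytes written out in bytes. If you want a different limit, a quick bit of Ruby will do the conversion. The helper name below is only for illustration.</p>

```ruby
# maxmemory is given in bytes, so convert from megabytes before writing the
# setting. The helper name is just for illustration.
def megabytes_to_bytes(megabytes)
  megabytes * 1024 * 1024
end

limit = megabytes_to_bytes(100)  # => 104857600
puts "maxmemory #{limit}"
# >> maxmemory 104857600
```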
<p>The last setting I want to talk about is probably the most important for using Redis.</p>
<p>The server will periodically fork and asynchronously dump the current contents of the database to disk. The dump is actually made to a temporary file and then moved to replace any older dump, so the operation is atomic and won't leave you with a partially dumped database. If Redis is eventually shut down and restarted, it will restore from this dump file.</p>
<p>How often it dumps the keys is configurable in terms of the amount of time that passes and the number of changes that have been made to the data. For example, the following settings tell Redis to dump the database after 60 seconds if 100 changes have been made or after five minutes if there has been at least 1 change:</p>
<pre><code>save 300 1
save 60 100
</code></pre>
<p>As you can see, you can set several different conditions. As soon as any one line of conditions matches, meaning both the time and the changes must match, the database is dumped and both counts restart.</p>
<p>Note that the time condition can be met before the changes. This means that, using the settings above, I can launch a Redis server, let it sit for five or more minutes, and then change a single key to trigger an immediate dump. The time will have already passed and as soon as I make the changes condition true as well, that is enough. In other words, I don't have to wait five minutes after I make the change.</p>
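<p>Here is that rule written out as a small Ruby sketch. It only models the logic as I've described it and is not taken from the Redis source. The names are hypothetical, and I'm assuming a condition counts as met once the value reaches the configured number.</p>

```ruby
# A sketch of the save rules described above. This models the logic only; it
# is not taken from the Redis source, and the names are hypothetical.
SaveRule = Struct.new(:seconds, :changes)

RULES = [SaveRule.new(300, 1), SaveRule.new(60, 100)]

# Both numbers are measured since the last dump. A dump is due as soon as
# any single rule has both its time and its change count satisfied.
def dump_due?(seconds_since_dump, changes_since_dump, rules = RULES)
  rules.any? do |rule|
    seconds_since_dump >= rule.seconds && changes_since_dump >= rule.changes
  end
end

dump_due?(30, 500)  # => false (plenty of changes, but under 60 seconds)
dump_due?(90, 150)  # => true  (the "save 60 100" line matches)
dump_due?(400, 0)   # => false (time has passed, but nothing changed)
dump_due?(400, 1)   # => true  (the first change after five idle minutes)
```

<p>The last two calls are the case from the paragraph above: the clock alone never triggers a dump, but once the time has passed a single change is enough.</p>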
<p>That covers plenty about installing and running the Redis server. You are now all set to play with it.</p>
James Edward Gray II
Using Key-Value Stores From Ruby
tag:graysoftinc.com,2009-09-14:/posts/87
2014-04-18T21:01:21Z
The table of contents for a series of posts about working with some popular key-value stores from Ruby code.
<p>I've been playing with a few different key-value stores recently. My choices are pretty popular and you can find documentation for them. However, it can still be a bit of work to relate everything to Ruby specific usage, which is what I care about. Given that, here are my notes on the systems I've used.</p>
<h4>Redis</h4>
<ol>
<li><a href="/key-value-stores/setting-up-the-redis-server">Setting up the Redis Server</a></li>
<li><a href="/key-value-stores/using-redis-as-a-key-value-store">Using Redis as a Key-Value Store</a></li>
<li><a href="/key-value-stores/lists-and-sets-in-redis">Lists and Sets in Redis</a></li>
<li><a href="/key-value-stores/where-redis-is-a-good-fit">Where Redis is a Good Fit</a></li>
</ol><h4>Tokyo Cabinet, Tokyo Tyrant, and Tokyo Dystopia</h4>
<ol>
<li>Installing the Tokyo Software</li>
<li><a href="/key-value-stores/tokyo-cabinet-as-a-key-value-store">Tokyo Cabinet as a Key-Value Store</a></li>
<li><a href="/key-value-stores/tokyo-cabinets-key-value-database-types">Tokyo Cabinet's Key-Value Database Types</a></li>
<li>Tokyo Cabinet's Tables</li>
<li>Threads and Multiprocessing With Tokyo Cabinet</li>
<li>Tokyo Tyrant as a Network Interface</li>
<li>The Strengths of Tokyo Cabinet</li>
</ol>James Edward Gray II