Key-Value Stores

Notes from my learning about simple NoSQL storage solutions.

15 SEP 2009

Using Redis as a Key-Value Store

[Update: though all of the techniques I show here still apply, many methods of the Redis gem have changed names to match the actual Redis commands they call.]

Redis is first and foremost a server providing key-value storage. As such, the primary features of any client library are for connecting to the server and manipulating those key-value pairs.

Connecting to the Server

Connecting to the Redis server can be as simple as Redis.new, thanks to some defaults in both the server and Ezra's Ruby client library for talking to that server. I won't pass any options to the constructor calls below, but you can use any of the following as needed:

  • :host if you need to connect to an external host instead of the default 127.0.0.1
  • :port if you need to use something other than the default port of 6379
  • :password if you configured Redis to require a password on connection
  • :db if you want to select one of the multiple configured databases, other than the default of 0 (databases are identified by a zero-based index)
  • :timeout if you want a different timeout for Redis communication than the default of 5 seconds
  • :logger if you want the library to log activity as it works
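As a rough sketch of how those options might be assembled before connecting, the following merges overrides onto the defaults listed above. The REDIS_DEFAULTS constant, the redis_options() helper, and the redis.example.com host are all made up for illustration; none of them come from the client library.

```ruby
# Defaults as described in the list above. The constant and helper names
# are inventions for this sketch and are not part of the Redis gem.
REDIS_DEFAULTS = {
  :host    => "127.0.0.1",
  :port    => 6379,
  :db      => 0,
  :timeout => 5
}.freeze

# Merge caller overrides onto the defaults, rejecting unknown option names
# so a misspelled option fails loudly instead of being silently ignored.
def redis_options(overrides = {})
  known   = REDIS_DEFAULTS.keys + [:password, :logger]
  unknown = overrides.keys - known
  raise ArgumentError, "unknown option(s): #{unknown.inspect}" unless unknown.empty?
  REDIS_DEFAULTS.merge(overrides)
end

options = redis_options(:host => "redis.example.com", :db => 2)
options[:host]  # => "redis.example.com"
options[:port]  # => 6379
options[:db]    # => 2

# With a server available, the Hash would be handed to the constructor:
#   db = Redis.new(options)
```

Checking the option names up front is a design choice for this sketch only; the library itself may handle unknown options differently.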

Getting and Setting Keys

Once connected, Redis can be used as an in-memory key-value store, much like memcached. The client library exposes key getting and setting just like a Ruby Hash:

#!/usr/bin/env ruby -wKU

require "redis"

db = Redis.new
db[:my_key]  # => nil
db[:my_key] = "my_value"
db[:my_key]  # => "my_value"
db.delete(:my_key)
db[:my_key]  # => nil

Notice that we can read, write, and delete key-value pairs just as we could with a Hash, using [], []=, and delete() respectively. If we look for a key that isn't in the database, we get nil just as Ruby would give us for a Hash.

There are two other methods for slightly more advanced setting operations. First, getset() can be used to update a value while also retrieving its previous value. There's also a set_unless_exists() method that will not override an existing value. Here are those operations in action:

#!/usr/bin/env ruby -wKU

require "redis"

db       = Redis.new
db[:adv] = "old"

db.getset(:adv, "new")              # => "old"
db[:adv]                            # => "new"
db.set_unless_exists(:adv, "lost")  # => false
db[:adv]                            # => "new"

Other Hash-like operations are supported. For example, you can check for the existence of a key, count the number of keys in the database, fetch a list of keys matching a pattern, or even get random keys:

#!/usr/bin/env ruby -wKU

require "redis"

db             = Redis.new
db[:key1]      = 1
db[:key2]      = 2
db[:key3]      = 3
db[:other_key] = "other"

db.key?(:key3)   # => 1
db.key?(:key4)   # => false
db.dbsize        # => 4
db.keys("key*")  # => ["key2", "key3", "key1"]
db.randkey       # => "key2"
db.randkey       # => "other_key"

Note that the pattern passed to keys() is similar to a file glob. You can use a ? in the pattern to mean any one character and * to match any run of characters. You can also use \\ to escape these special characters and match them literally.
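Ruby's File.fnmatch() follows similar glob rules, so it can serve as a local stand-in for trying patterns out without a server. Treat this as an approximation only: Redis uses its own matcher, and match_keys() is a helper name made up for this sketch.

```ruby
# Select the names a glob pattern would match, using Ruby's own matcher.
# FNM_DOTMATCH lets * and ? match a leading period, which File.fnmatch
# would otherwise treat specially because it was designed for file names.
def match_keys(keys, pattern)
  keys.select { |key| File.fnmatch(pattern, key, File::FNM_DOTMATCH) }
end

keys = %w[key1 key2 key3 key10 key* other_key]

match_keys(keys, "key?")   # => ["key1", "key2", "key3", "key*"]
match_keys(keys, "key*")   # => ["key1", "key2", "key3", "key10", "key*"]
match_keys(keys, 'key\*')  # => ["key*"]
```

The last call uses a single-quoted String so that one backslash reaches the matcher and the * is taken literally.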

It's worth noting that all Redis keys and values are pretty much Strings:

#!/usr/bin/env ruby -wKU

require "redis"

db             = Redis.new
db[Object.new] = Object.new

k = db.keys("#*").first  # => "#<Object:0x301894>"
db[k]                    # => "#<Object:0x301858>"

There are some minor exceptions where values can sometimes be treated as numbers. You can also have collections in values, but the collections hold the typical Strings, with sometimes numeric meaning. I'll talk more about these cases later.
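One practical consequence of String-only values is that a structured object has to be serialized by hand if it needs to come back out intact. Here is a minimal sketch using Ruby's standard JSON library; the choice of JSON is an assumption for this example, and any reversible String encoding would do.

```ruby
require "json"

profile = { "name" => "JEG2", "logins" => 3 }

# What a String-only store effectively keeps for an arbitrary object:
# the result of to_s, which cannot be turned back into the object.
flattened = Object.new.to_s
flattened.start_with?("#<Object:")  # => true

# Serializing explicitly produces a String that can be restored later.
stored   = JSON.generate(profile)   # => "{\"name\":\"JEG2\",\"logins\":3}"
restored = JSON.parse(stored)
restored == profile                 # => true
```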

Key Expiration

Redis supports setting expiration times on stored keys. When that time expires, the key will be purged. This is very useful when using Redis as a cache. You can set an expiration time by calling the expire() method, or you can just use a different version of the key setting that includes the timeout:

#!/usr/bin/env ruby -wKU

require "redis"

db = Redis.new
db.set(:cached, "short lived", 3)

4.times do
  sleep 1
  puts "db[:cached] is #{db[:cached].inspect} at #{Time.now}"
end
# >> db[:cached] is "short lived" at Sat Sep 05 14:01:07 -0500 2009
# >> db[:cached] is "short lived" at Sat Sep 05 14:01:08 -0500 2009
# >> db[:cached] is "short lived" at Sat Sep 05 14:01:09 -0500 2009
# >> db[:cached] is nil at Sat Sep 05 14:01:10 -0500 2009

Note that a write operation against a key with an expiration timeout set, a volatile key in Redis parlance, clears the timeout. You can use the ttl() method if you need to examine the time to live for a key. There's also a matching get() method to go with the set() I used above, though it's just an alias for [] and has nothing to do with timeouts.
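To make those rules concrete, here is a toy in-memory model of the behavior described above: a timed write, ttl(), and a plain write clearing the timeout. ExpiringStore and its injectable clock are inventions for this sketch, not part of the Redis gem, and a real server may differ in details such as exactly when a key is purged. Returning -1 from ttl() for a key with no timeout mirrors what Redis reports.

```ruby
# A toy, in-memory stand-in for a store with expiring keys. The class and
# its clock argument are made up for this sketch.
class ExpiringStore
  def initialize(clock = lambda { Time.now.to_f })
    @clock   = clock
    @values  = {}
    @expires = {}  # key => absolute expiration time in seconds
  end

  # Writing a key replaces its value and drops any pending timeout.
  def set(key, value, ttl = nil)
    key = key.to_s
    @values[key] = value.to_s
    @expires.delete(key)
    @expires[key] = @clock.call + ttl if ttl
    value
  end

  def get(key)
    key = key.to_s
    purge(key)
    @values[key]
  end

  # Seconds left before the key is purged, or -1 when no timeout is set.
  def ttl(key)
    key = key.to_s
    purge(key)
    return -1 unless @expires.key?(key)
    (@expires[key] - @clock.call).ceil
  end

  private

  # Drop the key if its expiration time has passed.
  def purge(key)
    if @expires.key?(key) && @clock.call >= @expires[key]
      @values.delete(key)
      @expires.delete(key)
    end
  end
end

now   = 0.0
store = ExpiringStore.new(lambda { now })  # fake clock, so no sleeping

store.set(:cached, "short lived", 3)
store.ttl(:cached)  # => 3
now = 2.0
store.get(:cached)  # => "short lived"
now = 3.0
store.get(:cached)  # => nil

store.set(:cached, "short lived", 3)
store.set(:cached, "rewritten")  # a plain write clears the timeout
now = 10.0
store.get(:cached)  # => "rewritten"
store.ttl(:cached)  # => -1
```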

Counters

Redis supports some other interesting operations on simple keys. For example, you can use the atomic incr() operation to manage globally unique IDs:

#!/usr/bin/env ruby -wKU

require "redis"

3.times do
  fork do
    db  = Redis.new
    ids = Array.new(10) { db.incr("global:next_user_id") }
    puts "#{Process.pid}: #{ids.join(', ')}"
  end
end

Process.waitall
# >> 1148: 1, 3, 6, 9, 12, 15, 17, 22, 25, 27
# >> 1147: 4, 8, 11, 13, 16, 18, 20, 23, 26, 29
# >> 1149: 2, 5, 7, 10, 14, 19, 21, 24, 28, 30

This is one of the exceptions I mentioned earlier where Redis will try to treat a value as a number. In this case an Integer is expected and a Float will be truncated. If the key is missing or holds non-numeric content, the value is set to "0" and then modified as requested. That's why you can start with a key that doesn't exist, as I did above.

There is a matching decr() operation. You can also choose to pass an Integer as the second argument to these methods to raise or lower the count by that amount.
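The reason no two processes ever receive the same ID is that each increment happens as a single step on the server. A loose in-process analogy, using threads and a Mutex instead of processes and a server, might look like the following. Counter is a made-up class for this sketch and says nothing about how Redis is implemented.

```ruby
# A thread-safe counter. The Mutex plays the role that the single Redis
# server plays for separate processes. Counter is a made-up name.
class Counter
  def initialize
    @value = 0  # like a missing key, counting starts from zero
    @lock  = Mutex.new
  end

  # Raise the count by amount and return the new value as one atomic step.
  def incr(amount = 1)
    @lock.synchronize { @value += amount }
  end

  # Lower the count by amount and return the new value.
  def decr(amount = 1)
    incr(-amount)
  end
end

counter = Counter.new
threads = Array.new(3) do
  Thread.new { Array.new(10) { counter.incr } }
end
ids = threads.map { |thread| thread.value }.flatten

ids.uniq.size             # => 30
ids.sort == (1..30).to_a  # => true
counter.decr(5)           # => 25
```

Which thread gets which IDs varies from run to run, just as with the forked processes above, but every ID is handed out exactly once.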

Getting and Setting Multiple Keys at Once

Another interesting feature is the ability to fetch more than one key at a time:

#!/usr/bin/env ruby -wKU

require "redis"

db                    = Redis.new
db["user:1:username"] = "JEG2"
db["user:1:password"] = "secret"

db.mget("user:1:username", "user:1:password")  # => ["JEG2", "secret"]
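Keeping with the Hash comparison, the closest plain-Ruby analog is Hash#values_at. Like a multiple get, it returns values in the order the keys were requested, with nil standing in for any key that is missing. The user:1:email key below is deliberately absent to show this.

```ruby
# A plain Hash standing in for the database in this sketch.
store = {
  "user:1:username" => "JEG2",
  "user:1:password" => "secret"
}

found = store.values_at("user:1:username", "user:1:email", "user:1:password")
found  # => ["JEG2", nil, "secret"]
```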

We can tie the counter and multiple get features together to do some basic object storage inside Redis:

#!/usr/bin/env ruby -wKU

require "redis"

DB = Redis.new

class User
  def initialize(id = nil)
    @id     = id
    @fields = Hash.new
    load if @id
  end

  attr_reader :id

  def method_missing(meth, *args, &blk)
    if meth.to_s =~ /\A(\w+)=/
      @fields[$1] = args.first
    else
      @fields[meth.to_s]  # fields are stored under String keys
    end
  end

  def load
    keys    = DB.keys("user:#{@id}:*")
    values  = DB.mget(*keys)
    @fields = Hash[*keys.map { |k| k[/\w+\z/] }.zip(values).flatten]
  end

  def save
    @id ||= DB.incr("global:next_user_id")
    DB.pipelined do |commands|
      @fields.each do |k, v|
        commands["user:#{@id}:#{k}"] = v
      end
    end
  end

  def inspect
    "<#User:#{@id} #{@fields.map { |k, v| "#{k}:#{v.inspect}" }.join(' ')}>"
  end
end

User.new(1)  # => <#User:1 username:"JEG2" password:"secret">

new_guy = User.new
new_guy.username = "New Guy"
new_guy.password = "123"
new_guy.save

User.new(new_guy.id)  # => <#User:31 username:"New Guy" password:"123">

I snuck in another feature in my implementation of save() for that example: pipelined commands. If you're going to issue a bunch of commands real quick, as I did with the field saves in this case, you can pipeline them. This queues them up locally and then fires them all at the Redis server as your block exits. This can make those batch operations a little more efficient.
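The queue-then-flush idea can be sketched without a server. In the toy model below, BufferedCommands and FakeServer are names made up for illustration; the real client talks to Redis over a socket, so this only shows the shape of the technique, not the gem's internals.

```ruby
# Records commands instead of running them. BufferedCommands and
# FakeServer are made-up names for this sketch only.
class BufferedCommands
  attr_reader :queue

  def initialize
    @queue = []
  end

  # Queue a set command rather than sending it right away.
  def []=(key, value)
    @queue << [:set, key.to_s, value.to_s]
  end
end

class FakeServer
  attr_reader :data, :round_trips

  def initialize
    @data        = {}
    @round_trips = 0
  end

  # Yield a buffer, then deliver everything it collected in one batch
  # once the block exits.
  def pipelined
    buffer = BufferedCommands.new
    yield buffer
    deliver(buffer.queue)
  end

  private

  # One simulated trip to the server, however many commands are queued.
  def deliver(commands)
    @round_trips += 1
    commands.each { |(_, key, value)| @data[key] = value }
  end
end

server = FakeServer.new
server.pipelined do |commands|
  commands["user:31:username"] = "New Guy"
  commands["user:31:password"] = "123"
end

server.round_trips               # => 1
server.data["user:31:username"]  # => "New Guy"
```

Two writes cost one simulated round trip here, which is where the efficiency gain comes from.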

Saving and Shutting Down

Redis does send periodic snapshot data backups to disk, unlike memcached. I've already talked about how you can configure exactly when these backups happen on the server side, but you can also request a snapshot from the client side. Just call bgsave() to trigger the usual asynchronous save or save() if you would prefer a synchronous backup.

When you are done playing around with a Redis session, you can call the shutdown() method to close all connections, dump the database to disk, and exit the server. If you don't wish to keep the data, you can call flush_db() to ditch the data in the database you are connected to. You may also wish to examine the statistics from a call to info() before you shutdown() to see what work the server has done.

That covers basic key-value store usage. However, Redis has some unique features that really set it apart from other key-value stores. We will look into those next.

Comments (2)
  1. Pablo, August 13th, 2010

    Thanks for the writeup, it's exactly what I was looking for.
    We're probably switching from Memcached to Redis for the manual expiration options it has.

  2. viresh, July 5th, 2012

    awesome tutorial :) u cld have added "Queue" ( using Redis to queue jobs )