Joey Aghion

Half a Mind

by Joey Aghion

Profile Redis memory usage by key pattern

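At Weplay, we use Redis quite a bit. As a Redis database grows, though, it can be difficult to understand how the space is being used. This script tries to identify which kinds of keys occupy the most space in your Redis databases. To do so, it takes a random sample of keys from each database and groups them by pattern, replacing every run of digits in a key with "#" (so User:42:cached_preferences becomes User:#:cached_preferences). The reported sizes are rough approximations based on the length of each value, not measurements of actual memory usage.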
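As a minimal sketch of the pattern-collapsing step on its own, with no Redis connection involved: the helper name key_pattern and the sample keys below are made up for illustration (the script itself inlines the same gsub call).

```ruby
# Collapse a key into its pattern by replacing every run of digits with '#'.
# key_pattern is a hypothetical helper name; the script inlines this gsub.
def key_pattern(key)
  key.gsub(/\d+/, '#')
end

# Made-up keys standing in for a random sample.
sample = ['User:12:cached_preferences', 'User:345:cached_preferences', 'Poll:7:results']

# Group the sampled keys by pattern and count each group.
groups = sample.group_by { |key| key_pattern(key) }
groups.each { |pattern, keys| puts "#{pattern}: #{keys.size}" }
# prints:
#   User:#:cached_preferences: 2
#   Poll:#:results: 1
```

Because only digits are replaced, keys that differ in a non-numeric segment (a username, say) would still land in separate patterns.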

Sample output:

Profiling "db0"...
- - User:#:cached_preferences
  - keys: 1715
    size: 36015
    percent: 54.11%
- - Poll:#:results
  - keys: 555
    size: 27750
    percent: 41.69%
- - Group:#:member_ids
  - keys: 1645
    size: 1645
    percent: 2.47%
- - User:#:cached_follower_ids
  - keys: 577
    size: 1154
    percent: 1.73%

Overall statistics:

The script:

#!/usr/bin/env ruby

# Evaluates a sample of keys/values from each redis database, computing statistics for each key pattern:
#   keys: number of keys matching the given pattern
#   size: approximation of the associated memory occupied (based on size/length of value)
#   percent: the proportion of this 'size' relative to the sample's total
# Copyright Weplay, Inc. 2010. Available for use under the MIT license.

require 'rubygems'
require 'redis'
require 'yaml'

SAMPLE_SIZE = 10_000  # number of keys to sample from each db before computing stats

# Naive approximation of memory footprint: size/length of value.
def redis_size(db, k)
  t = db.type(k)
  case t
    when 'string' then db.get(k).length
    when 'list'   then db.lrange(k, 0, -1).size
    when 'zset'   then db.zrange(k, 0, -1).size
    when 'set'    then db.smembers(k).size
    else raise("Redis type '#{t}' not yet supported.")  # TODO accommodate more types
  end
end

def array_sum(array)
  array.inject(0){ |sum, e| sum + e }
end

def redis_db_profile(db_name, sample_size = SAMPLE_SIZE)
  # redis-cli reports names like "db0"; pass the numeric index to Redis.new.
  db = Redis.new(:db => db_name[/\d+/].to_i)
  keys = []
  sample_size.times { keys << db.randomkey }  # sampled with replacement, so keys may repeat
  key_patterns = keys.group_by{ |key| key.gsub(/\d+/, '#') }  # e.g. "User:42:x" => "User:#:x"
  data = key_patterns.map{ |pattern, matches|
    [pattern, {'keys' => matches.size, 'size' => array_sum(matches.map{ |k| redis_size(db, k) })}]
  }.sort_by{ |a| a.last['size'] }.reverse
  size_sum = data.inject(0){ |sum, d| sum + d.last['size'] }
  data.each { |d| d.last['percent'] = '%.2f%%' % (d.last['size'].to_f*100/size_sum) }
end

db_names = `redis-cli info | grep '^db[0-9]'`.split("\n").map{ |line| line.scan(/^db\d+/).first }
db_names.each do |name|
  puts "\nProfiling \"#{name}\"...\n#{'-'*20}"
  puts redis_db_profile(name).to_yaml
end

puts "\nOverall statistics:\n#{'-'*20}"
puts `redis-cli info | grep memory`
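
To see how the keys, size, and percent figures are derived without a running Redis server, here is a rough sketch of the same aggregation applied to an in-memory hash. The SAMPLED_SIZES data and the profile method name are hypothetical; the sizes stand in for what redis_size would return.

```ruby
# Hypothetical sampled keys mapped to made-up sizes (stand-ins for redis_size results).
SAMPLED_SIZES = {
  'User:1:cached_preferences' => 30,
  'User:2:cached_preferences' => 20,
  'Poll:9:results'            => 40,
  'Group:3:member_ids'        => 10
}

# Group keys by pattern, sum sizes per pattern, sort largest first, add percentages.
def profile(sizes)
  grouped = sizes.keys.group_by { |key| key.gsub(/\d+/, '#') }
  data = grouped.map do |pattern, keys|
    size = keys.map { |k| sizes[k] }.inject(0) { |sum, s| sum + s }
    [pattern, { 'keys' => keys.size, 'size' => size }]
  end
  data = data.sort_by { |_, stats| -stats['size'] }
  total = data.inject(0) { |sum, (_, stats)| sum + stats['size'] }
  data.each { |_, stats| stats['percent'] = '%.2f%%' % (stats['size'] * 100.0 / total) }
end

profile(SAMPLED_SIZES).each do |pattern, stats|
  puts "#{pattern}: #{stats['keys']} keys, size #{stats['size']}, #{stats['percent']}"
end
# prints:
#   User:#:cached_preferences: 2 keys, size 50, 50.00%
#   Poll:#:results: 1 keys, size 40, 40.00%
#   Group:#:member_ids: 1 keys, size 10, 10.00%
```

Note that the percentages are relative to the sampled total only, so they describe the sample rather than the whole database.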