Scaling Reads

Note: This approach is now packaged into a gem.


One of the easier ways to scale your database is to distribute reads to replicas.

Desire

Here’s the desired behavior:

User.find(1)                  # primary

distribute_reads do
  # use replica for reads
  User.maximum(:visits_count) # replica
  User.find(2)                # replica

  # until a write
  # then switch to primary
  User.create!                # primary
  User.last                   # primary
end
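To make the rule concrete, here is a toy model of the routing in plain Ruby. It is only a sketch: `ToyRouter` and its `read` and `write` methods are hypothetical names invented for illustration, not part of Rails or of any library mentioned in this post, and no database is involved.

```ruby
# Toy model of the routing rule described above. Not a real adapter.
class ToyRouter
  attr_reader :log

  def initialize
    @log = []            # [node, statement] pairs, in execution order
    @distribute = false  # true while inside a distribute_reads block
    @stuck = false       # true once a write has happened inside the block
  end

  # Reads inside the block go to the replica until the first write.
  def distribute_reads
    previous = [@distribute, @stuck]
    @distribute = true
    @stuck = false
    yield
  ensure
    @distribute, @stuck = previous
  end

  def read(statement)
    node = (@distribute && !@stuck) ? :replica : :primary
    @log << [node, statement]
    node
  end

  # Writes always go to the primary and pin later reads in the block to it.
  def write(statement)
    @stuck = true if @distribute
    @log << [:primary, statement]
    :primary
  end
end

router = ToyRouter.new
router.read("find 1")            # primary
router.distribute_reads do
  router.read("maximum visits")  # replica
  router.read("find 2")          # replica
  router.write("create user")    # primary
  router.read("last user")       # primary
end

router.log.map(&:first)
# => [:primary, :replica, :replica, :primary, :primary]
```

Pinning reads to the primary after a write avoids reading stale data from a replica that has not yet caught up.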

Contenders

We looked at a number of libraries, including Octopus, Octoshark, and Replica Pools.

The winner was Makara - it handles failover well and has a simple configuration.

Getting Started

First, install Makara.

gem 'makara'

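Our setup uses three environment variables: DATABASE_URL for the primary, REPLICA_DATABASE_URL for the replica, and MAKARA to turn Makara on. Since config/database.yml only checks whether MAKARA is set, any value enables it.

Here are sample values: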

DATABASE_URL=postgres://nerd:secret@localhost:5432/db_development
REPLICA_DATABASE_URL=postgres://nerd:secret@localhost:5432/db_development
MAKARA=true

Next, update config/database.yml.

development: &default
  <% if ENV["MAKARA"] %>
  url: postgresql-makara:///
  makara:
    sticky: true
    connections:
      - role: master
        name: primary
        url: <%= ENV["DATABASE_URL"] %>
      - name: replica
        url: <%= ENV["REPLICA_DATABASE_URL"] %>
  <% else %>
  adapter: postgresql
  url: <%= ENV["DATABASE_URL"] %>
  <% end %>

production:
  <<: *default

We don’t use the middleware, so we remove it by adding to config/application.rb:

config.middleware.delete Makara::Middleware

Also, we want to read from the primary by default, so we have to patch Makara. Create an initializer config/initializers/makara.rb with:

Makara::Cache.store = :noop

module DefaultToPrimary
  def _appropriate_pool(*args)
    return @master_pool unless Thread.current[:distribute_reads]
    super
  end
end

Makara::Proxy.send :prepend, DefaultToPrimary

module DistributeReads
  def distribute_reads
    previous_value = Thread.current[:distribute_reads]
    begin
      Thread.current[:distribute_reads] = true
      Makara::Context.set_current(Makara::Context.generate)
      yield
    ensure
      Thread.current[:distribute_reads] = previous_value
    end
  end
end

Object.send :include, DistributeReads
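The save-and-restore of the thread-local flag is what makes the helper safe to nest and safe when the block raises. Here is a minimal sketch of just that part, runnable without Rails or Makara. `with_distribute_flag` is a hypothetical stand-in for `distribute_reads` with the Makara context call left out.

```ruby
# Hypothetical stand-in for distribute_reads, minus the Makara call,
# so the flag handling can be tried in plain Ruby.
def with_distribute_flag
  previous_value = Thread.current[:distribute_reads]
  Thread.current[:distribute_reads] = true
  yield                                   # the block's value is returned
ensure
  # Runs even if the block raises; restores whatever was set before.
  Thread.current[:distribute_reads] = previous_value
end

after_inner = nil
result = with_distribute_flag do
  with_distribute_flag { :inner }
  # The inner block restored the outer block's value (true), not nil.
  after_inner = Thread.current[:distribute_reads]
  42
end

result                             # => 42
after_inner                        # => true
Thread.current[:distribute_reads]  # => nil
```

Restoring the previous value, rather than resetting to nil, keeps an outer block distributing reads after an inner block finishes.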

To distribute reads, use:

total_users = distribute_reads { User.count }

You can also put multiple lines in a block.

distribute_reads do
  User.maximum(:visits_count)
  Order.sum(:revenue_cents)
  Visit.average(:duration)
end

Test Drive

In the Rails console, run:

User.first                       # primary
distribute_reads { User.last }   # replica

Happy scaling!

Published March 31, 2015




All code examples are public domain.
Use them however you’d like (licensed under CC0).