Scaling Reads

Note: This approach is now packaged into a gem.


One of the easier ways to scale your database is to distribute reads to replicas.

Desire

Here’s the desired behavior:

User.find(1)                  # primary

distribute_reads do
  # use replica for reads
  User.maximum(:visits_count) # replica
  User.find(2)                # replica

  # until a write
  # then switch to primary
  User.create!                # primary
  User.last                   # primary
end

Contenders

We looked at a number of libraries, including Octopus, Octoshark, and Replica Pools.

The winner was Makara - it handles failover well and has a simple configuration.

Getting Started

First, install Makara.

gem 'makara'

There are 3 important ENV variables in our setup.

Here are sample values (in development, both URLs point at the same database):

DATABASE_URL=postgres://nerd:secret@localhost:5432/db_development
REPLICA_DATABASE_URL=postgres://nerd:secret@localhost:5432/db_development
MAKARA=true

Next, update config/database.yml.

development: &default
  <% if ENV["MAKARA"] %>
  url: postgresql-makara:///
  makara:
    sticky: true
    connections:
      - role: master
        name: primary
        url: <%= ENV["DATABASE_URL"] %>
      - name: replica
        url: <%= ENV["REPLICA_DATABASE_URL"] %>
  <% else %>
  adapter: postgresql
  url: <%= ENV["DATABASE_URL"] %>
  <% end %>

production:
  <<: *default
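One thing to watch in this template: ENV["MAKARA"] is a string, and every string is truthy in Ruby, so MAKARA=false still selects the Makara branch. To turn Makara off, leave the variable unset. Here is a minimal sketch using ERB from the standard library. The cut-down template, the render_config helper, and the plain hash standing in for ENV are all illustrative assumptions, not part of the setup above.

```ruby
require "erb"

# Cut-down version of the database.yml template (illustrative only).
TEMPLATE = <<~YAML
  <% if env["MAKARA"] %>
  url: postgresql-makara:///
  <% else %>
  adapter: postgresql
  <% end %>
YAML

# Hypothetical helper: renders the template against a hash standing in for ENV.
def render_config(env)
  ERB.new(TEMPLATE).result_with_hash(env: env).strip
end

render_config({ "MAKARA" => "true" })  # => "url: postgresql-makara:///"
render_config({})                      # => "adapter: postgresql"
render_config({ "MAKARA" => "false" }) # => "url: postgresql-makara:///" (still truthy)
```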

We don’t use the middleware, so we remove it by adding to config/application.rb:

config.middleware.delete Makara::Middleware

Also, we want to read from the primary by default, so we have to patch Makara. Create an initializer, config/initializers/makara.rb, with:

Makara::Cache.store = :noop

module DefaultToPrimary
  def _appropriate_pool(*args)
    return @master_pool unless Thread.current[:distribute_reads]
    super
  end
end

Makara::Proxy.send :prepend, DefaultToPrimary

module DistributeReads
  def distribute_reads
    previous_value = Thread.current[:distribute_reads]
    begin
      Thread.current[:distribute_reads] = true
      Makara::Context.set_current(Makara::Context.generate)
      yield
    ensure
      Thread.current[:distribute_reads] = previous_value
    end
  end
end

Object.send :include, DistributeReads
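
To see how the prepended module and the thread-local flag interact without a database, here is a minimal plain-Ruby sketch. ToyProxy and its symbol "pools" are hypothetical stand-ins, not Makara's API, and the Makara context call is left out. Only the prepend and the flag handling mirror the initializer.

```ruby
# Toy stand-in for Makara::Proxy: left alone, it would pick the replica pool.
class ToyProxy
  def initialize
    @master_pool = :primary
    @slave_pool = :replica
  end

  def _appropriate_pool(*args)
    @slave_pool
  end
end

# Same shape as the patch above: use the primary unless the flag is set.
module DefaultToPrimary
  def _appropriate_pool(*args)
    return @master_pool unless Thread.current[:distribute_reads]
    super
  end
end

ToyProxy.prepend DefaultToPrimary

# Same helper as above, minus the Makara context reset.
def distribute_reads
  previous_value = Thread.current[:distribute_reads]
  begin
    Thread.current[:distribute_reads] = true
    yield
  ensure
    # Restore the old value so nested blocks and exceptions don't leak the flag.
    Thread.current[:distribute_reads] = previous_value
  end
end

proxy = ToyProxy.new
proxy._appropriate_pool                      # => :primary
distribute_reads { proxy._appropriate_pool } # => :replica
proxy._appropriate_pool                      # => :primary
```

Because the previous value is restored in ensure, the flag is reset even if the block raises, and an inner distribute_reads block does not switch the outer one back to the primary when it finishes.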

To distribute reads, use:

total_users = distribute_reads { User.count }

You can also put multiple lines in a block.

distribute_reads do
  User.maximum(:visits_count)
  Order.sum(:revenue_cents)
  Visit.average(:duration)
end
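
Note that a block returns only its last expression, so a multi-line block like this hands back just the final result. One way to keep every result is to return an array and destructure it. The sketch below assumes a stand-in distribute_reads that simply yields, and hypothetical literal values in place of the three queries, so it runs without Rails or Makara.

```ruby
# Stand-in that only yields; the real helper also toggles the thread-local flag.
def distribute_reads
  yield
end

# Hypothetical values standing in for the three aggregate queries.
max_visits, revenue_cents, avg_duration = distribute_reads do
  [42, 199_900, 37.5]
end

max_visits    # => 42
revenue_cents # => 199900
avg_duration  # => 37.5
```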

Test Drive

In the Rails console, run:

User.first                       # primary
distribute_reads { User.last }   # replica

Happy scaling!

Published March 31, 2015




All code examples are public domain.
Use them however you’d like (licensed under CC0).