DVC on Heroku

DVC is a version control system for machine learning datasets and models. It allows you to store large files outside of Git while keeping them versioned. In a few steps, you can have it working on Heroku.

This tutorial assumes you’re already using DVC and are ready to deploy your app to Heroku.

Getting Started

Use the Apt buildpack from Heroku to install DVC.

heroku buildpacks:add --index 1 heroku-community/apt

Create an Aptfile with the latest release:


Next, add your storage credentials. We recommend an Amazon S3 bucket in the same region as your Heroku dynos (us-east-1 by default), since data transfer within a region is free:
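A missing credential usually only shows up as a failed pull, so it can help to check for it first. The sketch below assumes the credentials are supplied through the standard AWS environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`; that naming is an assumption here, and `missing_credentials` is a hypothetical helper, not part of DVC or Heroku.

```python
import os

# Assumption: credentials are exposed to the app through the standard
# AWS environment variable names.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_credentials()` before `dvc pull` and stopping when the list is non-empty gives a clearer error than a failed download.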


Finally, configure your application to run the following commands:

dvc config core.no_scm true
dvc pull
rm -r .dvc .apt/usr/lib/dvc # reduces slug size

You can run them at build time or runtime. We recommend build time unless it causes your app to exceed the maximum slug size (500 MB compressed).
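To get a rough idea of which side of the limit your data falls on, you can measure how large a pulled directory is once compressed. The sketch below is only an estimate: it assumes gzip-compressed tar output is a reasonable stand-in for how the slug is packed, and it ignores the rest of your app. The names `compressed_size`, `fits_in_slug`, and `MAX_SLUG_BYTES` are hypothetical.

```python
import os
import tarfile

# The slug size limit mentioned above (500 MB compressed).
MAX_SLUG_BYTES = 500 * 1024 * 1024


class ByteCounter:
    """Write-only sink that counts bytes instead of storing them."""

    def __init__(self):
        self.count = 0

    def write(self, data):
        self.count += len(data)
        return len(data)


def compressed_size(path):
    """Return the size in bytes of `path` packed as a gzip-compressed tar stream."""
    sink = ByteCounter()
    # "w|gz" is tarfile's streaming mode, which only needs a write() method.
    with tarfile.open(fileobj=sink, mode="w|gz") as tar:
        tar.add(path, arcname=os.path.basename(os.path.abspath(path)))
    return sink.count


def fits_in_slug(path, limit=MAX_SLUG_BYTES):
    """Return True if the compressed size of `path` is within `limit` bytes."""
    return compressed_size(path) <= limit
```

Running `fits_in_slug` on the directory that `dvc pull` fills locally gives a first indication. Since your code and dependencies also count toward the slug, a result close to the limit suggests pulling at runtime instead.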

With Django, add to settings.py:

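import os
import sys

# DYNO is set on Heroku dynos; skip if .dvc has already been removed
if "DYNO" in os.environ and os.path.isdir(".dvc"):
    print("Running DVC")
    os.system("dvc config core.no_scm true")  # the slug has no Git repository
    if os.system("dvc pull") != 0:
        sys.exit("Pull failed")
    os.system("rm -r .dvc .apt/usr/lib/dvc")  # clean up DVC files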

With Rails, create an initializer with:

if Rails.env.production? && Dir.exist?(".dvc")
  puts "Running DVC"
  system "dvc config core.no_scm true"
  system "dvc pull" or abort "Pull failed"
  system "rm -r .dvc .apt/usr/lib/dvc"
end

Your DVC files are now available on Heroku 🎉

Published November 7, 2020

All code examples are public domain.
Use them however you’d like (licensed under CC0).