Numo: NumPy for Ruby


Photo by Jonas Svidras

NumPy is an extremely popular library for machine learning in Python. It provides an efficient way to work with large, multi-dimensional arrays. What you may not know is Ruby has a library with similar functionality. It’s called Numo, and in this post, we’ll look at what you can do with it.

Basic Operations

Numo’s core data structure is the multi-dimensional array, which has methods for mathematical operations. These operations are written in C, so they’re much faster than performing the same operations in Ruby.

Let’s start by creating a Numo array from a Ruby array.

require "numo/narray"
x = Numo::DFloat.cast([[1, 2, 3], [4, 5, 6]])

Each array has a shape. We created a 2x3 2D array, but arrays can be 1D, 3D, or more.

x.shape # [2, 3]

Read a row or column with:

x[0, true] # 1st row - [1, 2, 3]
x[true, 2] # 3rd column - [3, 6]

We can add a constant value:

x + 2 # [[3, 4, 5], [6, 7, 8]]

Or add arrays:

x + x # [[2, 4, 6], [8, 10, 12]]
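
To make the element-wise behavior concrete, here is a rough plain-Ruby equivalent of the two additions above. It assumes nested Ruby arrays as a stand-in for the Numo array, and the variable names are only illustrative. Numo performs the same work in C.

```ruby
# Plain-Ruby stand-in for the 2x3 array (not a Numo object)
rows = [[1, 2, 3], [4, 5, 6]]

# x + 2: add the constant to every element
plus_two = rows.map { |row| row.map { |v| v + 2 } }

# x + x: add matching elements of two arrays with the same shape
doubled = rows.zip(rows).map { |a, b| a.zip(b).map { |u, v| u + v } }

p plus_two # [[3, 4, 5], [6, 7, 8]]
p doubled  # [[2, 4, 6], [8, 10, 12]]
```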

Some operations like mean and sum can be run over a specific axis.

x.sum(0)  # sum of each column - [5, 7, 9]
x.mean(1) # mean of each row - [2, 5]
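
If the axis numbers feel abstract, this plain-Ruby sketch shows what the two reductions above compute. It uses nested Ruby arrays rather than Numo, so treat it as an illustration of the idea, not of how Numo works internally.

```ruby
# Plain-Ruby stand-in for the 2x3 array (not a Numo object)
rows = [[1, 2, 3], [4, 5, 6]]

# Axis 0 runs down the rows, so reducing over it leaves one value per column
col_sums = rows.transpose.map(&:sum)

# Axis 1 runs across the columns, so reducing over it leaves one value per row
row_means = rows.map { |row| row.sum.to_f / row.size }

p col_sums  # [5, 7, 9]
p row_means # [2.0, 5.0]
```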

We can also change its shape - useful for preparing data for models.

x.reshape(3, 2) # [[1, 2], [3, 4], [5, 6]]
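
The result above is consistent with reading the elements in row order and regrouping them. Here is a plain-Ruby sketch of that idea, assuming nested Ruby arrays instead of a Numo array:

```ruby
# Plain-Ruby stand-in for the 2x3 array (not a Numo object)
rows = [[1, 2, 3], [4, 5, 6]]

# Flatten in row order, then regroup into rows of 2 to get a 3x2 layout
reshaped = rows.flatten.each_slice(2).to_a

p reshaped # [[1, 2], [3, 4], [5, 6]]
```

The element count has to stay the same: 2x3 and 3x2 both hold six values.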

If you’re familiar with NumPy operations, there are side-by-side examples and a table showing how the functions map.

Building Models

Rumale is a machine learning library similar to Python’s Scikit-learn. It uses Numo for inputs and outputs. Here’s a basic example of linear regression.

require "rumale"

# generate data: y = 1 + 2(x0) + 3(x1)
x = Numo::DFloat.asarray([[0, 1], [1, 0], [1, 2]])
y = 1 + 2 * x[true, 0] + 3 * x[true, 1]

# train
model = Rumale::LinearModel::LinearRegression.new(
          fit_bias: true, max_iter: 10000)
model.fit(x, y)

# predict
model.predict(x)
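
As a sanity check, we can compute the targets by hand from the generating formula. This plain-Ruby sketch uses ordinary arrays and illustrative variable names (features, targets) that are not part of Rumale. A well-fit model should predict values close to these.

```ruby
# Same three samples as above, as plain Ruby arrays
features = [[0, 1], [1, 0], [1, 2]]

# y = 1 + 2(x0) + 3(x1), applied to each sample
targets = features.map { |x0, x1| 1 + 2 * x0 + 3 * x1 }

p targets # [4, 3, 9]
```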

Rumale has many, many models and other useful tools.

Scikit-learn has a great cheat-sheet to help you decide what to use:

Scikit-learn Cheat Sheet

Image from Scikit-learn (BSD License)

Storing Data

Numo arrays can be marshaled just like other Ruby objects. This allows you to save your work and resume it at a later time.

# save
File.binwrite("x.dump", Marshal.dump(x))

# load
x = Marshal.load(File.binread("x.dump"))
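
Here is the same save-and-load round trip as a small self-contained sketch. It assumes a plain nested Ruby array as the payload and a temporary directory, and marshal_round_trip is a hypothetical helper name used only for this example.

```ruby
require "tmpdir"

# Hypothetical helper: dump an object to a file, then read it back
def marshal_round_trip(object)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "x.dump")
    # Binary mode matters: Marshal output is bytes, not text
    File.binwrite(path, Marshal.dump(object))
    Marshal.load(File.binread(path))
  end
end

data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
restored = marshal_round_trip(data)

p restored == data # true
```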

Npy allows you to save and load arrays in the same format as NumPy. This is more performant than marshaling.

require "npy"

# save
Npy.save("x.npy", x)

# load
x = Npy.load("x.npy")

It also makes it easy to load datasets like MNIST.

mnist = Npy.load_npz("mnist.npz")

Summary

You now have a basic introduction to Numo and know how to perform basic operations, build models with Rumale, and store data.

Consider Numo for your next machine learning project.

Published September 17, 2019



All code examples are public domain.
Use them however you’d like (licensed under CC0).