# Numo: NumPy for Ruby


NumPy is an extremely popular library for machine learning in Python. It provides an efficient way to work with large, multi-dimensional arrays. What you may not know is Ruby has a library with similar functionality. It’s called Numo, and in this post, we’ll look at what you can do with it.

## Basic Operations

Numo’s core data structure is the multi-dimensional array, which has methods for mathematical operations. These operations are written in C, so they’re much faster than performing the same operations in Ruby.

Let’s start by creating a Numo array from a Ruby array.

```
x = Numo::DFloat.cast([[1, 2, 3], [4, 5, 6]])
```

Each array has a shape. We created a 2x3 2D array, but arrays can be 1D, 3D, or more.

```
x.shape # [2, 3]
```

Read a row or column with:

```
x[0, true] # 1st row - [1, 2, 3]
x[true, 2] # 3rd column - [3, 6]
```

We can add a constant value:

```
x + 2 # [[3, 4, 5], [6, 7, 8]]
```

Or add arrays:

```
x + x # [[2, 4, 6], [8, 10, 12]]
```

Some operations like mean and sum can be run over a specific axis.

```
x.sum(0) # sum of each column - [5, 7, 9]
x.mean(1) # mean of each row - [2, 5]
```

We can also change an array’s shape, which is useful for preparing data for models.

```
x.reshape(3, 2) # [[1, 2], [3, 4], [5, 6]]
```

If you’re familiar with NumPy operations, the Numo project provides side-by-side examples and a table showing how the functions map.

## Building Models

Rumale is a machine learning library similar to Python’s Scikit-learn. It uses Numo for inputs and outputs. Here’s a basic example of linear regression.

```
require "numo/narray"
require "rumale"

# generate data: y = 1 + 2(x0) + 3(x1)
x = Numo::DFloat.asarray([[0, 1], [1, 0], [1, 2]])
y = 1 + 2 * x[true, 0] + 3 * x[true, 1]

# train
model = Rumale::LinearModel::LinearRegression.new(
  fit_bias: true, max_iter: 10000)
model.fit(x, y)

# predict
model.predict(x)
```

Rumale has many, many models and other useful tools for:

- Regression: linear, ridge, lasso, support vector machines
- Classification: logistic regression, naive Bayes, K-nearest neighbors, support vector machines
- Clustering: K-means, Gaussian mixture model
- Dimensionality reduction: principal component analysis

Scikit-learn has a great cheat-sheet to help you decide what to use:

Image from Scikit-learn (BSD License)

## Storing Data

Numo arrays can be marshaled just like other Ruby objects. This allows you to save your work and resume it at a later time.

```
# save
File.binwrite("x.dump", Marshal.dump(x))
# load
x = Marshal.load(File.binread("x.dump"))
```

The Npy gem lets you save and load arrays in the same format NumPy uses, so files can be shared between Ruby and Python. This is also more performant than marshaling.

```
# save
Npy.save("x.npy", x)
# load
x = Npy.load("x.npy")
```

It also makes it easy to load datasets like MNIST.

```
mnist = Npy.load_npz("mnist.npz")
```

## Summary

You now have a basic introduction to Numo and know how to:

- perform basic operations
- build a model
- store data

Consider Numo for your next machine learning project.