TensorFlow Object Detection in Ruby

The ONNX Runtime gem makes it easy to run TensorFlow models in Ruby. This short tutorial will show you how. It’s based on this tutorial from tf2onnx.

We’ll use SSD Mobilenet, which can detect multiple objects in an image.

First, download the pretrained model (ssd_mobilenet_v1_coco_2018_01_28) from the official TensorFlow Models project and extract it. Then save this awesome shot of polar bears as bears.jpg.

Bears

Photo from the U.S. Fish and Wildlife Service

Install tf2onnx

pip install tf2onnx

And convert the model to ONNX

python -m tf2onnx.convert --opset 10 \
  --saved-model ssd_mobilenet_v1_coco_2018_01_28/saved_model \
  --output model.onnx

Next, add the ONNX Runtime and MiniMagick gems to your Gemfile and run bundle install

gem "onnxruntime"
gem "mini_magick"

Load the image

require "mini_magick"

img = MiniMagick::Image.open("bears.jpg")
pixels = img.get_pixels

And the model

require "onnxruntime"
model = OnnxRuntime::Model.new("model.onnx")

Check the model inputs

p model.inputs

The shape is [-1, -1, -1, 3], where -1 indicates any size. pixels has the shape [img.height, img.width, 3]. The model is designed to process multiple images at once, which is where the first dimension comes from, so we wrap pixels in an array to make a batch of one.
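
To see how the batch dimension lines up, here is a small sketch in plain Ruby that needs no gems or model. The array_shape helper is hypothetical (it is not part of ONNX Runtime or MiniMagick), and the tiny pixels array is a stand-in for what img.get_pixels returns.

```ruby
# Hypothetical helper: walk down the first element of each nesting level
# and record the size at every level.
def array_shape(value)
  shape = []
  while value.is_a?(Array)
    shape << value.size
    value = value.first
  end
  shape
end

# Stand-in for img.get_pixels: 2 rows, 3 columns, 3 channels (RGB)
pixels = Array.new(2) { Array.new(3) { [0, 0, 0] } }

p array_shape(pixels)   # height, width, channels
p array_shape([pixels]) # batch, height, width, channels
```

Wrapping pixels in an array adds a leading dimension of size 1, which matches the four dimensions the model expects.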

Let’s run the model:

result = model.predict({"inputs" => [pixels]})

The model gives us a number of different outputs, like the number of detections, labels, scores, and boxes. Let’s print the results:

p result["num_detections"]
# [3.0]
p result["detection_classes"]
# [[23.0, 23.0, 88.0, 1.0, ...]]

We can see there were three detections, and if we look at the first three elements in the detection classes array, they are the numbers 23, 23, and 88. These correspond to COCO labels. We can look these up and see that 23 is bear and 88 is teddy bear. Mostly right!
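
The outputs are parallel arrays, and only the first num_detections entries for each image are meaningful. Here is a minimal sketch of pairing them up. The detections helper is hypothetical, the result hash uses made-up values purely for illustration, and the "detection_scores" key is an assumption about the output name (check model.outputs for the real one).

```ruby
# Made-up stand-in for the hash returned by model.predict.
# "detection_scores" is an assumed output name.
result = {
  "num_detections" => [2.0],
  "detection_classes" => [[23.0, 88.0, 1.0]],
  "detection_scores" => [[0.9, 0.6, 0.1]],
  "detection_boxes" => [[
    [0.1, 0.2, 0.5, 0.6],
    [0.3, 0.4, 0.7, 0.8],
    [0.0, 0.0, 1.0, 1.0]
  ]]
}

# Hypothetical helper: collect the valid detections for one image in the
# batch, ignoring the padding entries past num_detections.
def detections(result, image_index = 0)
  count = result["num_detections"][image_index].to_i
  Array.new(count) do |i|
    {
      class_id: result["detection_classes"][image_index][i].to_i,
      score: result["detection_scores"][image_index][i],
      box: result["detection_boxes"][image_index][i]
    }
  end
end

detections(result).each do |d|
  puts "class #{d[:class_id]} with score #{d[:score]}"
end
```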

With a bit more code, we can apply boxes and labels to the image.

coco_labels = {
  23 => "bear",
  88 => "teddy bear"
}

def draw_box(img, label, box)
  width, height = img.dimensions

  # calculate box
  thickness = 2
  top = (box[0] * height).round - thickness
  left = (box[1] * width).round - thickness
  bottom = (box[2] * height).round + thickness
  right = (box[3] * width).round + thickness

  # draw box
  img.combine_options do |c|
    # set the drawing options before the draw command that uses them
    c.fill "none"
    c.stroke "red"
    c.strokewidth thickness
    c.draw "rectangle #{left},#{top} #{right},#{bottom}"
  end

  # draw text
  img.combine_options do |c|
    c.fill "red"
    c.pointsize 18
    c.draw "text #{left},#{top - 5} \"#{label}\""
  end
end

result["num_detections"].each_with_index do |n, idx|
  n.to_i.times do |i|
    label = result["detection_classes"][idx][i].to_i
    label = coco_labels[label] || label
    box = result["detection_boxes"][idx][i]
    draw_box(img, label, box)
  end
end

# save image
img.write("labeled.jpg")
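
The box arithmetic in draw_box treats each box as [top, left, bottom, right] in normalized 0 to 1 coordinates. Here is a rough sketch of the same calculation as a standalone function, so it can be checked without an image. The box_to_pixels name is hypothetical, and clamping to the image bounds is an extra step added for illustration that the code above does not perform.

```ruby
# Convert a normalized [top, left, bottom, right] box to pixel coordinates,
# padded by the stroke thickness as in draw_box.
# Clamping to the image bounds is an addition, not part of the original code.
def box_to_pixels(box, width, height, thickness = 2)
  top = (box[0] * height).round - thickness
  left = (box[1] * width).round - thickness
  bottom = (box[2] * height).round + thickness
  right = (box[3] * width).round + thickness

  {
    top: top.clamp(0, height - 1),
    left: left.clamp(0, width - 1),
    bottom: bottom.clamp(0, height - 1),
    right: right.clamp(0, width - 1)
  }
end

# A box in the middle of a 200x100 image
p box_to_pixels([0.25, 0.1, 0.75, 0.9], 200, 100)

# A box covering the whole image gets clamped to the edges
p box_to_pixels([0.0, 0.0, 1.0, 1.0], 200, 100)
```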

And the result:

Bears Labeled

Here’s the complete code. Now go out and try it with your own images!

Updates

Published August 28, 2019




All code examples are public domain.
Use them however you’d like (licensed under CC0).