This post describes the projects I’ve worked on that involve training and deploying object detection models for various applications. It also describes the software tools I’ve used to facilitate the training process, and the technical skills I’ve acquired and improved while working on these projects.

Surgical Instruments

As part of our technical development at RIF Robotics, we are training a vision-based machine learning model to detect and identify surgical instruments. Our goal at RIF is for a robotic manipulator to autonomously manipulate any instrument. Thus, we need to train an image segmentation model (as opposed to a typical object detection model) so that our software can also extract the instrument’s pose. The resulting pose is then fed into our motion planning algorithms for autonomous pick and place.
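To give a rough idea of how a pose can be recovered from a segmentation mask, here is a minimal sketch using image moments: the centroid gives a 2D position and the principal axis gives an in-plane orientation. This is an illustration only, not the RIF implementation. The function name `mask_pose` and the list-of-rows mask format are assumptions made for the example, and a real pipeline would likely also need depth and camera calibration to get a full 3D pose.

```python
import math


def mask_pose(mask):
    """Estimate a 2D pose (cx, cy, angle) from a binary mask.

    `mask` is a list of rows of 0/1 values (hypothetical format).
    The angle is in radians, measured in image coordinates
    (x to the right, y downward), and lies in (-pi/2, pi/2].
    """
    # Collect the (x, y) coordinates of every foreground pixel.
    points = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not points:
        raise ValueError("mask contains no foreground pixels")

    n = len(points)
    # Centroid: the mean position of the foreground pixels.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Central second moments describe how the pixels spread around the centroid.
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n

    # Orientation of the major axis from the second moments.
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, angle
```

For a mask containing a single horizontal bar, the function returns the bar’s centre and an angle of zero; for a vertical bar the angle is a quarter turn. A bounding box alone could not provide this orientation, which is why a pixel-level mask is useful here.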

We have trained an instance segmentation model using Facebook’s Detectron2 library. The following video shows the performance of the initial model in real-time:

Surgical instruments are particularly hard to tell apart because many pairs of instruments share very similar features. As such, we’re working on defining a framework that can handle the thousands of different instrument types.

Stay tuned for future progress!

PyTorch in Docker

Detectron2 is built on top of PyTorch, an open source machine learning framework. We have set up a Docker environment for running PyTorch, which facilitates the training and testing processes.

Feel free to contact me if you’re interested in setting up a similar environment for your project: Consulting Services

Hobby Projects

At the time of writing this post, I am cleaning up the descriptions of my hobby projects. Updates for this section are coming soon.

Software Tools

CVAT

Annotating images is always a laborious and time-consuming process. This is especially true if you’re making annotations for image segmentation models since you have to carefully trace the outline of the objects of interest instead of just dragging a bounding box around the objects.

The Computer Vision Annotation Tool (CVAT) is my tool of choice because of its easy-to-use interface. Moreover, the project provides a Docker environment so that developers don’t have to worry about installation steps.

At RIF, we have written some Python scripts to facilitate the annotation process for image segmentation even further. Instead of dragging your mouse around each object repeatedly, the scripts use several classic computer vision methods to extract the outlines of the objects of interest and generate a good-enough starting point, so that you only have to fix the initial estimates.
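The idea of pre-generating outlines can be sketched with one of the simplest classic techniques: threshold the image, then keep the foreground pixels that touch the background. This is a hedged illustration, not the RIF scripts, which are not described in detail here. The function names `threshold` and `outline_pixels`, the grayscale list-of-rows image format, and the choice of thresholding are all assumptions made for the example.

```python
def threshold(image, level):
    """Turn a grayscale image (list of rows of ints) into a 0/1 mask."""
    return [[1 if value > level else 0 for value in row] for row in image]


def outline_pixels(mask):
    """Return the (x, y) foreground pixels that border the background.

    A pixel is on the outline if any of its 4-connected neighbours is
    background or lies outside the image.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    outline = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue  # background pixels are never part of the outline
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                outside = not (0 <= nx < width and 0 <= ny < height)
                if outside or not mask[ny][nx]:
                    outline.append((x, y))
                    break  # one background neighbour is enough
    return outline
```

In practice a library routine such as OpenCV’s `cv2.findContours` would return ordered contours that are easier to turn into polygon annotations. The sketch only shows why an automatic first pass can replace most of the manual tracing: the annotator adjusts an existing outline instead of drawing it from scratch.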

Feel free to contact me if you’re interested in setting up a similar environment for your project: Consulting Services

FiftyOne

As you annotate images and train your models, you need to be able to visualize your progress. FiftyOne is an open source tool that facilitates the generation and management of your datasets.

The Python scripts we’ve written at RIF also interact with FiftyOne, providing a seamless workflow from image annotation to model training.

Technical Skills

The following summarizes my proficiency in the technical skills I improved on and acquired while working on the projects described in this post:

Python	Expert
Docker	Proficient
CVAT: Image Annotation	Proficient
PyTorch	Competent
TensorFlow	Novice