Object Detection
This post describes the projects I’ve worked on that involve training and deploying object detection models for various applications. It also covers the software tools I’ve used to streamline the training process, and the technical skills I’ve acquired and improved while working on these projects.
Surgical Instruments
As part of our technical development at RIF Robotics, we are training a vision-based machine learning model to detect and identify surgical instruments. Our goal at RIF is for a robotic manipulator to autonomously manipulate any instrument. Thus, we need to train an image segmentation model (as opposed to a typical object detection model, which only outputs bounding boxes) so that our software can also extract the instrument’s pose. The resulting pose is what will be fed into our motion planning algorithms for autonomous pick and place.
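To make the link between a segmentation mask and a pose concrete, here is a minimal sketch that estimates a planar pose (centroid plus in-plane orientation) from a binary mask using image moments. It is an illustration only and not necessarily the approach used in our pipeline. The function name `mask_pose` and the list-of-rows mask format are assumptions made for this example, and a real system would also need depth and camera calibration to obtain a full 3D pose.

```python
import math


def mask_pose(mask):
    """Estimate a planar pose from a binary mask.

    mask: list of rows, each a list of 0/1 values (hypothetical format).
    Returns (cx, cy, theta):
      cx, cy -- centroid in pixel coordinates (x = column, y = row)
      theta  -- orientation of the principal axis in radians, in
                (-pi/2, pi/2], measured from the +x axis toward the +y axis
                (y points down in image coordinates).
    """
    # Collect the coordinates of every foreground pixel.
    points = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not points:
        raise ValueError("mask contains no foreground pixels")

    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Second-order central moments describe how the pixels spread
    # around the centroid.
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n

    # Principal-axis angle from the moments.
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, theta


if __name__ == "__main__":
    # A horizontal bar: 5 pixels wide on row 2 of a 5x7 image.
    bar = [[0] * 7 for _ in range(5)]
    for x in range(1, 6):
        bar[2][x] = 1
    print(mask_pose(bar))
```

The orientation from moments is ambiguous by 180 degrees, so an instrument with a distinct handle and tip would need an extra cue to tell which end is which.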
We have trained an instance segmentation model using Facebook’s Detectron2 library. The following video shows the performance of the initial model in real time:
Surgical instruments are particularly hard to tell apart because many pairs of instruments share very similar features. As such, we’re working on defining a framework that can handle the thousands of different types of instruments.
Stay tuned for future progress!
PyTorch in Docker
Detectron2 is built on top of PyTorch, an open source machine learning framework. We have set up a Docker environment for running PyTorch, which facilitates the training and testing processes.
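A quick sanity check after starting a container like this is to confirm that PyTorch is importable and can see the GPU. The sketch below is one way to do that. The function name `describe_torch_env` is made up for this example, and the script assumes nothing about how the container was built. It reports a missing PyTorch install instead of crashing.

```python
import importlib.util


def describe_torch_env():
    """Summarize whether PyTorch and CUDA are visible to this interpreter."""
    info = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_available": False,
        "gpu_count": 0,
    }
    # Avoid an ImportError if PyTorch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return info

    import torch

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    info["gpu_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If `cuda_available` comes back `False` inside a container on a GPU machine, the usual suspect is that the container was started without GPU access.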
Feel free to contact me if you’re interested in setting up a similar environment for your project: Consulting Services
Hobby Projects
At the time of writing this post, I am cleaning up the descriptions of my hobby projects. Updates for this section are coming soon.
Software Tools
CVAT
Annotating images is always a laborious and time-consuming process. This is especially true if you’re making annotations for image segmentation models since you have to carefully trace the outline of the objects of interest instead of just dragging a bounding box around the objects.
The Computer Vision Annotation Tool (CVAT) is my tool of choice because of its easy-to-use interface. Moreover, it provides a Docker environment so that developers don’t have to worry about installation steps.
At RIF, we have written some Python scripts to facilitate the annotation process for image segmentation even further. Instead of dragging your mouse around each object repeatedly, the scripts apply several classic computer vision methods to extract the outlines of the objects of interest and generate a good-enough starting point, so that you only have to fix the initial estimates.
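To give a flavor of what such a pre-annotation step can look like, here is a minimal sketch. It is not the RIF scripts themselves, and the specific methods are assumptions chosen for the example: a fixed-level threshold to separate a bright object from a dark background, followed by a convex hull (Andrew's monotone chain) to produce an ordered polygon. The names `initial_polygon`, `threshold`, and `convex_hull` are hypothetical, and the image is a plain list of rows of grayscale values so the example needs no extra libraries.

```python
def threshold(image, level):
    """Binarize a grayscale image: 1 where the pixel is brighter than level."""
    return [[1 if value > level else 0 for value in row] for row in image]


def _cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull of 2D points as an ordered list of vertices."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower chain, dropping points that do not make a strict turn.
    lower = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper chain the same way, walking the points in reverse.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def initial_polygon(image, level):
    """Threshold the image and return a hull polygon of (x, y) vertices."""
    mask = threshold(image, level)
    foreground = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    return convex_hull(foreground)


if __name__ == "__main__":
    # A bright 3x3 square on a dark 5x5 background.
    image = [[10] * 5 for _ in range(5)]
    for y in range(1, 4):
        for x in range(1, 4):
            image[y][x] = 200
    print(initial_polygon(image, 128))
```

A convex hull overestimates concave shapes such as open scissors, which is exactly the kind of initial estimate an annotator would then tighten by hand.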
Feel free to contact me if you’re interested in setting up a similar environment for your project: Consulting Services
FiftyOne
As you annotate images and train your models, you need to be able to visualize your progress. FiftyOne is an open source tool that facilitates the generation and management of your datasets.
The Python scripts we’ve written at RIF also interact with FiftyOne, providing a seamless workflow between image annotation and model training.
Technical Skills
The following summarizes my proficiency in the technical skills I improved on and acquired while working on the projects described in this post:
Python
Docker
CVAT: Image Annotation
PyTorch
TensorFlow