Deploy your own SSDLite MobileDet object detector on Google Coral's EdgeTPU using TensorFlow's Object Detection API

Nam Vu
7 min read · Aug 15, 2020

TL;DR:
In this article, you will learn how to create your own SSDLite MobileDet object detection model using TensorFlow's Object Detection API and then deploy it on the EdgeTPU. Yes, I packed all the buzzwords into that one sentence, but here is a bonus: we'll do it all on a Tesla T4 16GB GPU, provided for free by Google in a Colab notebook!! You ready?

Figure 1. Actual evaluation after only ~3000 steps.

Introduction:

  • Coral EdgeTPU is an ASIC chip designed by Google to accelerate ML tasks on edge devices. Using it, I was able to reduce inference latency from ~755 ms to ~8 ms on the final model in this tutorial (that's a ~94x speedup, worddddd).
  • MobileDet is a new object detection backbone architecture that came out back in April 2020. Well, saying that it's new isn't 100% accurate, because the underlying idea has been around much longer. In essence, it is a step back from depth-wise separable convolutions, which were previously thought to be more efficient for mobile processors and edge accelerators like EdgeTPUs and DSPs, in order to experiment with plain regular convolutions again. While I admittedly do not fully understand the paper, the results are quite convincing. In summary, MobileDet+SSDLite outperforms MobileNetV3+SSDLite and MobileNetV2+SSDLite by 1.7 mAP and 1.9 mAP respectively on CPU, and by even more on EdgeTPUs and DSPs: 3.7 mAP and 3.4 mAP respectively [1].
  • Colab Notebook is a research project created by Google that allows users to share code, collaborate and, most importantly, use a free GPU for training. For this tutorial, I'll share my Colab notebook so you can follow along and create your own model.
  • Tesla T4 16GB is the free GPU that comes with the Colab notebook, in case you missed the last bullet :)
  • Tensorboard is a visualization tool that we will use to track the progress of our training. It is what produced the image in Figure 1.

On top of the above technologies, I'll walk you through the entire pipeline of training this model and deploying it on the EdgeTPU.

Figure 2. The basic workflow to create a model for the EdgeTPU (flowchart provided on coral.ai)

The data set that I'm using for this tutorial is the Oxford-IIIT Pet Dataset. It contains thousands of images of 37 different breeds of cats and dogs, along with annotated bounding boxes for each image. However, we'll only use the first 2 classes, the Abyssinian cat and the American bulldog, which is enough for our model to learn to distinguish between a cat and a dog. FYI: this is the same configuration used in Coral's object detection transfer-learning tutorial in Docker.

Demo:

Before we jump right into the tutorial, I want to give a quick demo of the model that we will produce together. Here is a two-step process to run the model over some pet images (assuming you already have Python 3.x and the tflite_runtime package installed):

$ git clone https://github.com/Namburger/edgetpu-ssdlite-mobiledet-retrain && cd edgetpu-ssdlite-mobiledet-retrain
# Run this if you have an EdgeTPU:
$ python3 run_model.py models/ssdlite_mobiledet_dog_vs_cat_edgetpu.tflite test_images
# Run this if you don't have an EdgeTPU:
$ python3 run_model.py models/ssdlite_mobiledet_dog_vs_cat.tflite test_images
Figure 3. A quick demo

Let’s go!

Before we get started, I invite you to visit this Colab notebook so that you can follow along with the tutorial. Here is my GitHub page hosting the Colab:

1. Import TensorFlow and install the Object Detection API as instructed here:
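
Under the hood, the install cell boils down to something like this (a minimal sketch following the TF1 installation instructions for the Object Detection API; the exact commands in the notebook may differ slightly):

$ pip install tensorflow-gpu==1.15
$ git clone --depth 1 https://github.com/tensorflow/models.git
# Compile the protos and install the object_detection package (TF1 flavor).
$ (cd models/research && protoc object_detection/protos/*.proto --python_out=. && cp object_detection/packages/tf1/setup.py . && python3 -m pip install .)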

Note: we are using the TensorFlow 1.x Object Detection API since this is still the recommended version for Coral's workflow.

2. Download and prepare the data set:
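
Roughly, this cell does the following (the wget URLs are the official Oxford-IIIT download links; the class filtering shown here is only an illustration of the idea, the notebook's actual filtering code is a bit more involved):

$ wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
$ wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
$ tar xzf images.tar.gz && tar xzf annotations.tar.gz
# Optional: keep only the first 2 classes (Abyssinian, american_bulldog) in the
# training/validation list so the tfrecord step only sees those images.
$ grep -E "^(Abyssinian|american_bulldog)" annotations/trainval.txt > trainval_filtered.txt
$ mv trainval_filtered.txt annotations/trainval.txt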

Note: here we download the images as well as the annotations for the data set. The second part of the above snippet picks out only the Abyssinian cat and the American bulldog from the full training and validation lists. If you want to train the model on the entire data set, simply skip the second part of this snippet. I conveniently put it in a separate cell in the notebook so you don't have to comment it out.

3. Create tfrecord files from our training data set:
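
The Object Detection API ships with a conversion script for this exact data set, so the cell is essentially the following (paths assume the layout from the previous steps; --num_shards=1 is my simplification, check the notebook for the exact flags):

$ python3 models/research/object_detection/dataset_tools/create_pet_tf_record.py \
    --label_map_path=models/research/object_detection/data/pet_label_map.pbtxt \
    --data_dir=. \
    --output_dir=. \
    --num_shards=1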

Note: tfrecord is TensorFlow's preferred file format for training data. It is basically a structured format built on protocol buffers, which allows the TensorFlow library to serialize and pack all the images into a set of files made for efficient reads.

4. Download the pre-trained SSDLite MobileDet model from the TensorFlow 1 detection model zoo:
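
This is a simple wget from the TF1 detection model zoo. The tarball below is the EdgeTPU variant of SSDLite MobileDet; double-check the exact file name in the zoo, as it may have changed:

$ wget http://download.tensorflow.org/models/object_detection/ssdlite_mobiledet_edgetpu_320x320_coco_2020_05_19.tar.gz
$ tar xzf ssdlite_mobiledet_edgetpu_320x320_coco_2020_05_19.tar.gz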

5. Now we edit the pipeline config to configure the training:

Note: Here you may have noticed that I added a hacky way to grab the GPU info in this cell (lines 6–12). All it does is check what type of GPU was provided. You see, although the Tesla T4 is nice, Colab also has a "Pro" option where you can pay a monthly subscription of $10 to get a much faster GPU, such as the Tesla P100-PCIe. This is completely unnecessary for this tutorial; it just toggles some options, such as lowering the batch size and training for fewer steps, on lines 25 and 26. If you decide to train this model on the entire data set instead of just the first 2 classes, be sure to change num_classes to 37 instead of 2. The important part of this snippet is that it makes the necessary changes to point to our data set and label map, add quantization-aware training, and enable SSDLite. The produced pipeline.config file should look like this:
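
Here is an abridged, illustrative excerpt showing the fields that matter (your paths and exact values will differ depending on where you extracted everything):

model {
  ssd {
    num_classes: 2  # 37 if you train on the full data set
    ...
  }
}
train_config {
  batch_size: 32  # lower this if you run out of GPU memory
  fine_tune_checkpoint: "ssdlite_mobiledet_edgetpu_320x320_coco_2020_05_19/model.ckpt"  # adjust to the checkpoint name inside the tarball
  num_steps: 25000
  ...
}
train_input_reader {
  label_map_path: "models/research/object_detection/data/pet_label_map.pbtxt"
  tf_record_input_reader {
    input_path: "pet_faces_train.record-00000-of-00001"
  }
}
graph_rewriter {
  quantization {
    delay: 2000  # start quantization-aware training after this many steps
    weight_bits: 8
    activation_bits: 8
  }
}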

6. Start tensorboard:
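
The cell exposes Tensorboard through an ngrok tunnel, roughly like this (the training/ log directory and the ngrok download URL are my assumptions of what the notebook uses):

$ wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip && unzip ngrok-stable-linux-amd64.zip
# Point Tensorboard at the training directory and expose port 6006 through ngrok.
$ tensorboard --logdir training/ --host 0.0.0.0 --port 6006 &
$ ./ngrok http 6006 &
# Ask the local ngrok API for the public URL to click on.
$ sleep 3 && curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print('Click on link below to track progress:'); print(json.load(sys.stdin)['tunnels'][0]['public_url'])"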

Note: Tensorboard allows you to track the progress of the training. Since training may take a very long time, it is very nice to know the status at each checkpoint. After starting it, you should get a message like this:

Click on link below to track progress: 
https://967b869b43e1.ngrok.io

You can open it in a new browser tab. After you start training, Tensorboard will show some graphs and evaluation images there:

Figure 4. mAP and AR progression graphs

7. Here we go, let’s start training:
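
Training itself is a single call into the Object Detection API (a sketch; the model_dir and step count are assumptions that match the pipeline config above):

$ python3 models/research/object_detection/model_main.py \
    --pipeline_config_path=pipeline.config \
    --model_dir=training/ \
    --num_train_steps=25000 \
    --sample_1_of_n_eval_examples=1 \
    --alsologtostderr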

That's it, now you can sit back and relax for a few hours. You will see that the training occasionally saves a checkpoint. A part of me believes this is the case because….

Figure 5. A dank meme

Anyhow, in the output of the training, you will see something like this every once in a while:

This is the text representation of the graphs from Figure 4. TensorFlow's Object Detection API evaluates the model after saving each checkpoint and calculates the mAP and AR over different IoU thresholds and area sizes. For instance, IoU=0.50 | area=all corresponds to the mAP at 50% IoU (intersection over union: the overlap between the predicted and ground-truth boxes divided by the area of their union) over all object sizes. Here is a really good article on how to understand mAP, AR, and IoU [2].

8. Training is done, let’s evaluate the model:

  • First, export the checkpoint to an inference graph (a command-line sketch for the first two bullets follows this list):
  • Next, we download a few images to test:
  • Finally, we can run the code:
  • Here is one of the results:
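
For the first two bullets, the cells boil down to something like this (the checkpoint number is a placeholder, use whichever checkpoint your run actually produced):

# Export the latest checkpoint as a frozen inference graph.
$ python3 models/research/object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path=pipeline.config \
    --trained_checkpoint_prefix=training/model.ckpt-25000 \
    --output_directory=inference_graph

The notebook then downloads a few cat/dog photos and runs inference_graph/frozen_inference_graph.pb through the API's visualization utilities to draw boxes on them.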

9. Results look good; let's evaluate the tflite model:

  • Export the checkpoint to a tflite graph (again, a sketch of the first two bullets follows this list):
  • Convert the model to a .tflite model:
  • Evaluate the .tflite model; note that this exact code can be used to run the model on the EdgeTPU (Dev Board or USB Accelerator):
  • Here is another result from the tflite model:
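
For the first two bullets, the commands look roughly like this (the checkpoint number, the 320x320 input shape, and the array names are assumptions based on the standard SSD tflite export; the notebook may use slightly different paths):

# Export a tflite-compatible frozen graph with the SSD post-processing op.
$ python3 models/research/object_detection/export_tflite_ssd_graph.py \
    --pipeline_config_path=pipeline.config \
    --trained_checkpoint_prefix=training/model.ckpt-25000 \
    --output_directory=tflite_graph \
    --add_postprocessing_op=true
# Convert the frozen graph to a fully quantized .tflite model.
$ tflite_convert \
    --graph_def_file=tflite_graph/tflite_graph.pb \
    --output_file=ssdlite_mobiledet_dog_vs_cat.tflite \
    --input_arrays=normalized_input_image_tensor \
    --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
    --input_shapes=1,320,320,3 \
    --inference_type=QUANTIZED_UINT8 \
    --mean_values=128 \
    --std_dev_values=128 \
    --change_concat_input_ranges=false \
    --allow_custom_ops

The evaluation bullet reuses essentially the same run_model.py script from the demo section to sanity-check the quantized model on a few test images.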

10. That’s it, let’s wrap it up and download the model:
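
A sketch of what this cell does (the compiler install lines come from Coral's apt instructions; the file names match the ones used above):

# Install the EdgeTPU compiler from Coral's apt repository.
$ curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
$ echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
$ sudo apt-get update && sudo apt-get install -y edgetpu-compiler
# Compile the quantized model for the EdgeTPU (produces *_edgetpu.tflite).
$ edgetpu_compiler ssdlite_mobiledet_dog_vs_cat.tflite
# Package the models, the latest checkpoint, and the pipeline config into a tarball.
$ tar czf ssdlite_mobiledet_dog_vs_cat.tar.gz ssdlite_mobiledet_dog_vs_cat*.tflite training/model.ckpt-25000* pipeline.config
# The notebook then downloads the tarball from a Python cell:
#   from google.colab import files; files.download('ssdlite_mobiledet_dog_vs_cat.tar.gz')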

The above snippet does 3 things:
1) Install the EdgeTPU compiler and compile the model.
2) Package all the model files, the latest checkpoint, and the pipeline config into a tarball.
3) Download everything to your host machine.

Conclusion:

yes.

References
1. Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen: MobileDets: Searching for Object Detection Architectures for Mobile Accelerators. CoRR abs/2004.14525 (2020).
2. Jonathan Hui: mAP (mean Average Precision) for Object Detection.
