Solution 1 had many limitations, chief among them that it is not scalable. No city would task its workers with eyeballing trash on every street, every day. Thus I considered “Solution 2”, which leverages machine learning to classify images from a camera feed. That solution was attractive because it is doable and cost-effective with commercial-off-the-shelf components. However, Solution 2 has two important limitations: it can neither identify multiple trash items in an image nor distinguish between different kinds of trash. Thus, I considered “Solution 3”, an object detection system that is trained to identify multiple trash types per image.
Here is a blueprint for how I sought to prepare and use my machine learning model:
The following excerpts are from my Google Colab Notebook.
First, I installed the TensorFlow Object Detection library in Colab.
Then I downloaded the TACO dataset, which contained thousands of labeled trash images in COCO format.
Next, I converted the annotations from the COCO JSON format to a CSV format appropriate for TensorFlow object detectors. Then I created a data frame that contains the filename, class (‘Trash’), and bounding box coordinates.
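A minimal sketch of what such a conversion can look like, using only the Python standard library rather than a data frame. It assumes the standard COCO layout, where each annotation stores its box as `[x, y, width, height]` in pixels and points to its image through `image_id`. The function names `coco_to_rows` and `rows_to_csv` and the exact column names are hypothetical, chosen for illustration.

```python
import csv
import io

# Column names are an assumption for this sketch.
FIELDS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def coco_to_rows(coco, label="Trash"):
    """Flatten a parsed COCO dict into one row per bounding box."""
    images = {img["id"]: img for img in coco["images"]}
    rows = []
    for ann in coco["annotations"]:
        img = images[ann["image_id"]]
        # COCO boxes are [top-left x, top-left y, width, height].
        x, y, w, h = ann["bbox"]
        rows.append({
            "filename": img["file_name"],
            "width": img["width"],
            "height": img["height"],
            "class": label,  # every box is collapsed into a single class
            "xmin": int(round(x)),
            "ymin": int(round(y)),
            "xmax": int(round(x + w)),
            "ymax": int(round(y + h)),
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

The same list of rows could be passed to `pandas.DataFrame(rows)` to obtain a data frame with these columns.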
Then I shuffled the data and split it to randomly assign images to train, test, and validate groups.
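A shuffle-and-split step of this kind can be sketched as follows. The 80/10/10 ratios, the fixed seed, and the function name `split_filenames` are assumptions made for illustration. The sketch splits by image filename rather than by row, so that all boxes from one image land in the same group.

```python
import random


def split_filenames(filenames, train=0.8, validate=0.1, seed=42):
    """Randomly assign unique filenames to train/validate/test groups."""
    unique = sorted(set(filenames))  # sort first so the shuffle is reproducible
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_train = int(len(unique) * train)
    n_val = int(len(unique) * validate)
    return {
        "train": unique[:n_train],
        "validate": unique[n_train:n_train + n_val],
        # Whatever remains after train and validate goes to test.
        "test": unique[n_train + n_val:],
    }
```

Each CSV row can then be routed to a group by looking up its filename in the returned lists.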
I configured an efficientDet_d0_512x512 object detector with pre-trained weights from the TensorFlow library. This helps cut down on training time because the base layers of the model have already learned to detect various objects. The task here is to ‘fine tune’ that base model with examples of trash so that it learns to detect my examples as well.
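Fine-tuning with the TensorFlow Object Detection API also needs a label map that ties each class name to an integer id. A minimal sketch of generating one for the single ‘Trash’ class is below. The function name `label_map_pbtxt` is hypothetical, and whether the notebook built its label map this way is an assumption.

```python
def label_map_pbtxt(class_names):
    """Build label map text in the pbtxt layout used by the Object Detection API."""
    items = []
    # Ids start at 1 because 0 is reserved for the background class.
    for idx, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))
    return "\n".join(items)


if __name__ == "__main__":
    print(label_map_pbtxt(["Trash"]))
```

The resulting text would be saved to a file whose path is referenced from the model's pipeline configuration, alongside the number of classes and the path to the pre-trained checkpoint.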
I then trained the model and monitored its total_loss to confirm that it was training properly. Note that the loss decreases as the model continues to run through training iterations.
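Because per-step loss values are noisy, it can help to smooth them before judging the trend. The sketch below is one hedged way to do that with a trailing moving average. The function names and the window size are assumptions, and the caller is expected to supply loss values read from the training log.

```python
def moving_average(values, window):
    """Trailing moving average; early entries average over fewer points."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def is_trending_down(losses, window=3):
    """Return True if the smoothed loss ends lower than it started."""
    smoothed = moving_average(losses, window)
    return smoothed[-1] < smoothed[0]
```

A result of `False` after many iterations would suggest revisiting the learning rate or the input data before training further.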
I exported the model weights so that they can be reloaded later as needed. This is important because otherwise I would have to retrain the model each time I wanted to run object detection inference, and training this model took 1.5 hours.
I also confirmed that I could reliably reload my exported model as follows:
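A minimal sketch of such a reload, assuming the model was exported with the Object Detection API's `exporter_main_v2.py` script, which writes a `saved_model` subdirectory inside the export directory. The helper names `saved_model_dir` and `load_detector` are hypothetical. TensorFlow is imported inside the function so the path helper can be used without it.

```python
import os


def saved_model_dir(export_dir):
    """Path of the SavedModel inside an export directory (assumed layout)."""
    return os.path.join(export_dir, "saved_model")


def load_detector(export_dir):
    """Reload the exported detector as a callable object."""
    import tensorflow as tf  # imported lazily; requires TensorFlow 2.x

    return tf.saved_model.load(saved_model_dir(export_dir))


# Usage (assumes an export directory named "exported_model" exists):
#   detect_fn = load_detector("exported_model")
#   detections = detect_fn(input_tensor)  # uint8 tensor of shape [1, H, W, 3]
```

If the load succeeds and the returned object can be called on an image tensor, the export can be treated as usable without retraining.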
Eager to see my results, I used a script that feeds the model a new picture it had never seen and visualizes its detections:
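Before drawing, a script of this kind usually discards low-confidence detections and converts the box coordinates to pixels. The sketch below assumes the output convention of TensorFlow Object Detection API models, where each box is `[ymin, xmin, ymax, xmax]` normalized to the range 0 to 1. The function name `filter_detections` and the 0.5 threshold are assumptions for illustration.

```python
def filter_detections(boxes, scores, image_width, image_height, min_score=0.5):
    """Keep confident detections and scale normalized boxes to pixel units."""
    results = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue  # skip detections the model is unsure about
        results.append({
            "score": score,
            "xmin": int(xmin * image_width),
            "ymin": int(ymin * image_height),
            "xmax": int(xmax * image_width),
            "ymax": int(ymax * image_height),
        })
    return results
```

The returned pixel boxes can be drawn on the image with any plotting or imaging library.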
Here are some example outputs from this step: