Vehicle recognizer edge model
Computer vision applications for motor vehicles are becoming a part of daily life. Examples include systems that monitor parking space availability and payments, toll collection, security and tracking, and smart cameras that warn the driver.
In this particular case, our task was to train a model that determines whether two given images show the same car. The model also has to run on constrained hardware and be customized for the domain (specific dataset, sensors and lenses).
The problem
Training data is available as a massive amount of compressed video footage from diverse locations, times of day, seasons, traffic conditions and so on.
The available hardware includes training machines as well as the edge devices and cameras that recorded the videos in the dataset.
Core requirements:
- Model is fast to train, fast to execute and reasonably accurate (90%+ over 10+ images)
- System runs on the edge device and embedded camera
- System compares images efficiently by computing embedding (“measurement”) vectors and managing a run-time database
- System tracks each vehicle appearance as a frame sequence and assigns known/unknown scores to those sequences using many-to-many matches for higher quality
- Quick to build: the PoC is made with as many off-the-shelf components as possible
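To make the embedding comparison and many-to-many matching requirements more concrete, here is a minimal sketch. It is not the deployed system: the `EmbeddingStore` class, the cosine similarity metric, the "best match per frame, then average" scoring rule and the 0.8 threshold are all assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(vectors):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    vectors = np.asarray(vectors, dtype=np.float32)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


class EmbeddingStore:
    """Hypothetical in-memory stand-in for the run-time embedding database."""

    def __init__(self):
        # Maps a vehicle id to an array of shape (n_frames, dim).
        self._tracks = {}

    def add(self, vehicle_id, embeddings):
        self._tracks[vehicle_id] = l2_normalize(embeddings)

    def match(self, query, threshold=0.8):
        """Compare every query frame with every stored frame of each vehicle.

        Returns (vehicle_id, score). vehicle_id is None when the best score
        stays below the (assumed) threshold, meaning the vehicle is unknown.
        """
        query = l2_normalize(query)
        best_id, best_score = None, -1.0
        for vehicle_id, stored in self._tracks.items():
            sims = query @ stored.T  # shape (n_query_frames, n_stored_frames)
            # Many-to-many: best stored match per query frame, then the mean.
            score = float(sims.max(axis=1).mean())
            if score > best_score:
                best_id, best_score = vehicle_id, score
        if best_score < threshold:
            return None, best_score
        return best_id, best_score


store = EmbeddingStore()
store.add("car_a", [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]])
store.add("car_b", [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]])
known_id, known_score = store.match([[1.0, 0.05, 0.0], [0.95, 0.0, 0.05]])
unknown_id, unknown_score = store.match([[0.0, 0.0, 1.0]])
```

Averaging over several frames of a track makes the known/unknown decision less sensitive to a single blurry or occluded frame, which is the motivation for matching sequences rather than single images.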
The solution
As the first step, the training data was processed with an object detection model (YOLO V9, hosted on multiple GPUs under Nvidia’s Triton inference server) and OpenCV tracking to isolate images in the form of triplets: two of the same car and one of a different car. About 50,000 triplets were then selected, cleaned and packaged for training.
The recognizer model was built by adding several fully-connected layers on top of a headless Google MobileNet V3 network. The model was trained for several hundred epochs with the triplet loss function, and good convergence was observed.
Both the YOLO and recognizer models were then quantized and compiled for the edge device (Nvidia Jetson Nano). Together with the VectorDB vector database (which stores the embeddings), the system was deployed to the actual hardware and achieved the expected performance. There were some unforeseen challenges along the way, and an upcoming series of blog posts on this site will cover this work.
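As a rough illustration of how triplets can be drawn once detection and tracking have grouped crops by track, consider the sketch below. The `sample_triplets` function and the `tracks` layout (a track id mapped to a list of crop file names) are hypothetical. The sketch also assumes that two different tracks never show the same car, which real footage does not guarantee and which is one reason a cleaning step is needed.

```python
import random


def sample_triplets(tracks, n_triplets, seed=0):
    """Draw (anchor, positive, negative) triplets from tracked crops.

    tracks: dict mapping a track id to a list of crop identifiers.
    Anchor and positive come from the same track, negative from another one.
    """
    rng = random.Random(seed)
    # A track needs at least two crops to supply an anchor and a positive.
    usable = [tid for tid, crops in tracks.items() if len(crops) >= 2]
    non_empty = [tid for tid, crops in tracks.items() if crops]
    if not usable or len(non_empty) < 2:
        raise ValueError("need two non-empty tracks, one with two or more crops")

    triplets = []
    for _ in range(n_triplets):
        pos_id = rng.choice(usable)
        neg_id = rng.choice([tid for tid in non_empty if tid != pos_id])
        anchor, positive = rng.sample(tracks[pos_id], 2)
        negative = rng.choice(tracks[neg_id])
        triplets.append((anchor, positive, negative))
    return triplets


example_tracks = {
    "track_1": ["a1.jpg", "a2.jpg", "a3.jpg"],
    "track_2": ["b1.jpg", "b2.jpg"],
    "track_3": ["c1.jpg"],
}
example_triplets = sample_triplets(example_tracks, n_triplets=20)
```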
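For readers unfamiliar with the triplet loss, a minimal NumPy sketch follows. It uses the common formulation with squared Euclidean distance and a margin. The margin of 0.2 and the choice of distance are assumptions, not details taken from the project.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean triplet loss over a batch of embeddings with shape (batch, dim).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least the margin (in squared Euclidean distance).
    """
    anchor = np.asarray(anchor, dtype=np.float64)
    positive = np.asarray(positive, dtype=np.float64)
    negative = np.asarray(negative, dtype=np.float64)
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


# Well-separated triplet: the positive is close, the negative is far.
easy = triplet_loss([[0.0, 0.0]], [[0.0, 0.1]], [[1.0, 0.0]])
# Violating triplet: the negative is closer to the anchor than the positive.
hard = triplet_loss([[0.0, 0.0]], [[1.0, 0.0]], [[0.0, 0.1]])
```

Minimizing this quantity pulls embeddings of the same car together and pushes embeddings of different cars apart, which is what makes a simple distance comparison usable at run time.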
Disclaimer: due to the proprietary nature of the work done for customers and employers, the case studies are merely inspired by that work, are presented at a very high level, and some sensitive details have been changed or omitted.