
Vehicle recognizer edge model

Computer vision applications involving motor vehicles are increasingly part of daily life. Examples include systems that manage parking space availability and payments, toll collection, security and tracking, or even a smart camera that warns the driver.

In this particular case, our task was to train a model that can determine whether two given images are of the same car. Of course, the model has to run on constrained hardware and be customized for the domain (specific dataset, sensors and lenses).

The problem

Training data is available as a massive amount of compressed video footage from diverse locations, times of day, seasons, traffic conditions and so on.

The available hardware includes the training machines as well as the edge devices and cameras that recorded the videos in the dataset.

Core requirements:

Model is fast to train, fast to execute and reasonably accurate (90%+ over 10+ images)
System runs on the edge device and embedded camera
System compares images efficiently by computing embedding (“measurement”) vectors and managing a run-time database
System tracks vehicle appearances as frame sequences and assigns known/unknown scores to them, using many-to-many matches for higher quality
Quick to build / PoC made with as many off-the-shelf components as possible
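
To make the embedding comparison and many-to-many matching requirements more concrete, here is a minimal sketch in plain Python. It is an illustration, not the deployed code: the function names, the use of cosine similarity, the averaging over all frame pairs and the 0.8 threshold are all assumptions chosen for the example, and a simple dictionary stands in for the run-time database.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def track_match_score(track_a, track_b):
    """Many-to-many score: mean similarity over every frame pair of two tracks."""
    scores = [cosine_similarity(a, b) for a in track_a for b in track_b]
    return sum(scores) / len(scores)


def classify_track(track, database, threshold=0.8):
    """Return (vehicle_id, score) for the best match, or (None, score) if unknown.

    `database` maps a vehicle id to the list of embeddings stored for it.
    The threshold is a hypothetical value that would need tuning on real data.
    """
    best_id, best_score = None, -1.0
    for vehicle_id, stored in database.items():
        score = track_match_score(track, stored)
        if score > best_score:
            best_id, best_score = vehicle_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score
```

Averaging over all frame pairs means a single blurred or occluded frame has less influence on the known/unknown decision than it would in a one-to-one comparison.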

The solution

As the first step, the training data was processed using an object detection model (YOLO V9, hosted on multiple GPUs under Nvidia’s Triton Inference Server) and OpenCV tracking to isolate images in the form of triplets: two of the same car and one of a different car. About 50,000 triplets were then selected, cleaned and packaged for training.
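
The triplet selection step can be sketched as follows. This is a simplified illustration under stated assumptions: the detector and tracker are assumed to have already produced a mapping from a track id to the list of crop file names for that track, and distinct tracks are assumed to show distinct cars (which real footage does not guarantee, hence the cleaning step). The function name `make_triplets` and its parameters are hypothetical.

```python
import random


def make_triplets(tracks, per_track=1, seed=0):
    """Build (anchor, positive, negative) triplets from tracked crops.

    `tracks` maps a track id to a list of crop identifiers. Anchor and
    positive come from the same track; the negative comes from another track.
    """
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    triplets = []
    for track_id, crops in tracks.items():
        if len(crops) < 2:
            continue  # need two distinct crops of the same car
        others = [t for t in tracks if t != track_id and tracks[t]]
        if not others:
            continue  # no other track to draw a negative from
        for _ in range(per_track):
            anchor, positive = rng.sample(crops, 2)
            negative = rng.choice(tracks[rng.choice(others)])
            triplets.append((anchor, positive, negative))
    return triplets
```

For example, `make_triplets({"t1": ["a1", "a2"], "t2": ["b1", "b2"]})` yields one triplet per track, each pairing two crops from one track with a crop from the other.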

The recognizer model was built by adding several fully-connected layers on top of Google's headless MobileNet V3 network. The model was trained for several hundred epochs with the triplet loss function, and good convergence was observed.
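
For readers unfamiliar with the triplet loss, here is a minimal plain-Python sketch of the commonly used form, max(0, d(a, p) - d(a, n) + margin). The squared Euclidean distance and the margin of 0.2 are assumptions made for illustration; the distance and margin used in the actual training are not stated here.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin`; otherwise it grows linearly.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

Minimizing this value over many triplets pulls embeddings of the same car together and pushes embeddings of different cars apart, which is what makes the later distance-based comparison meaningful.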

Both the YOLO and recognizer models were then quantized and compiled for the edge device (Nvidia Jetson Nano). Together with the VectorDB vector database (which stores the embeddings), the system was deployed to the actual hardware and achieved the expected performance. There were some unforeseen challenges along the way, and an upcoming series of blog posts on this site will cover this work.
