
Vehicle recognizer model

Vehicle recognizer edge model

Computer vision applications involving motor vehicles are becoming more and more common in daily life. Examples include systems that monitor parking space availability and payments, toll collection, security and tracking, and even smart cameras that warn the driver.

In this particular case, our task was to train a model that can determine whether two given images show the same car. The model also has to run on constrained hardware and be customized for the domain (specific dataset, sensors and lenses).

The problem

Training data is available as a massive amount of compressed video footage covering diverse locations, times of day, seasons, traffic conditions and so on.

The available hardware includes both the training machines and the edge devices and cameras that recorded the videos in the dataset.

Core requirements:

  • The model is fast to train, fast to execute and reasonably accurate (90%+ over 10+ images)
  • The system runs on the edge device and embedded camera
  • The system compares images efficiently by computing embedding (“measurement”) vectors and managing a run-time database
  • The system tracks vehicle appearances as frame sequences and assigns known/unknown scores to them using many-to-many matches for higher quality
  • Quick to build: the PoC is made with as many off-the-shelf components as possible
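The embedding comparison and many-to-many matching described in the requirements can be sketched roughly as follows. This is a minimal illustration, not the production code: the function names, the use of cosine similarity, the "mean of best matches" aggregation and the 0.8 threshold are all assumptions made for the example.

```python
import math
from statistics import mean


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def track_match_score(track_a, track_b):
    """Many-to-many score between two frame sequences.

    For every embedding in track_a, take its best similarity against
    any embedding in track_b, then average those best values.
    """
    if not track_a or not track_b:
        return 0.0
    best = [max(cosine_similarity(a, b) for b in track_b) for a in track_a]
    return mean(best)


def classify_track(track, known_tracks, threshold=0.8):
    """Return (vehicle_id, score), or ("unknown", score) below the threshold.

    known_tracks is a hypothetical stand-in for the run-time database:
    a dict mapping a vehicle id to its stored embeddings.
    The threshold value is illustrative, not taken from the project.
    """
    best_id, best_score = None, 0.0
    for vehicle_id, embeddings in known_tracks.items():
        score = track_match_score(track, embeddings)
        if score > best_score:
            best_id, best_score = vehicle_id, score
    if best_id is not None and best_score >= threshold:
        return best_id, best_score
    return "unknown", best_score
```

Comparing whole frame sequences instead of single images lets one blurry or occluded frame be outvoted by the rest of the track, which is one plausible reading of "many-to-many matches for higher quality".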

The solution

As the first step, the training data was processed with an object detection model (YOLO V9, hosted on multiple GPUs under Nvidia’s Triton Inference Server) and OpenCV tracking to isolate images in the form of triplets: two of the same car and one of a different car. About 50,000 triplets were then selected, cleaned and packaged for training.
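As a rough sketch of the triplet-building step, suppose detection and tracking have already produced a mapping from track id to the image crops of that track. The function below and its input layout are hypothetical. It also assumes that different track ids belong to different cars, which is not guaranteed in real footage and is one reason a cleaning pass is needed afterwards.

```python
import random


def build_triplets(tracks, per_track=2, seed=0):
    """Build (anchor, positive, negative) triplets from tracked crops.

    tracks: dict mapping a track id to a list of crop identifiers
            (for example file paths), as produced by detection + tracking.
    Anchor and positive come from the same track; the negative is
    drawn from a different track.
    """
    rng = random.Random(seed)
    triplets = []
    for track_id, crops in tracks.items():
        if len(crops) < 2:
            continue  # need two distinct crops of the same car
        others = [t for t, c in tracks.items() if t != track_id and c]
        if not others:
            continue  # no other car available to act as the negative
        for _ in range(per_track):
            anchor, positive = rng.sample(crops, 2)
            negative = rng.choice(tracks[rng.choice(others)])
            triplets.append((anchor, positive, negative))
    return triplets
```

A track with a single crop cannot supply an anchor/positive pair, but it can still serve as a source of negatives for other tracks.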

The recognizer model was built by adding several fully-connected layers on top of a headless Google MobileNet V3 network. It was trained for several hundred epochs with the triplet loss function, and good convergence was observed.
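For readers unfamiliar with triplet loss, here is one common formulation written out in plain Python. The Euclidean distance and the 0.2 margin are assumptions chosen for illustration; the distance metric and margin actually used in the project are not stated, and some variants use squared distances instead.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single triplet.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

Minimizing this loss over many triplets pulls embeddings of the same car together and pushes embeddings of different cars apart, which is what makes a simple distance comparison usable at inference time.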

Both the YOLO and recognizer models were then quantized and compiled for the edge device (Nvidia Jetson Nano). Together with the VectorDB vector database (which stores the embeddings), the system was deployed to the actual hardware and achieved the expected performance. There were some unforeseen challenges along the way, and an upcoming series of blog posts on this site will cover this work.
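To give a feel for what quantization does to a model's weights, the sketch below shows symmetric per-tensor int8 quantization in plain Python. It only illustrates the general idea. The scheme, calibration and toolchain actually used for the Jetson Nano are not described here, and the function names are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127].

    Returns the integer values and the scale needed to map them back.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate floats."""
    return [q * scale for q in quantized]
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per value, which is the trade-off that makes such models practical on constrained hardware.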

Disclaimer: due to the proprietary nature of work done for customers and employers, the case studies are merely inspired by that work and presented at a very high level, and some sensitive details have been changed or omitted.


Privacy policy | © 2024 Torola. All rights reserved.