Naviground

Perception system for autonomous vehicles

Naviground is a navigation system that can be deployed on manned and unmanned ground vehicles. It supports navigation in both structured and unstructured environments. I worked on the development of the perception system, mainly on detecting the environment with cameras.

Vision system

Although the navigation system had LIDAR and RADAR sensors, we wanted a perception system based only on cameras, for several reasons:

  • Although the price of LIDAR and RADAR sensors has dropped a lot in recent years, they are still more expensive than cameras.
  • LIDAR and RADAR are active sensors (they emit an electromagnetic wave and measure its reflection), so in a military environment they make the vehicle detectable.
  • On an autonomous vehicle, the processing cannot run on a very powerful machine, so it is better to avoid processing the large amount of data that LIDAR and RADAR generate.

To detect the environment, we used three types of neural networks:

  • Semantic segmentation networks

    They assign a class to every pixel of the image, producing a segmentation mask.

    Semantic segmentation
  • Object classification networks

    They detect and classify the objects in the image. We used a YOLO model.

    Object classification with YOLO
  • Depth estimation

    A neural network can estimate the depth of each pixel of the image, so we can obtain the distance to each object.

    Depth estimation
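
As a rough illustration of how a detection and a depth map can be combined into a distance per object, here is a minimal sketch. It assumes the depth network outputs metric depth in metres and that boxes are given as (x_min, y_min, x_max, y_max) in pixels. The function name `object_distance` and the use of the median are assumptions for this example, not the project's actual code.

```python
from statistics import median


def object_distance(depth_map, box):
    """Estimate the distance to one detected object.

    depth_map: 2D list (rows of floats) with the estimated depth of each pixel.
    box: (x_min, y_min, x_max, y_max) in pixel coordinates, max values exclusive.
    The median is used so that background pixels inside the box
    have less influence than they would with a mean.
    """
    x_min, y_min, x_max, y_max = box
    values = [
        depth_map[y][x]
        for y in range(y_min, y_max)
        for x in range(x_min, x_max)
    ]
    if not values:
        raise ValueError("bounding box is empty")
    return median(values)


# Toy 4x4 depth map: background at 20 m, one object at about 5 m.
depth_map = [
    [20.0, 20.0, 20.0, 20.0],
    [20.0, 5.0, 5.2, 20.0],
    [20.0, 4.8, 5.0, 20.0],
    [20.0, 20.0, 20.0, 20.0],
]
distance = object_distance(depth_map, (1, 1, 3, 3))
```

The segmentation mask could be used in the same way to keep only the pixels that belong to the object before taking the median.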

Training

Because the vehicle had to work in both structured and unstructured environments, the pre-trained networks did not suit us, so we had to train the segmentation and object classification networks ourselves.

Dataset

We had hours of video recorded during tests in environments like this one, so we used them to create a dataset.

Videos of Naviground

We wrote an algorithm that used an unsupervised classifier to group the images into clusters, so that the images within each cluster were similar to each other. We then kept only a few images from each cluster, which gave us a dataset of heterogeneous images.
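
The idea can be sketched as follows. This is only an illustration: it assumes each image has already been turned into a feature vector (for example by a pretrained encoder) and it uses a small k-means as the unsupervised step. The text does not say which classifier was actually used, and the names `kmeans` and `sample_per_cluster` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iterations=20):
    """Tiny k-means; returns the cluster index of each feature vector."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centroids = [list(features[0])]
    while len(centroids) < k:
        farthest = max(features, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(farthest))
    assignments = [0] * len(features)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: sq_dist(f, centroids[c])) for f in features
        ]
        # Move every centroid to the mean of its members.
        for c in range(k):
            members = [f for f, a in zip(features, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def sample_per_cluster(names, assignments, per_cluster, seed=0):
    """Keep at most `per_cluster` randomly chosen images from each cluster."""
    rng = random.Random(seed)
    clusters = {}
    for name, a in zip(names, assignments):
        clusters.setdefault(a, []).append(name)
    selected = []
    for members in clusters.values():
        selected.extend(rng.sample(members, min(per_cluster, len(members))))
    return selected


# Toy example: six frames whose features form two obvious groups.
names = ["a.png", "b.png", "c.png", "d.png", "e.png", "f.png"]
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
assignments = kmeans(features, k=2)
dataset = sample_per_cluster(names, assignments, per_cluster=1)
```

With consecutive video frames, most images in a cluster are near duplicates, so keeping a few per cluster cuts the labeling work while preserving variety.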

Labeler

Labeling objects for YOLO is tedious, but it is a relatively fast and easy process.

YOLO labeling

However, labeling images for semantic segmentation, where every pixel has to be labeled, is a slow and tedious process. As none of the existing segmentation labeling tools convinced us, we built our own. It worked so well that it was reused in other projects, and there was even talk of commercializing it.

Training images generation

One of the problems we had was that all the training images had been taken during the day, in sunny weather, without rain, and so on. To make the networks more robust we needed more varied images. But that meant someone had to go out at night, wait for rain to capture rainy images, wait for snow (which is even harder), and so on.

At that time there were already many good image generation networks, so we could generate images with new environmental conditions. The problem was that the generated images had to be labeled, and for segmentation that takes a lot of time.

So I built a pipeline that used generative AI to change the environmental conditions of the images we had already labeled. This gave us images in different environmental conditions without having to spend time labeling them.
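
The key point is that the existing labels are reused: if the generative model changes only the appearance of the scene and not its layout, the original mask is still valid for the new image. Here is a minimal sketch of that idea. The generative model is replaced by a trivial `darken` placeholder, and all names are hypothetical.

```python
def darken(image, percent=30):
    """Placeholder for a generative model: keeps `percent` % of each
    grayscale pixel value to imitate a night scene. A real pipeline would
    call an image-to-image model here instead."""
    return [[p * percent // 100 for p in row] for row in image]


def augment_labeled_dataset(samples, restylers):
    """Create new training samples that reuse the existing labels.

    samples: list of (image, mask) pairs, both as 2D lists of the same size.
    restylers: dict mapping a condition name to a callable image -> image.
    Returns a list of (condition, new_image, mask) tuples.
    """
    augmented = []
    for image, mask in samples:
        for condition, restyle in restylers.items():
            new_image = restyle(image)
            # The mask is only reusable if the image size is unchanged.
            same_size = len(new_image) == len(image) and all(
                len(new_row) == len(row) for new_row, row in zip(new_image, image)
            )
            if not same_size:
                raise ValueError("restyled image must keep the original size")
            augmented.append((condition, new_image, mask))
    return augmented


image = [[100, 200], [50, 0]]
mask = [[0, 1], [1, 0]]
result = augment_labeled_dataset([(image, mask)], {"night": darken})
```

The size check only catches the most obvious problem. Whether a given generative model really preserves object boundaries has to be verified on real images.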

Optimization with TensorRT

As this had to run on a vehicle, it could not use a powerful computer, so an embedded device, a Jetson Orin, was used. This made it important to optimize the neural networks so that inference was as fast as possible.

I optimized them with TensorRT, which made them run up to 40% faster.
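
A speed-up like this is usually measured by timing the same inference call before and after optimization. The following sketch shows one way to do it. The `infer` callable is hypothetical and is assumed to block until the result is ready (on a GPU this requires synchronizing before the timer stops). The text does not say how the 40% figure was computed, so both common readings are shown.

```python
import time
from statistics import median


def median_latency_ms(infer, warmup=10, runs=100):
    """Median time of one call to `infer`, in milliseconds."""
    for _ in range(warmup):
        infer()  # warm-up calls are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)


def throughput_gain_percent(baseline_ms, optimized_ms):
    """How many percent more inferences per second the optimized model does."""
    return (baseline_ms / optimized_ms - 1.0) * 100.0


def latency_reduction_percent(baseline_ms, optimized_ms):
    """How many percent less time one inference takes."""
    return (1.0 - optimized_ms / baseline_ms) * 100.0


# Illustrative numbers only, not measurements from the project:
# going from 14 ms to 10 ms per frame is a 40% throughput gain,
# but only about a 28.6% latency reduction.
gain = throughput_gain_percent(14.0, 10.0)
reduction = latency_reduction_percent(14.0, 10.0)
```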

ASCOD
Naviground system
