Armando Mateus Week 13

Week 13 - PilotNet Deployment and Real-Time Performance Evaluation

December 23, 2025

Comparative analysis of PilotNet inference performance and real-world autonomous driving behavior

This week focused on deploying the PilotNet architecture using the enhanced dataset from Week 12 and evaluating its performance in both offline and online testing environments. The work included training and benchmarking Keras (h5) and TensorFlow Lite (tflite) models, followed by real-time autonomous driving tests to identify behavioral challenges and define future dataset augmentation strategies.

Figure 1: Week 12 enhanced dataset composition for steering across 7 categories.

1. Model Training and Deployment:
Using the enhanced dataset from Week 12 (Figure 1), the PilotNet architecture was implemented in both Keras (h5) and TensorFlow Lite (tflite) formats. A dedicated Python environment was set up to install the required packages and ensure compatibility across frameworks. The training process successfully produced both model formats, which were subsequently evaluated for inference performance.

2. Inference Time Benchmarking:
Inference times were measured for each model variant. The complete control loop (camera capture, inference, and control action) must finish within 50ms, since a 20Hz execution rate leaves 1/20 of a second per iteration. The results are summarized below, with MobileNet (PyTorch) included as a reference:

INFERENCE TIME COMPARISON:

MobileNet - PyTorch Model (pth)

• Minimum inference time: 15ms

• Maximum inference time: 32ms

• Average inference time: 18ms

PilotNet - Keras Model (h5)

• Minimum inference time: 150ms

• Maximum inference time: 315ms

• Average inference time: 240ms

PilotNet - TensorFlow Lite Model (tflite)

• Minimum inference time: 1ms

• Maximum inference time: 2ms

• Average inference time: 2ms
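
As an illustration of how minimum, maximum, and average figures like these could be collected, here is a minimal timing sketch using only the Python standard library. It is not the benchmark script used for this log: `benchmark`, `dummy_predict`, the warm-up count, and the number of runs are all hypothetical, and `dummy_predict` only stands in for a real model call.

```python
import statistics
import time


def benchmark(predict, sample, runs=100, warmup=10):
    """Time predict(sample) and return min/max/mean latency in milliseconds."""
    # Warm-up calls are excluded so one-off initialisation does not skew the stats.
    for _ in range(warmup):
        predict(sample)

    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "min_ms": min(latencies_ms),
        "max_ms": max(latencies_ms),
        "mean_ms": statistics.mean(latencies_ms),
    }


def dummy_predict(frame):
    # Hypothetical stand-in for a model call; replace with the real inference function.
    return sum(frame) / len(frame)


stats = benchmark(dummy_predict, [0.0] * 1000, runs=50)
print(stats)
```

Timing only the `predict` call isolates inference from camera capture and control, which would have to be measured separately to check the full loop.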

The TensorFlow Lite model was selected for real-time deployment because its 2ms average inference time fits comfortably within the 50ms loop budget required for 20Hz, whereas the Keras model, at 240ms on average, exceeds that budget nearly fivefold.
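
The budget reasoning can be made explicit with a short calculation. The sketch below uses the average inference times reported above. The helper names are hypothetical, and `other_stages_ms` (camera capture plus control action) defaults to 0 only because those timings are not reported in this log.

```python
def loop_budget_ms(rate_hz):
    """Time available per control-loop iteration at the given rate."""
    return 1000.0 / rate_hz


def fits_budget(inference_ms, rate_hz=20.0, other_stages_ms=0.0):
    """True if inference plus the other loop stages fits in one period."""
    return inference_ms + other_stages_ms <= loop_budget_ms(rate_hz)


# Average inference times reported above, in milliseconds.
average_ms = {
    "MobileNet (pth)": 18,
    "PilotNet Keras (h5)": 240,
    "PilotNet TFLite (tflite)": 2,
}

for name, ms in average_ms.items():
    verdict = "fits" if fits_budget(ms) else "exceeds"
    print(f"{name}: {ms}ms {verdict} the {loop_budget_ms(20.0):.0f}ms budget")
```

Passing a non-zero `other_stages_ms` shows how much headroom each model leaves for capture and control once those stages are measured.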

3. Offline Testing Results:
Offline verification tests showed identical behavior between the h5 and tflite PilotNet models, as illustrated in Figure 2. For reference, the MobileNet performance under the same conditions is shown in Figure 3.


Figure 2: Expert driving vs autonomous driving for enhanced dataset of week 12: autonomous driving based on PilotNet with same results for H5 and tflite models.

Figure 3: Expert driving vs autonomous driving for enhanced dataset of week 12: autonomous driving based on MobileNet.
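
One way to put a number on "identical behavior" is to compare the two models' steering outputs element by element, and to compare each against the expert signal. The following is a minimal sketch under stated assumptions. The function names and the tolerance are hypothetical, and the sample values are made up for illustration; they are not taken from the tests above.

```python
def max_abs_difference(a, b):
    """Largest element-wise gap between two equal-length steering sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def mean_abs_error(predicted, expert):
    """Average absolute deviation of predicted steering from the expert's."""
    if len(predicted) != len(expert):
        raise ValueError("sequences must have the same length")
    return sum(abs(p - e) for p, e in zip(predicted, expert)) / len(expert)


def models_agree(a, b, tolerance=1e-6):
    """True if the two prediction sequences match within the tolerance."""
    return max_abs_difference(a, b) <= tolerance


# Made-up values for illustration only.
expert = [0.0, 0.25, 0.5, 0.25, 0.0]
h5_pred = [0.0, 0.125, 0.5, 0.375, 0.0]
tflite_pred = [0.0, 0.125, 0.5, 0.375, 0.0]

print("h5 vs tflite agree:", models_agree(h5_pred, tflite_pred))
print("MAE vs expert:", mean_abs_error(tflite_pred, expert))
```

A small non-zero tolerance is used rather than exact equality because a converted model may differ from the original in the last floating-point digits.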

4. Online Testing and Behavioral Analysis:
The TensorFlow Lite model was deployed for real-time autonomous driving tests. The following driving characteristics were observed:

• Oscillations during straight driving: Possibly caused by variability in the "straight driving" examples within the dataset, which in turn stems from inconsistencies in expert driving behavior.
• Straight driving in non-permitted areas: The vehicle sometimes drives too close to, or slightly onto, sidewalks or lane dividers and continues straight instead of correcting back into the right lane.
• Early turns: In some right and left turns, the system initiates the steering action prematurely, leading to sidewalk invasion or collisions with obstacles.

Given the nature of these issues, the next step is to augment the dataset with a significant number of examples covering these challenging scenarios. This approach aligns with the recommendations from Carlos Andrés Velasquez’s work (https://roboticslaburjc.github.io/2023-phd-carlos-velasquez/weekly%20log/week74/), whose insights are being incorporated into this project.
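
To help decide which recorded segments deserve targeted data collection, the oscillation during straight driving could be quantified, for example by counting how often the steering command flips sign. This is only a sketch of one possible diagnostic, not a method used in this log or in the referenced work. The function name, the sign convention (positive for one direction, negative for the other), and the deadband value are assumptions.

```python
def count_sign_changes(steering, deadband=0.0):
    """Count how often the steering command flips between left and right.

    Values whose magnitude is at or below `deadband` are ignored, so small
    jitter around zero is not counted as an oscillation.
    """
    changes = 0
    previous_sign = 0
    for value in steering:
        if abs(value) <= deadband:
            continue
        sign = 1 if value > 0 else -1
        if previous_sign != 0 and sign != previous_sign:
            changes += 1
        previous_sign = sign
    return changes


# Made-up steering traces for illustration only.
steady = [0.1, 0.2, 0.1, 0.15]
oscillating = [0.1, -0.1, 0.1, -0.1]

print("steady:", count_sign_changes(steady))
print("oscillating:", count_sign_changes(oscillating))
```

Segments of straight road with a high flip count would be natural candidates for re-recording or for adding more consistent straight-driving examples.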

5. Autonomous Driving Demonstration:
Below is a video showing the autonomous vehicle in operation using the PilotNet/tflite model:

Figure 4: Autonomous driving with PilotNet/tflite.

Conclusion:
This week’s evaluation confirmed the superior inference speed of the TensorFlow Lite version of PilotNet, making it suitable for real-time deployment at 20Hz. However, behavioral issues during online tests highlight the need for further dataset refinement. The next steps will focus on targeted data collection to address oscillations, improper lane-keeping, and premature turning, following established research recommendations to improve model robustness and driving safety.