Week 6 - Dataset Analysis and Initial Model Training

November 4, 2025

Advancing towards balanced datasets and functional autonomous driving

This week we made substantial progress on dataset management and analysis, and we trained our first functional autonomous driving model.

CARLA Replay System Implementation

We implemented and tested CARLA's replay functionality, which lets us reproduce previously recorded driving routes and capture synchronized image sequences from multiple perspectives. We found one important constraint: the image capture resolution limits the minimum achievable time interval between frames. At higher resolutions, processing latency causes irregular frame spacing and occasional capture errors within the image sequences.
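To check a captured sequence for the frame-timing problems described above, one option is to compare the spacing between consecutive frame timestamps against the intended capture interval. The following is a minimal sketch, not our capture code: the function name `find_frame_gaps`, the `tolerance` parameter, and the example timestamps are hypothetical, and it assumes each saved image carries a simulation timestamp in seconds.

```python
from typing import List, Tuple


def find_frame_gaps(
    timestamps: List[float],
    expected_dt: float,
    tolerance: float = 0.5,
) -> List[Tuple[int, float]]:
    """Return (frame_index, gap_seconds) for every frame that arrives late.

    A frame counts as late when the time since the previous frame exceeds
    expected_dt by more than tolerance * expected_dt (hypothetical threshold).
    """
    limit = expected_dt * (1.0 + tolerance)
    gaps = []
    for i in range(1, len(timestamps)):
        dt = timestamps[i] - timestamps[i - 1]
        if dt > limit:
            gaps.append((i, dt))
    return gaps


# Illustrative timestamps only: a 20 FPS capture with one missing stretch.
example = [0.0, 0.05, 0.10, 0.25, 0.30]
late_frames = find_frame_gaps(example, expected_dt=0.05)
```

Running this check once per capture resolution would show at which setting the gaps start to appear.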

Advanced 3D Histogram Analysis

To better understand the characteristics of our dataset, we built 3D histogram visualizations of the parameter distributions. Figure 1 presents a consolidated summary histogram showing the overall distribution of each dataset parameter.

Figure 1. Consolidated histogram summary across the complete dataset.

Figure 2 shows the 3D histogram analysis itself, which relates steering angle, throttle, and vehicle speed across different driving scenarios.

Figure 2. Three-dimensional histogram analysis revealing correlations between steering, throttle, and speed parameters in various driving conditions.
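A joint histogram like the one in Figure 2 can be computed by binning each sample along all three axes at once and counting how many samples fall into each cell. The sketch below uses only the Python standard library. The function names, the bin count, and the speed range are assumptions for illustration; the steering range of [-1, 1] and throttle range of [0, 1] follow CARLA's vehicle control conventions.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def bin_index(value: float, low: float, high: float, n_bins: int) -> int:
    """Map a value to a bin index in [0, n_bins - 1], clamping outliers."""
    if high <= low:
        raise ValueError("high must be greater than low")
    frac = (value - low) / (high - low)
    return min(max(int(frac * n_bins), 0), n_bins - 1)


def histogram_3d(
    samples: Iterable[Tuple[float, float, float]],
    ranges: Sequence[Tuple[float, float]],
    n_bins: int = 10,
) -> Counter:
    """Count samples per (steering_bin, throttle_bin, speed_bin) cell."""
    counts: Counter = Counter()
    for sample in samples:
        key = tuple(
            bin_index(v, lo, hi, n_bins) for v, (lo, hi) in zip(sample, ranges)
        )
        counts[key] += 1
    return counts


# Assumed ranges: steering [-1, 1], throttle [0, 1], speed [0, 40] (hypothetical units).
RANGES = [(-1.0, 1.0), (0.0, 1.0), (0.0, 40.0)]
# Illustrative samples only, not values from our dataset.
demo = [(0.0, 0.5, 10.0), (0.01, 0.52, 10.5), (-0.9, 0.1, 3.0)]
cells = histogram_3d(demo, RANGES, n_bins=4)
```

Cells with very high counts point to over-represented driving situations, and empty cells point to situations the dataset does not cover.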

Dataset Imbalance Identification

The analysis confirmed that the current dataset is significantly imbalanced, particularly in the steering angle distribution, where straight-line driving dominates. This is a problem for model training because it can bias predictions toward the majority cases. Addressing the imbalance through sampling techniques and data augmentation is our primary objective for the upcoming week.
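One simple sampling technique for this kind of imbalance is to undersample the dominant steering bins so that no bin exceeds a fixed cap. The sketch below illustrates that idea under stated assumptions and is not the method we have settled on: `undersample`, `steering_bin`, the bin count, and the cap are all hypothetical, and each sample is assumed to be a tuple whose first element is a steering value in [-1, 1].

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def steering_bin(steering: float, n_bins: int) -> int:
    """Map a steering value in [-1, 1] to a bin index, clamping outliers."""
    frac = (steering + 1.0) / 2.0
    return min(max(int(frac * n_bins), 0), n_bins - 1)


def undersample(
    samples: List[Tuple[float, object]],
    n_bins: int,
    max_per_bin: int,
    seed: int = 0,
) -> List[Tuple[float, object]]:
    """Keep at most max_per_bin randomly chosen samples per steering bin."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    bins: Dict[int, List[Tuple[float, object]]] = defaultdict(list)
    for sample in samples:
        bins[steering_bin(sample[0], n_bins)].append(sample)

    balanced: List[Tuple[float, object]] = []
    for idx in sorted(bins):
        group = bins[idx]
        if len(group) > max_per_bin:
            group = rng.sample(group, max_per_bin)
        balanced.extend(group)
    rng.shuffle(balanced)  # avoid returning samples grouped by bin
    return balanced


# Illustrative data only: many straight samples, few turning samples.
demo = (
    [(0.0, i) for i in range(100)]
    + [(0.8, i) for i in range(10)]
    + [(-0.8, i) for i in range(5)]
)
balanced_demo = undersample(demo, n_bins=5, max_per_bin=10)
```

Undersampling discards data, so in practice it would likely be combined with augmentation of the rare turning samples rather than used alone.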

Initial Model Training and Performance Metrics

Despite the dataset imbalance, we proceeded with initial training of our MobileNet-based architecture. The results are promising: Figure 3 shows the training metrics, covering loss convergence and prediction accuracy for the key parameters across training epochs.

Figure 3. Training performance metrics showing loss convergence and parameter prediction accuracy across training epochs.

Autonomous Driving Validation in CARLA Simulation

We tested the trained model in the CARLA simulation environment to evaluate its autonomous driving capabilities. The model can identify and execute right turns, recognizing the turning scenario in approximately 50% of test cases. However, as Figure 4 illustrates, turn execution is inconsistent, which is the main area to improve in our next development cycle.

Figure 4. Autonomous driving test results in the CARLA simulation environment, showing successful right-turn identification but only partial execution success.

These results are preliminary, but they support our architectural approach and give us a baseline to improve on through dataset refinement and model optimization in subsequent development phases.

Reference:

[1] A. Moncalvillo González, "Seguimiento de carril por visión y conducción autónoma con Aprendizaje por Imitación (Vision-based Lane Keeping and Autonomous Driving using Imitation Learning)," Master's thesis, Universidad Rey Juan Carlos, Madrid, Spain, 2024.