Week 6 - Dataset Analysis and Initial Model Training
November 4, 2025
Advancing towards balanced datasets and functional autonomous driving
This week we made progress on three fronts: dataset capture, dataset analysis, and training our first functional autonomous driving model.
CARLA Replay System Implementation
We implemented and tested CARLA's replay functionality, which reproduces previously recorded driving routes. Replaying a route lets us capture synchronized image sequences from multiple perspectives. We identified one technical constraint: the image capture resolution limits the minimum achievable time interval between frames. At higher resolutions, processing latency causes uneven frame spacing and occasional capture errors within the image sequences.
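One way to spot this kind of timing problem is to compare the interval between consecutive frames against the target interval. The snippet below is a minimal sketch, not the project's actual tooling: the helper name `find_frame_gaps`, the 50% tolerance, and the example timestamps are all assumptions made for illustration.

```python
def find_frame_gaps(timestamps, expected_dt, tolerance=0.5):
    """Return (index, interval) pairs for frames that arrived late.

    A frame counts as late when the time since the previous frame exceeds
    expected_dt by more than tolerance * expected_dt.
    """
    limit = expected_dt * (1.0 + tolerance)
    gaps = []
    for i in range(1, len(timestamps)):
        dt = timestamps[i] - timestamps[i - 1]
        if dt > limit:
            gaps.append((i, dt))
    return gaps


# Hypothetical capture times in seconds for a 20 Hz target (0.05 s interval).
stamps = [0.00, 0.05, 0.10, 0.25, 0.30]
gaps = find_frame_gaps(stamps, expected_dt=0.05)
for index, dt in gaps:
    print(f"frame {index} arrived {dt:.3f} s after the previous one")
```

Running a check like this at each candidate resolution would show where the frame spacing starts to degrade.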
Advanced 3D Histogram Analysis
To better understand the dataset, we built 3D histogram visualizations of its parameter distributions. Figure 1 presents a summary histogram showing the overall distribution of the dataset parameters.
Figure 2 shows the 3D histogram analysis, which relates steering angle, throttle value, and vehicle speed across different driving scenarios.
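The counting behind a 3D histogram can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names, the bin count, the sample values, and the 0 to 30 m/s speed range are assumptions, while the steering range [-1, 1] and throttle range [0, 1] follow CARLA's vehicle control convention.

```python
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Map a value in [low, high] to a bin index in 0..n_bins-1 (clamped)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    frac = (value - low) / (high - low)
    return min(max(int(frac * n_bins), 0), n_bins - 1)


def histogram_3d(samples, ranges, n_bins):
    """Count (steering, throttle, speed) samples per 3D bin."""
    (s_lo, s_hi), (t_lo, t_hi), (v_lo, v_hi) = ranges
    counts = Counter()
    for steer, throttle, speed in samples:
        key = (
            bin_index(steer, s_lo, s_hi, n_bins),
            bin_index(throttle, t_lo, t_hi, n_bins),
            bin_index(speed, v_lo, v_hi, n_bins),
        )
        counts[key] += 1
    return counts


# Hypothetical samples: (steering, throttle, speed in m/s).
samples = [(0.0, 0.5, 10.0), (0.02, 0.55, 11.0), (-0.7, 0.3, 5.0)]
ranges = [(-1.0, 1.0), (0.0, 1.0), (0.0, 30.0)]  # speed range is assumed
counts = histogram_3d(samples, ranges, n_bins=5)
for key, n in sorted(counts.items()):
    print(key, n)
```

Each key is a (steering bin, throttle bin, speed bin) triple, so a bin with a very high count points directly at an over-represented driving condition.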
Dataset Imbalance Identification
The analysis confirmed that the current dataset is strongly imbalanced, particularly in the steering angle distribution, where straight-line driving dominates. This imbalance is a problem for model training because it can bias predictions toward the majority class. Addressing it through sampling techniques and data augmentation is our primary objective for the upcoming week.
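As an illustration of one possible sampling technique, the sketch below undersamples over-represented steering bins by capping the number of samples kept per bin. It is not the method we have settled on: the function name `balance_by_steering`, the bin count, the cap, and the synthetic data are assumptions for the example.

```python
import random
from collections import defaultdict


def balance_by_steering(samples, n_bins, max_per_bin, seed=0):
    """Keep at most max_per_bin samples in each steering bin.

    Each sample is a tuple whose first element is the steering value,
    assumed to lie in [-1, 1].
    """
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    bins = defaultdict(list)
    for sample in samples:
        frac = (sample[0] + 1.0) / 2.0
        idx = min(max(int(frac * n_bins), 0), n_bins - 1)
        bins[idx].append(sample)

    balanced = []
    for idx in sorted(bins):
        group = bins[idx]
        if len(group) > max_per_bin:
            group = rng.sample(group, max_per_bin)
        balanced.extend(group)
    return balanced


# Synthetic, imbalanced data: 90 straight-driving frames and 10 right turns.
data = [(0.0, f"straight_{i}") for i in range(90)]
data += [(0.5, f"right_{i}") for i in range(10)]
balanced = balance_by_steering(data, n_bins=5, max_per_bin=10)
print(len(balanced))
```

Undersampling discards data, so in practice it would likely be combined with augmentation of the minority bins rather than used alone.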
Initial Model Training and Performance Metrics
Despite the dataset imbalance, we proceeded with initial model training using our MobileNet-based architecture. Figure 3 shows the resulting performance metrics, which indicate that the model is learning across the key parameters.
Autonomous Driving Validation in CARLA Simulation
We tested the trained model in the CARLA simulation environment to evaluate its autonomous driving capabilities. The model can identify and execute right turns, recognizing turning scenarios in approximately 50% of test cases. However, as illustrated in Figure 4, turn execution is inconsistent, which is the main area for improvement in our next development cycle.
These results are preliminary, but they suggest that the architectural approach is viable and give us a baseline to improve on through dataset refinement and model optimization in subsequent development phases.