Enhancing End-to-End Control in Autonomous Driving through Kinematic-Infused and Visual Memory Imitation Learning

Neurocomputing, 2024

Sergio Paniego1,2, Roberto Calvo-Palomino1,2, JoseMaria Cañas1,2

1: URJC

2: JdeRobot

DOI: 10.1016/j.neucom.2024.128161

Abstract

This paper presents an exploration, study, and comparison of various alternatives to enhance the capabilities of an end-to-end control system for autonomous driving based on imitation learning by adding visual memory and kinematic input data to the deep learning architectures that govern the vehicle. The experimental comparison relies on fundamental error metrics (MAE, MSE) during the offline assessment, supplemented by several external complementary fine-grain metrics based on the behavior of the ego vehicle at several urban test scenarios in the CARLA reference simulator in the online evaluation. Our study focuses on a lane-following application using different urban scenario layouts and visual bird-eye-view input. The memory addition involves architectural modifications and different sensory input types. The kinematic data integration is managed with a modified input. The experiments encompass both typical driving scenarios and extreme never-seen conditions. Additionally, we conduct an ablation study examining various memory lengths and densities. We prove experimentally that incorporating visual memory capabilities and kinematic input data makes the driving system more robust and able to handle a wider range of challenging situations, including those not encountered during training, in terms of reduction of collisions and speed self-regulation, resulting in a 75% enhancement. All the work we present here, including model architectures, trained model weights, comparison tool, and the dataset, is open-source, facilitating replication and extension of our findings.

Citation

@article{PANIEGO2024128161,
    title = {Enhancing end-to-end control in autonomous driving through kinematic-infused and visual memory imitation learning},
    journal = {Neurocomputing},
    volume = {600},
    pages = {128161},
    year = {2024},
    issn = {0925-2312},
    doi = {10.1016/j.neucom.2024.128161},
    url = {https://www.sciencedirect.com/science/article/pii/S0925231224009329},
    author = {Sergio Paniego and Roberto Calvo-Palomino and JoséMaría Cañas},
    keywords = {End-to-end autonomous driving, Imitation learning, Deep learning, Lane-following},
    abstract = {This paper presents an exploration, study, and comparison of various alternatives to enhance the capabilities of an end-to-end control system for autonomous driving based on imitation learning by adding visual memory and kinematic input data to the deep learning architectures that govern the vehicle. The experimental comparison relies on fundamental error metrics (MAE, MSE) during the offline assessment, supplemented by several external complementary fine-grain metrics based on the behavior of the ego vehicle at several urban test scenarios in the CARLA reference simulator in the online evaluation. Our study focuses on a lane-following application using different urban scenario layouts and visual bird-eye-view input. The memory addition involves architectural modifications and different sensory input types. The kinematic data integration is managed with a modified input. The experiments encompass both typical driving scenarios and extreme never-seen conditions. Additionally, we conduct an ablation study examining various memory lengths and densities. We prove experimentally that incorporating visual memory capabilities and kinematic input data makes the driving system more robust and able to handle a wider range of challenging situations, including those not encountered during training, in terms of reduction of collisions and speed self-regulation, resulting in a 75% enhancement. All the work we present here, including model architectures, trained model weights, comparison tool, and the dataset, is open-source, facilitating replication and extension of our findings.}
}