
This week marks a good inflection point, because by understanding more about PilotNet and how it works, it was possible to solve a big chunk of the problems I had these past few weeks. But once again, before we jump straight to the results, it would be nice to check which changes led us to finding a good solution.

The model architecture

Being one big part of the whole deep learning project, the model was one of the few things we hadn't quite tweaked while trying to solve our little problem. That problem consisted of how we translated the output to the CARLA simulation: it had been a big issue because the output always gave us values bigger than one (values between 100 and 600, for example). This by itself was the main issue that needed fixing, since we weren't correctly handling how the model mapped the input data to the output values. So, once we had that checked, it was the moment to try different configurations of the PilotNet model, and finally, the holy grail to finding a solution was: normalization.
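To make the fix concrete, here is a minimal sketch of the kind of label scaling involved. The helper names and the raw value range are assumptions for illustration; only the idea of mapping the steering values into a normalized range before training, and mapping predictions back for CARLA, comes from the text:

```python
import numpy as np

# Hypothetical helpers: the project's real preprocessing code may differ.
# CARLA expects steering commands in [-1, 1], so the raw labels are
# scaled into [0, 1] for training and unscaled again at inference time.

def normalize_steering(raw, raw_min=-1.0, raw_max=1.0):
    """Scale steering labels into the [0, 1] range the model trains on."""
    return (raw - raw_min) / (raw_max - raw_min)

def denormalize_steering(pred, raw_min=-1.0, raw_max=1.0):
    """Map a [0, 1] prediction back to a CARLA steering command."""
    return pred * (raw_max - raw_min) + raw_min

labels = np.array([-0.3, 0.0, 0.5])
norm = normalize_steering(labels)                 # [0.35, 0.5, 0.75]
assert np.allclose(denormalize_steering(norm), labels)
```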

But by normalizing the test image, we got a model with a pretty big systematic offset between the predictions and the ground truth, one that obviously wasn't learning properly. The next image shows the previous model and its predictions over the normalized test image: it showed me that the model was certainly learning something, but its output didn't match the range of the ground-truth steering values. So, being quite sure that normalization was part of the problem, we decided to tamper with the architecture of the model.

Graph of the predicted steering values (red) against the ground truth (green).

The normalization part of this project was already covered, firstly by a BatchNormalization layer introduced at the start of the model, and secondly in the augmentations applied to the input images. But it turned out to be insufficient for our problem: with so many parameters passed on from the convolutional layers to the dense layers, we tried scaling these values with an additional normalization layer. Once this was added, we checked whether the model was learning and staying within the expected range of 0-1.

And if we are understanding correctly from the previous graph, it was! The model was adjusting itself within the correct range and learning good weights to fit the ground truth.
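As a reference, here is a hedged sketch of what a PilotNet-style model with this extra normalization could look like in Keras. The convolutional and dense layer sizes follow the original NVIDIA PilotNet paper; the exact placement of the added normalization layer and the sigmoid output (to keep predictions inside 0-1) are assumptions, not the project's actual code:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_pilotnet(input_shape=(66, 200, 3)):
    """PilotNet-style CNN; layer sizes follow the original NVIDIA paper."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Normalize the raw pixel values at the start of the model.
        layers.BatchNormalization(),
        layers.Conv2D(24, (5, 5), strides=(2, 2), activation="relu"),
        layers.Conv2D(36, (5, 5), strides=(2, 2), activation="relu"),
        layers.Conv2D(48, (5, 5), strides=(2, 2), activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Flatten(),
        # Extra normalization before the dense layers -- the change
        # discussed above (exact placement is an assumption).
        layers.BatchNormalization(),
        layers.Dense(100, activation="relu"),
        layers.Dense(50, activation="relu"),
        layers.Dense(10, activation="relu"),
        # Sigmoid keeps the steering prediction inside the 0-1 range.
        layers.Dense(1, activation="sigmoid"),
    ])
    return model
```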

Training

The training part stays pretty much the same as in the previous week, since the problem wasn't there: the solution lay not in the dataset itself or in the training parameters, but in the model. By balancing the dataset as we did and training for 120 epochs, we finally found an almost good configuration for the follow-road task we had for the CARLA simulator.

Histogram of the balanced dataset
Graph of the loss value over 120 epochs
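Assuming the model sketch from the previous section, the training setup could look roughly like the following. Only the 120 epochs and the balanced dataset come from the text; the optimizer, learning rate, batch size, and the array names (train_images, train_labels, val_images, val_labels) are placeholders:

```python
# Hedged sketch of the training setup; hyperparameters are assumptions.
model = build_pilotnet()
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")

history = model.fit(
    train_images, train_labels,            # the balanced dataset
    validation_data=(val_images, val_labels),
    epochs=120,
    batch_size=64,
)
```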

One thing to notice is that after about 40 epochs the loss stabilizes, so it would be better to stop the training earlier instead of running it for the full 120 epochs.
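A simple way to act on this observation, sketched below with Keras' built-in EarlyStopping callback (the patience value is a guess, not something taken from the project):

```python
# Hypothetical: stop training once the validation loss plateaus,
# rather than always running the full 120 epochs.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=10,                # epochs to wait; an assumed value
    restore_best_weights=True,  # roll back to the best checkpoint
)
# Passed to model.fit(..., callbacks=[early_stop]) in the sketch above.
```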

Results

Finally, the results shown in the next two videos are pretty much self-explanatory. The first video shows an easier example in which we taught the car to take a single curve correctly, approaching it from either the left or the right.

In this second video, the car runs through three different maps. One of them is a known map (Town05), used to collect the training dataset, while the other two are towns the car has never seen before, since they were used for neither training nor validation. We can see that it performs really well, as expected, turning correctly on both sharp and smoother curves.

On the other hand, as we anticipated, when it encounters a junction (Town04) the car is not able to overcome it, since we never trained it on this type of situation. Lastly, we can notice that it didn't stay in its own lane when running on Town04. If we try to come up with an explanation, a possible theory is that some of the maps had multiple lanes in one direction. The car never changed lanes in the middle of a run, but since the dataset was collected starting from different spawn points, the car might have driven along different lanes, and therefore it cannot distinguish which lane it should keep to.