Driving results (classification networks)

| Circuits | Manual (%) | Manual (time) | 5v+7w biased (%) | 5v+7w biased (time) | 5v+7w balanced (%) | 5v+7w balanced (time) | 5v+7w imbalanced (%) | 5v+7w imbalanced (time) |
|---|---|---|---|---|---|---|---|---|
| Simple (clockwise) | 100% | 1min 35s | 100% | 1min 41s | 75% | – | 100% | 1min 42s |
| Simple (anti-clockwise) | 100% | 1min 32s | 100% | 1min 39s | 100% | 1min 39s | 100% | 1min 43s |
| Monaco (clockwise) | 100% | 1min 15s | 100% | 1min 20s | 70% | – | 85% | – |
| Monaco (anti-clockwise) | 100% | 1min 15s | 100% | 1min 18s | 8% | – | 100% | 1min 20s |
| Nürburgring (clockwise) | 100% | 1min 02s | 100% | 1min 03s | 100% | 1min 03s | 100% | 1min 05s |
| Nürburgring (anti-clockwise) | 100% | 1min 02s | 100% | 1min 05s | 80% | – | 80% | – |
Driving results (classification networks)

| Circuits | Manual (%) | Manual (time) | 5v+7w biased (%) | 5v+7w biased (time) | 5v+7w balanced (%) | 5v+7w balanced (time) | 5v+7w imbalanced (%) | 5v+7w imbalanced (time) |
|---|---|---|---|---|---|---|---|---|
| Simple (clockwise) | 100% | 1min 35s | 35% | – | 10% | – | 90% | – |
| Simple (anti-clockwise) | 100% | 1min 32s | 100% | 1min 49s | 100% | 1min 46s | 90% | – |
| Monaco (clockwise) | 100% | 1min 15s | 100% | 1min 24s | 5% | – | 100% | 1min 23s |
| Monaco (anti-clockwise) | 100% | 1min 15s | 100% | 1min 29s | 8% | – | 100% | 1min 24s |
| Nürburgring (clockwise) | 100% | 1min 02s | 100% | 1min 10s | 8% | – | 90% | – |
| Nürburgring (anti-clockwise) | 100% | 1min 02s | 100% | 1min 07s | 8% | – | 100% | 1min 09s |
In the paper "End-to-End Learning for Self-Driving Cars" (an implementation is available at https://github.com/Kejie-Wang/End-to-End-Learning-for-Self-Driving-Cars), a convolutional neural network (CNN) is trained to map raw pixels from a single front-facing camera directly to steering commands. The system automatically learns internal representations of the necessary processing steps, such as detecting useful road features, with only the human steering angle as the training signal.
Images are fed into a CNN which then computes a proposed steering command. The proposed command is compared to the desired command for that image and the weights of the CNN are adjusted to bring the CNN output closer to the desired output. The weight adjustment is accomplished using back propagation. Once trained, the network can generate steering from the video images of a single center camera.
Training data was collected by driving on a wide variety of roads and in a diverse set of lighting and weather conditions. Most road data was collected in central New Jersey, although highway data was also collected from Illinois, Michigan, Pennsylvania, and New York. Other road types include two-lane roads (with and without lane markings), residential roads with parked cars, tunnels, and unpaved roads. Data was collected in clear, cloudy, foggy, snowy, and rainy weather, both day and night. 72 hours of driving data was collected.
They train the weights of their network to minimize the mean squared error between the steering command output by the network and the command of either the human driver, or the adjusted steering command for off-center and rotated images. The network consists of 9 layers, including a normalization layer, 5 convolutional layers and 3 fully connected layers. The input image is split into YUV planes and passed to the network.
The first layer of the network performs image normalization. The convolutional layers were designed to perform feature extraction and were chosen empirically through a series of experiments that varied layer configurations. They use strided convolutions in the first three convolutional layers, with a 2×2 stride and a 5×5 kernel, and non-strided convolutions with a 3×3 kernel in the last two convolutional layers. They follow the five convolutional layers with three fully connected layers leading to an output control value, which is the inverse turning radius. The fully connected layers are designed to function as a controller for steering, but it is not possible to make a clean break between which parts of the network function primarily as feature extractor and which serve as controller.
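The layer stack described above can be sketched in Keras. The filter counts (24/36/48/64/64), the fully connected sizes (100/50/10), and the 66×200 YUV input size follow the commonly cited PilotNet figure and are assumptions here rather than details stated on this page:

```python
# Sketch of the PilotNet-style network described above: a normalization
# layer, three strided 5x5 convolutions, two non-strided 3x3 convolutions,
# and fully connected layers ending in a single control output.
from tensorflow.keras import layers, models

def build_pilotnet(input_shape=(66, 200, 3)):
    """Input is assumed to be a 66x200 YUV image, as in the original paper."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Normalization layer: scale pixel values to [-1, 1]
        layers.Rescaling(1.0 / 127.5, offset=-1.0),
        # Strided convolutions (5x5 kernel, 2x2 stride)
        layers.Conv2D(24, (5, 5), strides=(2, 2), activation="relu"),
        layers.Conv2D(36, (5, 5), strides=(2, 2), activation="relu"),
        layers.Conv2D(48, (5, 5), strides=(2, 2), activation="relu"),
        # Non-strided convolutions (3x3 kernel)
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        # Fully connected "controller" layers
        layers.Flatten(),
        layers.Dense(100, activation="relu"),
        layers.Dense(50, activation="relu"),
        layers.Dense(10, activation="relu"),
        layers.Dense(1),  # output: inverse turning radius
    ])
    return model
```

Training such a model against the recorded steering labels would use the mean squared error loss mentioned above (e.g. `model.compile(optimizer="adam", loss="mse")`).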
To train a CNN to do lane following they only select data where the driver was staying in a lane and discard the rest. They then sample that video at 10 FPS. A higher sampling rate would result in including images that are highly similar and thus not provide much useful information. After selecting the final set of frames they augment the data by adding artificial shifts and rotations to teach the network how to recover from a poor position or orientation.
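The shift part of that augmentation can be sketched as follows; the per-pixel steering correction gain is a made-up illustrative value, not a number from the paper:

```python
# Sketch of shift augmentation: move the image sideways and adjust the
# steering label so the network learns to recover toward the lane center.
import numpy as np

def shift_augment(image, steering, shift_px, correction_per_px=0.004):
    """Shift `image` horizontally by `shift_px` pixels and correct the label.

    correction_per_px is an assumed illustrative gain: extra steering
    demanded per pixel of lateral displacement.
    """
    shifted = np.roll(image, shift_px, axis=1)
    # Zero out the pixels that wrapped around the image border
    if shift_px > 0:
        shifted[:, :shift_px] = 0
    elif shift_px < 0:
        shifted[:, shift_px:] = 0
    return shifted, steering + correction_per_px * shift_px
```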
Before road-testing a trained CNN, they first evaluate the network's performance in simulation.
In this method I test a new network whose input is the difference image between the current frame (i_t) and the frame five steps earlier (i_t-5). The results are:
Driving results (regression networks)

| Circuits | Temporal_dif const v whole image (%) | Time | Temporal_dif whole image (%) | Time | Temporal_dif const v cropped image (%) | Time | Temporal_dif cropped image (%) | Time |
|---|---|---|---|---|---|---|---|---|
| Simple (clockwise) | 100% | 3min 37s | 100% | 1min 43s | 100% | 3min 37s | 100% | 1min 39s |
| Simple (anti-clockwise) | 100% | 3min 38s | 100% | 1min 44s | 100% | 3min 38s | 100% | 1min 42s |
| Monaco (clockwise) | 45% | – | 5% | – | 45% | – | 5% | – |
| Monaco (anti-clockwise) | 45% | – | 5% | – | 8% | – | 5% | – |
| Nürburgring (clockwise) | 8% | – | 8% | – | 8% | – | 8% | – |
| Nürburgring (anti-clockwise) | 90% | – | 8% | – | 90% | – | 8% | – |
Difference image:
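A minimal sketch of how such a temporal difference image can be computed with NumPy; the function name and the frame-buffer indexing are illustrative assumptions:

```python
# Sketch: difference image between the current frame and the frame
# `step` positions earlier in a buffer of grayscale frames.
import numpy as np

def temporal_difference(frames, step=5):
    """Return |i_t - i_(t-step)| as a uint8 image.

    `frames` is assumed to be a sequence of equally sized uint8 arrays,
    with the most recent frame last.
    """
    current = frames[-1].astype(np.int16)   # widen to avoid uint8 wraparound
    past = frames[-1 - step].astype(np.int16)
    return np.abs(current - past).astype(np.uint8)
```

The signed intermediate type matters: subtracting uint8 arrays directly would wrap around instead of producing a clean absolute difference.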