Week 5-6. Simple Q-learning - new reward function and circuit completed
For today I managed to get the car to drive around the circuit at a constant speed. This required some changes to the reward function I had so far: the distance between the centre of the screen and the line is now measured at several points, and the points closer to the car are given more importance than the ones farther away. For this purpose I built a weight function which, once normalised like a probability distribution, assigns 0 to the farthest point and the largest weight to the nearest one. For the next post I will prepare a video and some images to show how it works.
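As a rough illustration of the idea (not the actual code from the project), the sketch below assumes the line-detection step returns a horizontal offset between the screen centre and the line at a few rows of the image, ordered from farthest to nearest. The weights ramp linearly from 0 at the farthest point to 1 at the nearest and are then normalised; the function names (`distance_weights`, `reward`) and parameters are hypothetical.

```python
import numpy as np

def distance_weights(n_points: int) -> np.ndarray:
    """Linear weights from 0 (farthest point) to 1 (nearest point),
    normalised so they sum to 1, like a probability distribution."""
    w = np.linspace(0.0, 1.0, n_points)
    return w / w.sum()

def reward(line_offsets: np.ndarray, max_offset: float) -> float:
    """Reward based on the weighted offset between the centre of the
    screen and the detected line.

    line_offsets: offsets at several rows, ordered from farthest to nearest.
    max_offset:   largest possible offset (e.g. half the screen width),
                  used to scale the result into [0, 1].
    """
    w = distance_weights(len(line_offsets))
    weighted_error = np.sum(w * np.abs(line_offsets)) / max_offset
    # 1 when the line is centred in front of the car, closer to 0 otherwise
    return 1.0 - weighted_error

# Example: offsets in pixels from far to near, for a 320-pixel-wide screen
offsets = np.array([40.0, 25.0, 10.0, 2.0])
print(reward(offsets, max_offset=160.0))
```

With this weighting, a large deviation far ahead barely lowers the reward, while even a small deviation right in front of the car is penalised heavily, which is the behaviour described above.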