In the last post I published an implementation of a simplified Q-learning algorithm for this problem, in which I discretised both the observation space and the action space in order to build the Q-table. The problem with that implementation was that, while it worked well on the straights, it could not handle all of the curves. This meant the algorithm was unable to complete a single lap of the circuit, as it got stuck in the tightest corners. To solve this, I have widened the action space to allow for sharper turns: the angular velocity range, previously [-0.3, 0.3], is now [-1, 1]. The reward function remains the same. As results come in, I will upload new posts. So far I have observed that the model is learning to take the curves more quickly, but I have not trained it for long enough to check how well it actually takes them.
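
To make the change concrete, here is a minimal sketch of what widening a discretised action space can look like. The variable names, bin count, and helper function are assumptions for illustration, not the code from the post; only the range change from [-0.3, 0.3] to [-1, 1] comes from the text above.

```python
import numpy as np

NUM_ANGULAR_BINS = 9  # hypothetical number of discrete angular-velocity actions

# Old action space: gentle turns only
old_angular_actions = np.linspace(-0.3, 0.3, NUM_ANGULAR_BINS)

# New action space: same number of bins, but spanning [-1, 1],
# so the sharpest available turn is much tighter
new_angular_actions = np.linspace(-1.0, 1.0, NUM_ANGULAR_BINS)

# The Q-table has one column per discrete action; a greedy policy picks the
# angular velocity whose Q-value is highest for the current discretised state.
def greedy_angular_velocity(q_table: np.ndarray, state_index: int) -> float:
    return float(new_angular_actions[np.argmax(q_table[state_index])])
```

One side effect of this design is that keeping the number of bins fixed while widening the range coarsens the resolution near zero, so the trade-off is sharper maximum turns at the cost of less precise small corrections on the straights.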