Fixing crash and testing q-learning

Posted Dec 23, 2024 Updated Dec 27, 2024

By David Duro 2 min read

Index

Fixed episodes crash
Exponential drop in training speed
Velocity is needed in the q-table states

Fixed episodes crash

When a scenario is modified SMARTS compiles it and creates the necessary roads, missions and traffic in a specific format (xml). Each time a change is made to the scenario it is recompiled at launching it. The problem is that all these compiled files are not being cleaned up. My scenarios were full of rubbish that was probably causing a memory crash. I cleaned up the build folder of the scenario and when I compiled it again it was fixed.

Exponential drop in training speed

For some reason, the envision simulation crashes after many simulations (about 400). Also, the speed of each episode decays exponentially, making it almost impossible to reach more than 600-700 episodes. I have to investigate the source of this problem further.

Velocity is needed in the q-table states

After a lot of exercise testing and using 200 steps per episode. It usually converges to a solution at 200-250 episodes. So far I have not been able to adjust the parameters to get an optimal solution.

I have discovered that this approach is wrong. I will explain this with a simple example:

Right now the agent starts very close to the car behind. His first actions will be to accelerate, but once he is close to the ideal position, he will have to decelerate. Let’s say he has to start decelerating -0.5m from the ideal position to reach the ideal position with speed 0. In another scenario where the agent starts -0.5m from the ideal position, its first actions will be to accelerate, assuming it must decelerate at another time.

By this I mean, that right now we are not teaching an agent to park in 1d, we are training an agent to park from a specific position. Therefore, I think that if we want to get the agent to park from any position, we must add another state, its linear velocity.

This may be a better approximation, but it will radically increase the number of possible states and further slow down training. This should not be a problem, but I am concerned about how long each simulation will take.

phase 2

This post is licensed under CC BY 4.0 by the author.

Index

Fixed episodes crash

Exponential drop in training speed

Velocity is needed in the q-table states

Trending Tags