- Change pose every epoch.
- New action set.
- New training with the new configuration.
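As a rough illustration of the "change pose every epoch" item, the sketch below picks a random start pose at the beginning of each epoch. The pose list, its coordinates, and the helper names (`START_POSES`, `sample_start_pose`, `yaw_to_quaternion`) are all hypothetical and not taken from the project. Real values would depend on the circuit, and sending the pose to Gazebo is deliberately left out.

```python
import math
import random

# Hypothetical start poses as (x, y, yaw in radians).
# These numbers are placeholders; real ones depend on the circuit.
START_POSES = [
    (0.0, 0.0, 0.0),
    (10.0, 5.0, math.pi / 2),
    (-4.0, 12.0, math.pi),
    (-8.0, -3.0, -math.pi / 2),
]


def yaw_to_quaternion(yaw):
    """Convert a planar heading (rotation about z) into an (x, y, z, w) quaternion."""
    return (0.0, 0.0, math.sin(yaw / 2.0), math.cos(yaw / 2.0))


def sample_start_pose(rng=random):
    """Pick one of the predefined start poses at random for the next epoch."""
    x, y, yaw = rng.choice(START_POSES)
    return {"position": (x, y, 0.0), "orientation": yaw_to_quaternion(yaw)}


if __name__ == "__main__":
    # One new pose per epoch; the environment reset would consume it.
    for epoch in range(3):
        print(epoch, sample_start_pose())
```

The returned dictionary only describes the pose. Applying it in the simulator is a separate step, and that is the part that currently conflicts with the reset.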
Solving intermediate problems
Over the past few weeks I have had trouble getting the environment to train productively, for several reasons:
- After more than 20,000 iterations, the only thing the agent has learned is to leave the line, always towards the right. This behavior appeared when the action set went from 5 actions to 21.
- To try to solve this, we proposed placing the car at different points of the circuit so that it sees a different context in each epoch. I am still working on it, because there are conflicts between Gazebo’s topics and the reset and proxy of the Python library.
- I have gone back to basic examples: 5 actions with a constant linear speed, to try to get the robot to take the curve well. The following table summarizes the observation and action space in this iteration:
| Num | Parameter | Min | Max |
|---|---|---|---|
| 0 | Linear vel. (m/s) | 1 | 10 |
| 1 | Angular vel. (rad/s) | -2 | 2 |
The action space is:
- At the same time, I am fixing many pending bugs (for example, every time I restart the environment it takes three attempts before it starts correctly) and refactoring a lot of the code.
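For the five-action setup with constant linear speed mentioned in the list above, a discrete action set could be sketched as follows. The post does not give the actual action values, so everything here is an assumption: the linear speed of 3.0 m/s is a placeholder, the angular velocities are simply spaced evenly over the -2 to 2 rad/s range from the table, and the names (`build_action_set`, `action_to_command`) are hypothetical.

```python
# Assumed values, for illustration only.
LINEAR_SPEED = 3.0                     # m/s, constant (placeholder value)
ANGULAR_MIN, ANGULAR_MAX = -2.0, 2.0   # rad/s, range taken from the table
N_ACTIONS = 5


def build_action_set(n=N_ACTIONS, v=LINEAR_SPEED,
                     w_min=ANGULAR_MIN, w_max=ANGULAR_MAX):
    """Return n (linear, angular) pairs with evenly spaced angular velocity."""
    step = (w_max - w_min) / (n - 1)
    return [(v, w_min + i * step) for i in range(n)]


ACTIONS = build_action_set()


def action_to_command(action_index):
    """Map a discrete action index to a (linear, angular) velocity command."""
    return ACTIONS[action_index]


if __name__ == "__main__":
    for i in range(N_ACTIONS):
        print(i, action_to_command(i))
```

Keeping the linear speed fixed means the agent only has to learn steering, which is the point of going back to the basic example. Calling `build_action_set(n=21)` would give a 21-action variant under the same even-spacing assumption.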
I keep fixing bugs such as this one: until RViz is loaded, the steps are not executed correctly. It is a very strange bug that I am working on right now. Once it is fixed, I will try to launch a training run.
In this new stage we want to get the robot to complete a little more of the turn, although each iteration improves different things.