Week 31-34. New trainings

1 minute read

To Do

Change pose every epoch.
New action set.
New training with the new configuration.

Progress

Solving intermediate problems

Over the past few weeks I have had some problems getting the environment to achieve productive training for various reasons:

When it’s been over 20,000 times, the only thing it’s learned is to step out of line and always to the right. This change occurred when it went from 5 actions to 21.
To try to solve it we proposed to position the car in different points of the circuit and that it had different contexts in each epoch but I am still working on it since there are conflicts between Gazebo’s topics and the reset and proxy of the Python library.
I’m back to basic examples: 5 actions with a constant linear speed to try to make the robot do the curve well. The following table shows a summary of the space for observations and actions in this iteration:

Num	Observation	Min	Max
0	Vel. Lineal (m/s)	1	10
1	Vel. Angular (rad/s)	-2	2
2	Error 1	-300	300
3	Error 2	-280	280
4	Error 3	-250	250

The action spaces is:

Num	Action
0	-2
1	-1
2	0
3	1
4	2

At the same time, I’m fixing a lot of bugs I had pending (every time I restart the environment they count 3 times until a correct one starts) and refactoring the code a lot.

Working

Keep correcting mistakes like: until I load the rviz you don’t make correct steps…a very rare mistake I’m working on right now. Once it’s fixed I’ll try to launch a training.

Learning

This new step where we already want to try to get the robot to turn a little more complete but each iteration improves different things.

Twitter LinkedIn

Week 31-34. New trainings

To Do

Progress

Solving intermediate problems

Working

Learning

You May Also Enjoy

Final Weeks. Final part of the work

Weeks 52-54. Training of the reinforcement learning algorithm

Weeks 48-51. Preparing the different trainings

Weeks 44-47. Summer advance. Training using qlearn with the camera