Added random initial positions and random perturbations to the cartpole problem (month 23)

This month we focused on the following goals:

  • Make cartpole learning sufficiently stable using DQN, adaptive learning and replay buffers
  • Investigate alternative reward policies and ways to mitigate catastrophic forgetting
  • Add monitoring to cartpole to measure how robust our solution is (TODO: include this monitoring as a common library in RL-Studio)

  • You can find more details about the project status in the cartpole project post.
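
For readers unfamiliar with the replay buffers mentioned in the goals above, here is a minimal sketch of the general idea. It is not the RL-Studio implementation: the `ReplayBuffer` class, the `Transition` fields and the capacity value are hypothetical names and numbers chosen for illustration. The buffer stores past transitions and returns random mini-batches, so the DQN is not trained only on the most recent, highly correlated steps.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the field names are illustrative only.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-capacity FIFO store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A bounded deque silently drops the oldest transition when full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive steps of an episode.
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


# Usage: push five dummy transitions into a buffer that only holds three.
buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)
batch = buffer.sample(batch_size=2)
```

The capacity is a trade-off: a larger buffer keeps older experience available for training, which is one commonly used way to reduce forgetting, while a smaller one tracks the current policy more closely.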