Q-learning CartPole problem included
MIGRATION
You can find the CartPole problem description (actions, states, and the rules for considering the problem solved) on the official OpenAI Gym website and in this post.
The steps to migrate CartPole to RL-Studio were the following:
- Adapt the RL-Studio project structure and configuration files so they accommodate a new simulator (OpenAI Gym)
- Generalize the Q-learning algorithm so it accepts different types of states
- Create a new environment that allows different configurations of actions, goals, steps, rewards, etc., so the problem can be adapted as preferred
- Create an inference mode for the CartPole problem
You can find all the tested iterations in the results uploaded to the repository.
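The central migration step, generalizing the Q-learning algorithm to accept different types of states, can be sketched as follows. This is a minimal illustration, not RL-Studio's actual API; the class and method names are made up for the example. The trick is to key the Q-table on any hashable state (e.g. a tuple of discretized CartPole observations) instead of a fixed state index:

```python
import random
from collections import defaultdict

class QLearner:
    """Tabular Q-learning that works with any hashable state
    (e.g. a tuple of discretized CartPole observations)."""

    def __init__(self, actions, alpha=0.8, gamma=0.95, epsilon=1.0):
        self.q = defaultdict(float)  # (state, action) -> Q-value, default 0.0
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def select_action(self, state):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update: move Q(s, a) toward the TD target.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

Because the Q-table is a dictionary keyed on `(state, action)` pairs, the same agent works for CartPole's discretized continuous observations and for the discrete states used elsewhere in the project.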
GOAL
The problem is considered solved when the average accumulated reward over 100 consecutive trials is at least 195.
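This solved criterion can be checked with a simple helper; the function name and signature here are illustrative, not part of RL-Studio:

```python
def is_solved(episode_rewards, threshold=195.0, window=100):
    """CartPole-v0 is considered solved when the average accumulated
    reward over the last `window` consecutive episodes is at least
    `threshold` (195 over 100 trials for the official criterion)."""
    if len(episode_rewards) < window:
        return False
    recent = episode_rewards[-window:]
    return sum(recent) / window >= threshold
```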
DEMO
As can be seen in the following video, the goal defined for CartPole-v0 in OpenAI Gym was achieved and surpassed.
The algorithm used in this experiment was Q-learning.
The hyperparameters are listed below:
- alpha (learning rate): 0.8
- gamma (discount factor): 0.95
- epsilon discount: 0.99999995
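A decay factor that close to 1 means epsilon shrinks very slowly, keeping exploration high for a long time. The sketch below shows how such a multiplicative decay is typically applied; the `epsilon_min` floor and the assumption that the decay runs once per step are illustrative, not taken from the experiment:

```python
alpha = 0.8                     # learning rate
gamma = 0.95                    # discount factor
epsilon = 1.0                   # initial exploration rate (assumed)
epsilon_discount = 0.99999995   # per-step multiplicative decay
epsilon_min = 0.05              # assumed floor, not stated in the original

def decay_epsilon(epsilon):
    # Multiplicative decay applied after every step, clipped at a floor.
    return max(epsilon_min, epsilon * epsilon_discount)

# With this factor, even after one million steps epsilon has only
# decayed to about 0.99999995 ** 1_000_000 ≈ 0.951.
```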