CARLA follow lane: DDPG vs PPO vs SAC [February]
We got it working using 2 networks, one for v and one for w. However, this is a shortcut that goes against our intuition, so we will go back to trying with 1 network.
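To make the difference between the two setups concrete, here is a minimal sketch in plain Python. It is not the code used in these experiments. It assumes v and w are the two continuous control commands produced by the policy, and the `Actor` class, the layer sizes, the 3-value state, and the `V_MAX` / `W_MAX` bounds are all hypothetical names and values chosen only for illustration.

```python
import math
import random

V_MAX = 1.0  # hypothetical upper bound on v (assumption, not from the experiments)
W_MAX = 1.0  # hypothetical bound on |w| (assumption, not from the experiments)


def make_layer(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x):
    """Apply a dense layer followed by tanh, so every output lies in (-1, 1)."""
    weights, biases = layer
    return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


class Actor:
    """Tiny untrained policy network: one hidden layer, n_outputs action values."""

    def __init__(self, state_dim, hidden_dim, n_outputs, rng):
        self.hidden = make_layer(state_dim, hidden_dim, rng)
        self.out = make_layer(hidden_dim, n_outputs, rng)

    def __call__(self, state):
        return dense(self.out, dense(self.hidden, state))


def scale(v_raw, w_raw):
    """Map raw tanh outputs to v in [0, V_MAX] and w in [-W_MAX, W_MAX]."""
    return (v_raw + 1.0) / 2.0 * V_MAX, w_raw * W_MAX


rng = random.Random(0)
state = [0.1, -0.2, 0.3]  # made-up 3-value state, for illustration only

# Setup A: two networks, one per command (the shortcut described above).
actor_v = Actor(3, 8, 1, rng)
actor_w = Actor(3, 8, 1, rng)
v2, w2 = scale(actor_v(state)[0], actor_w(state)[0])

# Setup B: one network with a shared hidden layer and two outputs.
actor_vw = Actor(3, 8, 2, rng)
v1, w1 = scale(*actor_vw(state))
```

In setup A the two commands share no parameters, so each network can only learn its own command from the state. In setup B the hidden layer is shared, which is closer to the intuition that v and w should be decided jointly.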
The reward and the learning process are explained in the following slides.