Difference between revisions of "Reinforcement Learning"
|  (→A2C) |  (→Training algorithms) | ||
| Line 7: | Line 7: | ||
| ===[https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO]=== | ===[https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO]=== | ||
| + | |||
| + | ===[https://spinningup.openai.com/en/latest/algorithms/sac.html SAC]=== | ||

