Difference between revisions of "Reinforcement Learning"
(→A2C)
(→Training algorithms)
Line 7:
  ===[https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO]===
+
+ ===[https://spinningup.openai.com/en/latest/algorithms/sac.html SAC]===