Difference between revisions of "Reinforcement Learning"
(→PPO) |
(→A2C) |
Line 4: | Line 4:
==Training algorithms== | ==Training algorithms==
− ===A2C=== | + ===[https://en.wikipedia.org/wiki/Advantage_Actor_Critic A2C]===
===[https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO]=== | ===[https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO]===