Difference between revisions of "Reinforcement Learning"
| (Created page with "   ==Training algorithms==  ===A2C===  ===PPO===") | (→PPO) |
| Line 6: | Line 6: |
| ===A2C=== | ===A2C=== |
| − | ===PPO=== | + | ===[https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO]=== |

