Difference between revisions of "Reinforcement Learning"

@@ Line 4: / Line 4: @@
 ==Training algorithms==
-===A2C===
+===[https://en.wikipedia.org/wiki/Advantage_Actor_Critic A2C]===
 ===[https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO]===

Revision as of 22:36, 24 April 2024