Difference between revisions of "Reinforcement Learning"

From Humanoid Robots Wiki
(PPO)
(A2C)
Line 4:
  ==Training algorithms==
  
- ===A2C===
+ ===[https://en.wikipedia.org/wiki/Advantage_Actor_Critic A2C]===
  
  ===[https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO]===

Revision as of 22:36, 24 April 2024


Training algorithms

A2C

PPO