Difference between revisions of "Reinforcement Learning"
(→Training algorithms) |
|||
| Line 1: | Line 1: | ||
| + | == Training algorithms == | ||
| + | * [https://en.wikipedia.org/wiki/Advantage_Actor_Critic A2C] | ||
| + | * [https://en.wikipedia.org/wiki/Proximal_policy_optimization PPO] | ||
| + | * [https://spinningup.openai.com/en/latest/algorithms/sac.html SAC] | ||
| + | == Resources == | ||
| + | * [https://mandi-zhao.gitbook.io/deeprl-notes Mandi Zhao's Reinforcement Learning Notes] | ||