Changes

Allen's Reinforcement Learning Notes

21 bytes added, 20:09, 24 May 2024


Consider a problem where we have to train a robot to pick up some object. A traditional ML algorithm might try to learn some function f(x) = y, where, given some position x observed via the camera, we output some behavior y. The trouble is that in the real world, the correct grab location depends on both the object and the physical environment, and that dependence is hard to determine from observation alone.

The idea behind reinforcement learning is to repeatedly take observations, then sample the effects of actions taken from those observations: the reward received and the new observation or state. Ultimately, we hope to create a policy <math>\pi</math> that maps states or observations to optimal actions.
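The observe, act, and sample loop described above can be sketched on a toy problem. The example below is only an illustration under stated assumptions, not the method used in these notes: the 1-D workspace, the <code>step</code> function, the reward of 1 for reaching the object, and the choice of tabular Q-learning are all hypothetical stand-ins for a real robot and camera.

```python
import random

N_POSITIONS = 5      # hypothetical 1-D workspace: gripper positions 0..4
TARGET = 4           # hypothetical position of the object
ACTIONS = (-1, +1)   # move left or move right


def step(state, action_index):
    """Toy environment: apply an action, return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_POSITIONS - 1)
    done = next_state == TARGET
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: sample actions, observe reward and next state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_POSITIONS)]
    for _ in range(episodes):
        state = rng.randrange(N_POSITIONS - 1)  # start anywhere except the target
        for _ in range(50):  # cap the episode length
            if rng.random() < epsilon:
                # Explore: try a random action to sample its effect.
                action = rng.randrange(len(ACTIONS))
            else:
                # Exploit: take the action currently believed to be best.
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def policy(q, state):
    """Greedy policy pi: map a state to the action with the highest Q-value."""
    return max(range(len(ACTIONS)), key=lambda a: q[state][a])


q_table = train()
learned = [ACTIONS[policy(q_table, s)] for s in range(N_POSITIONS - 1)]
print(learned)  # one move per non-target position
```

Here <code>policy</code> plays the role of <math>\pi</math>: it is never told where the object is, and is derived only from the rewards and next states sampled during training.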

=== Learning ===

Allen12


Humanoid Robots Wiki β
