[RL basics] Week 1. RL Intro
This page presents the basics of reinforcement learning (RL). It may be useful as a refresher or as an introduction for newcomers.
An agent learns from the environment by interacting with it through trial and error, receiving rewards as feedback.
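The interaction loop described above can be sketched with a toy environment. Everything here (the `CoinFlipEnv` class, its `reset`/`step` methods, the reward values, and the `run_episode` helper) is a hypothetical illustration, not a real library API.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses a coin flip each step."""

    def __init__(self, n_steps=10, seed=0):
        self.n_steps = n_steps
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        # Start a new episode; the state is simply the step index.
        self.t = 0
        return self.t

    def step(self, action):
        # Reward 1.0 for a correct guess, 0.0 otherwise.
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.t += 1
        done = self.t >= self.n_steps
        return self.t, reward, done


def run_episode(env, policy):
    """Run one episode: observe the state, act, collect the reward, repeat."""
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
    return total_reward


# An agent that always guesses 0.
total = run_episode(CoinFlipEnv(), policy=lambda state: 0)
```

A learning agent would additionally use each reward to update its policy; this sketch only shows the feedback loop itself.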
To achieve the best behavior, the agent must maximize the expected cumulative reward.
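As a minimal sketch, the cumulative reward of one episode can be computed from its list of rewards. The `gamma` discount factor is an assumption added for illustration (the notes above do not mention discounting); with the default `gamma=1.0` the function returns the plain sum.

```python
def cumulative_reward(rewards, gamma=1.0):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for one episode."""
    total = 0.0
    # Walk backwards so each step is: reward now + discounted future total.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


undiscounted = cumulative_reward([1, 2, 3])
discounted = cumulative_reward([1, 1, 1], gamma=0.5)
```

The "expected" part would come from averaging this quantity over many episodes, since both the policy and the environment may be random.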
The agent makes its decision using only the current state, with no memory of past states (the Markov property).
Complete observation (full information about the state)
Partial observation (only part of the state is visible)
Episodic task (ends in a terminal state)
Continuing task (no end)
Discrete action space (finite number of actions)
Continuous action space (infinite number of actions)
Exploration/exploitation tradeoff: balance exploiting actions already known to work against exploring new, unknown experiences.
👌 Use a stochastic policy (a probability distribution over actions)
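One way to obtain a stochastic policy is to turn per-action preference scores into probabilities with a softmax and then sample from them. This is a sketch under assumptions: the softmax choice and the names `softmax` and `sample_action` are illustrative, not something the notes prescribe.

```python
import math
import random


def softmax(preferences):
    """Convert arbitrary scores into a probability distribution."""
    m = max(preferences)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(preferences, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax(preferences)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Action 2 has the highest score, so it is picked most often,
# but actions 0 and 1 are still tried occasionally (exploration).
action = sample_action([1.0, 2.0, 3.0])
```

Because every action keeps a non-zero probability, the agent keeps exploring while still favoring the actions it currently rates highest.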
Policy-based: which action should be taken given the current state?
Value-based: which state has the highest value?
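The two questions above correspond to what are commonly called policy-based and value-based approaches. The sketch below contrasts them on a tiny hand-made example; the states, actions, values, and the `next_state` transition table are all hypothetical.

```python
# Policy-based: the mapping from state to action is stored directly.
policy = {"s0": "right", "s1": "right"}


def act_policy_based(state):
    return policy[state]


# Value-based: store a value per state and derive the action from it.
state_values = {"s0": 0.0, "s1": 0.5, "s2": 1.0}

# Hypothetical deterministic transitions: (state, action) -> next state.
next_state = {
    ("s0", "left"): "s0",
    ("s0", "right"): "s1",
    ("s1", "left"): "s0",
    ("s1", "right"): "s2",
}


def act_value_based(state, actions=("left", "right")):
    # Choose the action that leads to the highest-value next state.
    return max(actions, key=lambda a: state_values[next_state[(state, a)]])
```

In this example both approaches pick `"right"` in `s0` and `s1`; the difference is what is learned and stored: the policy itself, or the values from which a policy is derived.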