[RL basics] Week 2. Q-learning

Policy and Value Function • State-Value and State-Action • Bellman Equation • Monte Carlo & Temporal Difference