2.1 Reinforcement Learning

2.2 Multi-Agent Settings

$$ G=\langle S,U,P,r,Z,O,n,\gamma \rangle $$

2.3 Centralized vs Decentralized Control

Centralized Control

$$ \pi^C(u \mid s_t): U \times S \rightarrow [0,1] $$
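
As an illustration of the signature above, the following minimal sketch represents a centralized policy as a function that returns, for a given state, a probability for every action in $U$. It assumes that $u$ denotes a joint action (one action per agent) and that the state and action sets are small and finite; the sizes and the names `uniform_central_policy` and `sample_joint_action` are hypothetical and chosen only for this example.

```python
import itertools
import random

# Hypothetical toy problem sizes (assumptions, not taken from the text).
N_AGENTS = 2
AGENT_ACTIONS = [0, 1, 2]  # actions available to each individual agent
STATES = [0, 1]            # a small finite state set S

# Assumption: U is the set of joint actions, one entry per agent.
JOINT_ACTIONS = list(itertools.product(AGENT_ACTIONS, repeat=N_AGENTS))


def uniform_central_policy(state):
    """Return pi^C(. | state) as a dict mapping joint action -> probability.

    This placeholder ignores the state and spreads mass uniformly; a learned
    policy would return state-dependent probabilities instead.
    """
    p = 1.0 / len(JOINT_ACTIONS)
    return {u: p for u in JOINT_ACTIONS}


def sample_joint_action(policy, state, rng):
    """Draw one joint action u ~ pi^C(. | state)."""
    dist = policy(state)
    actions = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(0)
dist = uniform_central_policy(STATES[0])
u = sample_joint_action(uniform_central_policy, STATES[0], rng)
```

Under this joint-action assumption the controller must assign a probability to every combination of per-agent actions, so the table grows as $|\text{per-agent actions}|^n$ (here $3^2 = 9$ entries per state).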

Decentralized Control