Chapter Summaries
1. INTRODUCTION
2. BACKGROUND
3. Counterfactual Multi-Agent Policy Gradients
Reference
Benchmarking
- MAPPO (appears to be the current mainstream approach)
- QMIX
- BENCHMARKING MULTI-AGENT DEEP REINFORCEMENT LEARNING ALGORITHMS (link)

- MARL algorithms supported by RLlib

MAPPO (1)