Trust Region Policy Optimization vs Proximal Policy Optimization
Actor-critic methods, A2C, A3C
Explore policy-based methods and dive into policy gradients
Fixed Q-targets, Double DQN, Dueling DQN, Prioritized Replay
Learn what Q-learning is and build a Deep Q-Network to play games
The central idea behind reinforcement learning and an overview of its algorithms