Contents:
Coursework Template
Basics of the RLDurham package
Practical 2: Multi-Armed Bandits
Example
Practical 4: Dynamic Programming
Lecture 5: Monte Carlo Methods
Lecture 4: Dynamic Programming
Practical 3: Markov Decision Processes
Practical 6: Temporal Difference Learning
Lecture 2: Gym