Agent-environment interactions, exploration vs. exploitation, and the epsilon-greedy algorithm for playing with Gaussian bandits.
An introduction to reinforcement learning…