Blog post about a Q-learning model of reinforcement learning (RL)

bnicenboim · December 6, 2021, 9:18am

I wrote a blog post about Q-learning, a type of reinforcement learning (RL), with simulations of human data and both non-hierarchical and hierarchical Bayesian models in Stan, here.

I simulated rewards from a restless 2-arm bandit: basically 2 slot machines whose rewards change across trials. Then I simulated one subject that uses Q-learning to choose an arm on each trial, fit the model to the data, and recovered the parameters.
Finally, I simulated several subjects with a hierarchical structure, fit a hierarchical Bayesian model, and again recovered the parameters.
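
For readers who want a feel for the setup before opening the blog post, here is a minimal Python sketch of the single-subject simulation described above. It is not the code from the blog post (which uses Stan for the fitting), and several details are assumptions made only for illustration: reward probabilities drift as a clipped Gaussian random walk, choices follow a softmax (logistic) rule with inverse temperature `beta`, Q-values start at 0.5, and `alpha` is the learning rate. All function names are hypothetical.

```python
import math
import random


def simulate_restless_bandit(n_trials, sd_drift=0.05, seed=1):
    """Reward probabilities for 2 arms that drift from trial to trial.

    Assumption: each arm follows a Gaussian random walk clipped to [0.05, 0.95].
    """
    rng = random.Random(seed)
    p = [0.5, 0.5]
    probs = []
    for _ in range(n_trials):
        probs.append(tuple(p))
        p = [min(0.95, max(0.05, pk + rng.gauss(0, sd_drift))) for pk in p]
    return probs


def simulate_q_learner(probs, alpha, beta, seed=2):
    """Simulate one subject choosing between 2 arms with Q-learning.

    Assumption: softmax choice rule; Q-values initialised at 0.5.
    """
    rng = random.Random(seed)
    q = [0.5, 0.5]
    choices, rewards = [], []
    for p in probs:
        # Probability of choosing arm 1 (two-arm softmax is a logistic function).
        p_arm1 = 1 / (1 + math.exp(-beta * (q[1] - q[0])))
        c = 1 if rng.random() < p_arm1 else 0
        r = 1 if rng.random() < p[c] else 0
        # Q-learning update: move the chosen arm's value toward the reward.
        q[c] += alpha * (r - q[c])
        choices.append(c)
        rewards.append(r)
    return choices, rewards


def log_likelihood(choices, rewards, alpha, beta):
    """Log-likelihood of the observed choices under the same model."""
    q = [0.5, 0.5]
    ll = 0.0
    for c, r in zip(choices, rewards):
        p_arm1 = 1 / (1 + math.exp(-beta * (q[1] - q[0])))
        ll += math.log(p_arm1 if c == 1 else 1 - p_arm1)
        q[c] += alpha * (r - q[c])
    return ll


probs = simulate_restless_bandit(200)
choices, rewards = simulate_q_learner(probs, alpha=0.3, beta=3.0)
```

The `log_likelihood` function is the piece a Bayesian model would combine with priors on `alpha` and `beta`; in a hierarchical version, each subject would get their own `alpha` and `beta` drawn from group-level distributions.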

I hope it’s useful! I did it mainly to learn more about RL, and I’m not an expert so feel free to tell me if I messed it up somewhere.

Bruno
