Stan code for fitting simple RL model

lyoon · June 9, 2018, 5:05pm

Dear. Stan Users

Hi, I am trying to estimate learning rates of Reinforcement Learning model using Stan. The Reinforcement Learning model is somewhat modified from traditional one for my experimental task.

The equation of my model is :
SV(t)=𝑺𝑽(𝒕−𝟏)+𝒂∗(𝑭𝑩(𝒕)−𝑺𝑽(𝒕−𝟏)) if trial > 1
SV(t)=𝒂∗𝑭𝑩(𝒕) if trial = 1
The equation for the decision probability is :
Logit[P(Decision)] = β(SV) xSV + β(OQ) xOQ+ c

I tried to make code with stan code of others who used RL model (but not exactly same one with mine) but I do not have confident about this code and it gives error like :
Too many indexes for variable; variable name = PE; num dimensions given = 2; variable array dimensions = 0
error in ‘modelfe43c8a760b_Only3Subjects’ at line 65, column 16

63:       // prediction error 
64:       if(t == 1){
65:       PE[i,t] = feedback[i,t] - 0; 
                   ^
66:       } else {

Maybe there are other errors other than this one. I am not that familiar to R or Stan and instead more familiar to Matlab so that you may see some confusing parts in this code.
Below is the full code of mine. I would be greatly appreciated if anyone provides wrong parts and how I can deal with them.
Thank you so much.

This data is for 3 individuals who have 75 trials each.

// model with alpha and beta
data {
  int<lower=1> N; // number of subjects
  int<lower=1> T; // number of trials
  int<lower=1> Tsubj[N]; // number of trials per usbject    
  int<lower=-1,upper=1> choice[N,T]; // decision data
  real<lower=-1, upper=1> feedback[N,T];  // feedback data
  int<lower=1, upper=5> OQ[N,T]; // data of the variable named OQ other than feedback
}

transformed data {
  vector[1] initV;  // initial value for SV (Self-Value or Stimulus Value)
  initV = rep_vector(0.0, 1);
}

parameters {
// Declare all parameters as vectors for vectorizing
  // Hyper(group)-parameters  
  vector[3] mu_p;  
  vector<lower=0>[3] sigma;
    
  // Subject-level raw parameters (for Matt trick)
  vector<lower=0, upper=1>[N] alpha_pr;    // learning rate [0, 1]
  vector[N] beta_FB_pr;  // inverse temperature for feedback (outcome)
  vector[N] beta_OQ_pr; // inverse temperature for OQ
}

transformed parameters {
  // subject-level parameters
  vector<lower=0,upper=1>[N] alpha;
  vector[N] beta_FB; // for FB
  vector[N] beta_OQ; // for OQ
   
  for (i in 1:N) {
    alpha[i] = Phi_approx( mu_p[1] + sigma[1] * alpha_pr[i] );
    beta_FB[i] = Phi_approx( mu_p[2] + sigma[2] * beta_FB_pr[i] ) * 5;
    beta_OQ[i] = Phi_approx( mu_p[3] + sigma[3] * beta_OQ_pr[i] ) * 5;
  }
}

model {
  // Hyperparameters
  mu_p  ~ normal(0, 1); 
  sigma ~ normal(0, 1);  
  
  // individual parameters
  alpha_pr ~ normal(0,1);
  beta_FB_pr ~ normal(0,1);
  beta_OQ_pr ~ normal(0,1);
    
  // subject loop and trial loop
  for (i in 1:N) {
    vector[1] sv; // self-value (stimulus value)
	real PE;      // prediction error
	

    sv = initV;
    
   
    for (t in 1:(Tsubj[i])) {
    	
    	    
      // prediction error 
      if(t == 1){
      PE[i,t] = feedback[i,t] - 0; 
      } else {
      PE[i,t] = feedback[i,t] - sv[i,t-1]; 
      }

      // value updating (learning) 
      if(t == 1){
      	sv[i,t] = alpha[i]*feedback[i,t]; 
      	} else {
      	sv[i,t] = sv[i,t-1]+alpha[i]*PE[i,t];
      	}
      
      // compute decision probabilities
      choice[i,t] ~ categorical_logit( beta_FB[i]*sv + beta_OQ[i]*OQ ); 
     
	
    }
  }
}

generated quantities {
}

To include mathematical notation in your post put LaTeX syntax between two $ symbols, e.g.,
p(\theta | y) \propto p(\theta) p(y | \theta).

Bob_Carpenter · July 20, 2018, 2:50pm

The probelm is that PE is being declared as a scalar but is being used as a 2D array. If it’s just a local variable, then you probably just want:

real PE = t == 1 ? feeback[i, t] : feedback[i, t] - sv[i, t-1];

shen_zhang · December 12, 2018, 8:54pm

Hi, lyoon,
Does your code work now?
I’m trying to fit a RL model with stan, but I got
the “divergent after warmup” warning,
And I try the transform in your code:
alpha[i] = Phi_approx( mu_p[1] + sigma[1] * alpha_pr[i] );
beta_FB[i] = Phi_approx( mu_p[2] + sigma[2] * beta_FB_pr[i] ) * 5;
beta_OQ[i] = Phi_approx( mu_p[3] + sigma[3] * beta_OQ_pr[i] ) * 5;
and it doesn’t work.

Topic		Replies	Views
Stan model 'RL_RW' does not contain samples Modeling rstan	7	650	June 17, 2022
Bugs from A dynamic reinforcement learning model Modeling cognitive-science	7	663	January 6, 2020
Issues with running Bayesian hierarchical reinforcement learning model in Stan Modeling rstan , fitting-issues , specification , performance	12	827	June 28, 2021
Repeated measure hierarchical reinforcement learning with groups (2 x 2 x 2 design) Modeling rstan , fitting-issues , hierarchical-model , model-comparison , cognitive-science	7	2015	July 28, 2020
How to declare a variable in model block Modeling	24	4537	August 12, 2018

Stan code for fitting simple RL model

Related topics