I want to use a Rescorla-Wagner reinforcement learning model to explore how subjects learn an underlying 4-level hierarchy. left[nTrials] and right[nTrials] are the stimuli, each with a latent rank in the hierarchy, presented in pairs. On each trial the subject is asked to select the higher-ranked stimulus (choice[nTrials]); a correct choice gives reward=1 and an incorrect choice gives reward=-1. But my model keeps failing with the following errors:
‘Stan model ‘RL_RW’ does not contain samples.’
SAMPLING FOR MODEL ‘RL_RW’ NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: categorical_logit_lpmf: categorical outcome out of support is 4, but must be in the interval [1, 2] (in ‘model21585021459_RL_RW’ at line 30)
This is my Stan model code:
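data{
  int<lower=1> nTrials;
  int<lower=1,upper=4> left[nTrials];    // stimulus shown on the left (1-4)
  int<lower=1,upper=4> right[nTrials];   // stimulus shown on the right (1-4)
  int<lower=1,upper=4> choice[nTrials];  // stimulus selected by the subject (1-4)
  int<lower=-1,upper=1> reward[nTrials]; // 1 = correct, -1 = incorrect
}
parameters{
  real<lower=0,upper=1> alpha; // learning rate
  real<lower=0,upper=3> tau;   // scaling of the values in the choice rule
}
model{
  vector[4] V_4; // current value of each of the 4 stimuli
  vector[2] V_2; // values of the two stimuli shown on this trial
  real pe_l;     // prediction error for the left stimulus
  real pe_r;     // prediction error for the right stimulus
  V_4=rep_vector(0,4);
  for(t in 1:nTrials){
    V_2[1]=V_4[left[t]];
    V_2[2]=V_4[right[t]];
    choice[t]~ categorical_logit(tau*V_2);
    //value update
    if((choice[t]==left[t] && reward[t]==1) || (choice[t]==right[t] && reward[t]==-1)){
      pe_l=1-V_4[left[t]];
      pe_r=-1-V_4[right[t]];
    }else{
      pe_l=-1-V_4[left[t]];
      pe_r=1-V_4[right[t]];
    }
    V_4[left[t]]=V_4[left[t]]+alpha*pe_l;
    V_4[right[t]]=V_4[right[t]]+alpha*pe_r;
  }
}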
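
One possible reading of the exception, offered as a sketch rather than a confirmed fix: categorical_logit is given the 2-element vector tau*V_2, so it only accepts outcomes 1 or 2, while choice[t] holds a stimulus label from 1 to 4. The version below recodes the choice as a position (1 = left, 2 = right) in a transformed data block before it reaches the likelihood, and updates each stimulus with its own prediction error. The name side is hypothetical. The sketch assumes that choice[t] always equals either left[t] or right[t], that the two stimuli in a pair differ, and that the installed Stan version still accepts the array syntax used in this post.

```stan
data{
  int<lower=1> nTrials;
  int<lower=1,upper=4> left[nTrials];
  int<lower=1,upper=4> right[nTrials];
  int<lower=1,upper=4> choice[nTrials];
  int<lower=-1,upper=1> reward[nTrials];
}
transformed data{
  // side is a hypothetical helper: 1 if the left stimulus was chosen, 2 otherwise.
  // Assumes choice[t] is always left[t] or right[t].
  int<lower=1,upper=2> side[nTrials];
  for(t in 1:nTrials){
    if(choice[t]==left[t]){
      side[t]=1;
    }else{
      side[t]=2;
    }
  }
}
parameters{
  real<lower=0,upper=1> alpha;
  real<lower=0,upper=3> tau;
}
model{
  vector[4] V_4;
  vector[2] V_2;
  real pe_l;
  real pe_r;
  V_4=rep_vector(0,4);
  for(t in 1:nTrials){
    V_2[1]=V_4[left[t]];
    V_2[2]=V_4[right[t]];
    // the outcome is now 1 or 2, matching the length of V_2
    side[t]~ categorical_logit(tau*V_2);
    // value update: same rule as in the post, written in terms of side
    if((side[t]==1 && reward[t]==1) || (side[t]==2 && reward[t]==-1)){
      pe_l=1-V_4[left[t]];
      pe_r=-1-V_4[right[t]];
    }else{
      pe_l=-1-V_4[left[t]];
      pe_r=1-V_4[right[t]];
    }
    V_4[left[t]]=V_4[left[t]]+alpha*pe_l;
    V_4[right[t]]=V_4[right[t]]+alpha*pe_r;
  }
}
```

With this recoding the data passed in from R stay the same; only the outcome handed to categorical_logit changes.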