Initialization failed when fitting a relatively simple RL model

WUUUUU · February 24, 2022, 4:28am

Hello,

I’ve been trying to fit an RL model similar to the one provided in this paper (Dynamic Interaction between Reinforcement Learning and Attention in Multidimensional Environments - ScienceDirect), the model can be found under “choice models” section. I’m getting an Error in sampler$call_sampler(c(args, dotlist)) : Initialization failed. error and I am not sure how to fix it.

In my model, there are 3 image columns with 3 features each (total of 9 features). Each feature has its own attention parameter, which I am trying to fit. Other than that, I want to fit a beta parameter for inverse temperature in softmax function, and a learning rate. For more information, please refer to the paper I provided above. Also, I’ve tried printing out things and it’s just not printing for some reason.

Thank you very much!

data {
int<lower = 0> num_cols; //number of image columns
int<lower = 0> num_features; // total number of features
int<lower = 0> num_trials; //number of trials
int<lower = 1> col_choice[num_trials]; //index of which arm was pulled
int<lower = 1> col_1_top_feature[num_trials]; // index of top feature of left column
int<lower = 1> col_1_mid_feature[num_trials]; // index of middle feature of left column
int<lower = 1> col_1_bottom_feature[num_trials]; // index of bottom feature of left column
int<lower = 1> col_2_top_feature[num_trials]; // index of top feature of middle column
int<lower = 1> col_2_mid_feature[num_trials]; // index of middle feature of middle column
int<lower = 1> col_2_bottom_feature[num_trials]; // index of bottom feature of middle column
int<lower = 1> col_3_top_feature[num_trials]; // index of top feature of right column
int<lower = 1> col_3_mid_feature[num_trials]; // index of middle feature of right column
int<lower = 1> col_3_bottom_feature[num_trials]; // index of bottom feature of right column
int<lower = 1> col_chosen_top_feature[num_trials]; // index of top feature of chosen column
int<lower = 1> col_chosen_mid_feature[num_trials]; // index of middle feature of chosen column
int<lower = 1> col_chosen_bottom_feature[num_trials]; // index of bottom feature of chosen column
int<lower = 0> result[num_trials]; //outcome of bandit arm pull
}

parameters {
real<lower = 0, upper = 1> alpha; //learning rate
real<lower=0> beta; //softmax parameter - inverse temperature
vector<lower = 0, upper = 1>[num_features] attention_params;
}

transformed parameters {
vector<lower=0, upper=1>[num_features] V[num_trials]; // value of each feature
vector<lower=0, upper=1>[num_cols] col_values[num_trials]; // value of each column
real delta[num_trials]; // prediction error

for (trial in 1:num_trials) {

//set initial V and delta for each trial
if (trial == 1) {
  //if first trial, initialize feature and column values to 0 
  for (c in 1:num_cols) {
    col_values[1, c] = 0;}
    
  for (f in 1:num_features) {
      V[1, f] = 0;
    }

} else {
  for (f in 1:num_features) {
      V[trial, f] = V[trial - 1, f];;
    }

//get column values by summing their respective features
col_values[trial, 1] = V[trial, col_1_top_feature[trial]] +
V[trial,col_1_mid_feature[trial]] + V[trial, col_1_bottom_feature[trial]];
col_values[trial, 2] = V[trial, col_2_top_feature[trial]] +
V[trial,col_2_mid_feature[trial]] + V[trial, col_2_bottom_feature[trial]];
col_values[trial, 3] = V[trial, col_3_top_feature[trial]] +
V[trial,col_3_mid_feature[trial]] + V[trial, col_3_bottom_feature[trial]];}

//calculate prediction error to update features V's 
delta[trial] = result[trial] - col_values[trial,col_choice[trial]];

//update feature V values based on prediction error (delta), learning rate (alpha), and attention
V[trial, col_chosen_top_feature[trial]] = V[trial, col_chosen_top_feature[trial]]  + alpha * delta[trial] * attention_params[col_chosen_top_feature[trial]];
V[trial, col_chosen_mid_feature[trial]] = V[trial, col_chosen_mid_feature[trial]]  + alpha * delta[trial] * attention_params[col_chosen_mid_feature[trial]];
V[trial, col_chosen_bottom_feature[trial]] =  V[trial, col_chosen_bottom_feature[trial]]  + alpha * delta[trial] * attention_params[col_chosen_bottom_feature[trial]];

}
}

model {
// priors
beta ~ gamma(2, 3);
alpha ~ gamma(2, 3);
attention_params[1] ~ gamma(2,3);
attention_params[2] ~ gamma(2,3);
attention_params[3] ~ gamma(2,3);
attention_params[4] ~ gamma(2,3);
attention_params[5] ~ gamma(2,3);
attention_params[6] ~ gamma(2,3);
attention_params[7] ~ gamma(2,3);
attention_params[8] ~ gamma(2,3);
attention_params[9] ~ gamma(2,3);

for (trial in 1:num_trials) {
//returns the probability of having made the choice you made, given your beta and your V’s
target += log_softmax(col_values[trial] * beta)[col_choice[trial]];
}
}

WUUUUU · February 24, 2022, 8:25pm

I wrote an earlier post that contained the wrong code and so I closed it and posted this one. Would still appreciate help on this post.

Thank you again!

Vanessa_Brown · May 13, 2022, 8:43pm

hi, not sure if you still need help but I can chime in since no one else has responded. do you have any more of the error output or is that all that was returned?

at first glance, it looks like the code to increment the log density is incorrect, see 7.4 Sampling statements | Stan Reference Manual. I think what you want would look something more like the following (note the use of categorical_logit, which includes the softmax transformation):

target += categorical_logit(col_choice[trial] | col_values[trial]*beta);

Topic		Replies	Views
Initialization failed error - Attempt to fit model using rstan Modeling rstan , techniques , fitting-issues	1	458	May 30, 2020
Simple reinforcement learning model fails to initialize - Reproducible Example Modeling rstan , fitting-issues , cognitive-science	1	576	May 29, 2020
Failure to start because of initial values Modeling	16	3532	July 31, 2017
Initialization failure in rstan Modeling	1	550	May 15, 2020
Failed initialization--New to STAN, not sure how to debug Modeling fitting-issues , specification	3	2084	November 25, 2017

Initialization failed when fitting a relatively simple RL model

Related topics