Hi Everyone,

I am new to stan and trying to bring the magic to my team. We wrote a simple code to estimate two parameters for a Q-learner. Here, a player is asked to chose between 1 to N cards (or arms), each leading to reward on some drifting probability. The model has two parameters (learning rate and a softmax temperature). While the model and parameters recover well (>.9 corr between recovered and true parameters), we need to fit this to big data, and it seem to be painfully slow (up to 15 hours or more).

Would you have any advice on how to write a better optimized code? Is there anything I can code differently that will help this run faster? Any advice would be really fantastic. We got so far on our own, but I canâ€™t really find a way to get this to be more efficient.

Thanks a million,

Nitzan

```
//General fixed parameters for the experiment/models
int<lower = 1> Nsubjects; //number of subjects
int<lower = 1> Ntrials; //number of trials without missing data, typically the maximum of trials per subject
int<lower = 1> Ntrials_per_subject[Nsubjects]; //number of trials left for each subject after data omission
//Behavioral data
int<lower = 0> action[Nsubjects,Ntrials]; //index of which arm was pulled coded 1 to 4
int<lower = 0> reward[Nsubjects,Ntrials]; //outcome of bandit arm pull
}
transformed data{
int<lower = 1> Nparameters; //number of parameters
int<lower = 2> Narms; //number of overall alternatives
Nparameters=2;
Narms =4;
}
parameters {
//population level parameters
vector[Nparameters] mu; //vector with the population level mean for each model parameter
vector<lower=0>[Nparameters] tau; //vector of random effects variance for each model parameter
cholesky_factor_corr[Nparameters] L_Omega; //lower triangle of a correlation matrix to be used for the random effect of the model parameters
//subject level parameters
vector[Nparameters] auxiliary_parameters[Nsubjects];
}
transformed parameters {
//population level
matrix[Nparameters,Nparameters] sigma_matrix;
//individuals level
real alpha[Nsubjects];
real beta[Nsubjects];
//additional variabels
real log_lik[Nsubjects,Ntrials];
vector<lower=0, upper=1>[Narms] Qcard;
//pre-assignment
sigma_matrix = diag_pre_multiply(tau, (L_Omega*L_Omega')); //L_Omega*L_omega' give us Omega (the corr matrix).
sigma_matrix = diag_post_multiply(sigma_matrix, tau); // diag(tau)*omega*diag(tau) gives us sigma_matirx (the cov matrix)
log_lik=rep_array(0,Nsubjects,Ntrials);
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
for (subject in 1:Nsubjects){
alpha[subject] = inv_logit(auxiliary_parameters[subject][1]);
beta[subject] = exp(auxiliary_parameters[subject][2]);
//pre-assignment of Qvalues
for (a in 1:Narms) Qcard[a] = 0;
//trial by trial loop
for (trial in 1:Ntrials_per_subject[subject]){
//liklihood function (softmax)
log_lik[subject,trial]=log_softmax(Qcard*beta[subject])[action[subject,trial]];
//Qvalues update
Qcard[action[subject,trial]] += alpha[subject] * (reward[subject,trial] - Qcard[action[subject,trial]]);
}
}
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
}
model {
// population level priors (hyper-parameters)
mu ~ normal(0, 5); // mu is a vector 1xNparameters with the population mean (i.e., location) for each model parameter
tau ~ cauchy(0, 1); //tau is the hyperparameters variance vector
L_Omega ~ lkj_corr_cholesky(2); //L_omega is the lower triangle of the correlations. Setting the lkj prior to 2 means the off-diagonals are priored to be near zero
// indvidual level priors (subject parameters)
auxiliary_parameters ~ multi_normal(mu, sigma_matrix);
target += sum(log_lik);
}
```