Hello,
I encountered convergence problems when fitting a hierarchical drift diffusion model to reaction time data (Rhat>1, see below). In order to resolve the problem i initially thought that it might help to

supply proper initial values (e.g. values provided by a corresponding MLEfit)

use a (strong) informative prior on the parameter v_{in} (the parameter causing the problems)

change the adapt and maximum treedepth setting when calling stan().
However, none of which settled the problem. I should note, however, that the problem is solely caused by the introduction of the v_{in} parameter (across trial variability in drift rates). That is, once I fit a corresponding model with v_{in}=0, everything works fine. This seems very puzzling, because the parameter is necessary to model differences in the response times for error and correct trials. As the data show substantial differences in the corresponding response times, i would think that v_{in} should not be poorly identified.
I would be very grateful for any help and suggestions.
data {
int<lower=1> N; // total number of observations
int<lower=1> n; //number of subjects
vector[N] Y; // response times
int<lower=0,upper=1> dec[N]; // decisions zero/one coded
int<lower=1> k; // number of distinct conditions (experimental + stimulus)
int<lower=1> ncue; // number of distinct experimental conditions
matrix[n,k] ob; // upper bounds for nondecision times
int<lower=1> J_1[N]; // to which subject does the observation belong to?
int<lower=1> I_1[N]; // to which condition does the observation belong to?
int<lower=1> g[N]; // to which experimental condition does the observation belong to?
}
parameters {
matrix[n,k] theta; // drift (mean) parameters; 1 per subject x condition
vector[N] drifts; // drift rates for each trial
matrix<lower=0>[n,ncue] a; // boundary parameter; 1 per subject x experimental condition
matrix<lower=0, upper=1>[n,k] ndt_raw; // non decision times as a proportion of the given upper bounds
vector[k] v_mu; // group level mean drift rate
vector<lower=0>[k] v_sig; // variation of drift rates on the group level
vector<lower=0>[ncue] a_mu; // group level mean boundary separation
vector<lower=0>[ncue] a_sig; // variation of boundary separation on the group level
real<lower=0> v_in; // variation of drift rates across trials; causes Rhat>1
}
transformed parameters{
matrix[n,k] ndt=ndt_raw .* ob;
}
model {
v_in ~ gamma(4,3);
for (l in 1:n){
for(j in 1:k){
ndt_raw[l,j] ~ beta(1, 1);
}}
for (l in 1:n) {
for(j in 1:k){
theta[l,j] ~ normal(v_mu[j],v_sig[j]);
}}
for (l in 1:n){
for(j in 1:ncue){
a[l,j] ~ normal(a_mu[j],a_sig[j])T[0,];
}}
for (j in 1:k) {
v_mu[j] ~ normal(0, 1);
v_sig[j] ~ inv_gamma(4, 3);
}
for (j in 1:ncue) {
a_mu[j] ~ inv_gamma(4, 3);
a_sig[j] ~ inv_gamma(4, 3);
}
for (j in 1:N) {
drifts[j] ~ normal(theta[J_1[j],I_1[j]],v_in); // v_in enters the model
if(dec[j]==1)
{Y[j] ~ wiener(a[J_1[j],g[j]],ndt[J_1[j],I_1[j]],0.5,drifts[j]);}
else{Y[j] ~ wiener(a[J_1[j],g[j]],ndt[J_1[j],I_1[j]],0.5,drifts[j]);}
}
}
Output example:
Output:
1: There were 14 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mcstan.org/misc/warnings.html#maximumtreedepthexceeded
2: There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mcstan.org/misc/warnings.html#bfmilow
$summary
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
v_in 0.5369669 0.02392339 0.07211375 0.2639275 0.5273514 0.5538154 0.5723473 0.6135418 9.086373 1.291439