Dear Stan community,
I’m trying to fit a multi-level drift diffusion model (DDM) based on the Wiener first passage time distribution (here in the Stan documentation; here in the Stan math library).
In my task, the correct response for participants to give is sometimes a NoGo response, which yields no reaction time (RT). The way to deal with this is to integrate over RTs for the NoGo boundary, for which an approximation is given in Blurton, Kesselmeier, & Gondan (2012, Journal of Mathematical Psychology); see especially the appendix for R and Matlab implementations. I use a simplification that has been suggested on the HDDM mailing list (HDDM is a Python wrapper package that fits DDMs in PyMC). This simplification is, for example, implemented here in the prob_ub function, which is then called here in the wiener_like_multi function. Because I treat NoGo as the lower boundary, I compute 1 minus the probability of reaching the upper boundary.
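Concretely, with boundary separation $\alpha$, starting-point bias $\beta$, and drift rate $\delta$, the closed-form probability I use for a NoGo trial (absorption at the lower boundary) is

$$
P(\text{lower}) = 1 - \frac{e^{-2\alpha\beta\delta} - 1}{e^{-2\alpha\delta} - 1},
$$

which tends to $1 - \beta$ as $\delta \to 0$; that limit is what the delta == 0 branch in my Stan function below returns.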
I’ve tried this implementation myself on a simple data set (only 1 participant) simulated with the RWiener package and then fitted via maximum likelihood estimation using optim(). Given the loss of information due to missing RTs, it works reasonably well, in my opinion.
I’ve now tried to implement this in Stan as a custom likelihood function. I get reasonable parameter estimates, but no log-likelihoods. Also, I see reasons why it might blow up for larger data sets and more complicated model versions. My questions are:
- Is this definition formally correct (or am I somehow misinterpreting the way custom functions in Stan should be defined)?
- While I get sensible parameters (compared to ground truth and to parameters estimated with MLE), the model somehow does not compute the log-likelihood (log_lik): I get NaNs there, even though I use the exact same statement as for incrementing the target density. What am I doing wrong here?
- This is a simple toy example with little data, but I’m afraid that it might blow up on bigger data sets (say 35 people with 250 data points per subject, so 8750 data points in total). I’m particularly concerned about:
a. whether just taking the log at the end is the most efficient and numerically suitable approach;
b. what happens if, for NoGo responses, the likelihood evaluates to 0, so that I get log(0) for the log-likelihood. For now I have just added machine_precision(); is there a better way to do this? (See the sketch after this list for one alternative I’ve considered.)
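Regarding (a) and (b), one alternative I have sketched but not tested would be to stay on the log scale throughout using Stan’s log_diff_exp and log1m_exp, which should avoid both the final log() and the machine_precision() hack, at least for the delta != 0 case (the function name and the factoring are just my own):

real nogo_lower_log_prob(real alpha, real beta, real delta) {
  // log P(absorption at lower boundary), assuming delta != 0 and 0 < beta < 1;
  // algebraically: log[(exp(-2*alpha*beta*delta) - exp(-2*alpha*delta)) / (1 - exp(-2*alpha*delta))]
  if (delta > 0) {
    return log_diff_exp(-2 * alpha * beta * delta, -2 * alpha * delta)
           - log1m_exp(-2 * alpha * delta);
  } else {
    return log_diff_exp(-2 * alpha * delta, -2 * alpha * beta * delta)
           - log_diff_exp(-2 * alpha * delta, 0);
  }
}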
See my implementation below:
library(RWiener)
library(rstan)
## Define data settings:
nTrial <- 1000
par <- c(1, 0.2, 0.75, 0.5) # parameters alpha, tau, beta, delta
## Simulate data:
set.seed(0)
dat <- rwiener(n = nTrial, alpha = par[1], tau = par[2], beta = par[3], delta = par[4]) # simulate data
## Define custom deviance function to allow for NoGo responses:
wiener_deviance_gonogo <- function(x, data){
  #' Return deviance for Go/NoGo data. NoGo at lower boundary.
  #' x: vector with the four parameter values: alpha, tau, beta, delta
  #' data: data frame. Needs a reaction time column \param{q} with RTs in seconds
  #' and an accuracy/response column \param{resp} with values "upper" or "lower".
  nData <- nrow(data)
  devSum <- 0
  ## Loop through data:
  for (iData in 1:nData){
    dat <- data[iData, ] # select trial
    if (dat$resp == "upper"){ # if Go:
      myDeviance <- wiener_deviance(x, dat)
    } else { # if NoGo:
      alpha <- x[1]; beta <- x[3]; delta <- x[4] # extract parameters
      lik <- 1 - (exp(-2*alpha*beta*delta) - 1) / (exp(-2*alpha*delta) - 1)
      myDeviance <- -2*log(lik) # convert to deviance
    }
    devSum <- devSum + myDeviance # sum up deviance
  }
  return(devSum)
}
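As a quick sanity check of the closed-form NoGo probability (not strictly necessary, but reassuring): it should match a numerical integration of RWiener’s defective lower-boundary density. I wrap dwiener in Vectorize() since integrate() needs a vectorized integrand:

## Closed-form probability of absorption at the lower boundary:
pNoGoClosed <- 1 - (exp(-2*par[1]*par[3]*par[4]) - 1) / (exp(-2*par[1]*par[4]) - 1)
## Numerical integration of the lower-boundary density over all possible RTs:
densLower <- Vectorize(function(t){
  dwiener(t, alpha = par[1], tau = par[2], beta = par[3], delta = par[4], resp = "lower")
})
pNoGoNum <- integrate(densLower, lower = par[2] + 1e-6, upper = Inf)$value # start just above tau
## pNoGoClosed and pNoGoNum should agree up to integration error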
## Fit with MLE via optim():
fitMLE <- optim(c(1, .1, .1, 1), wiener_deviance, dat=dat, method="Nelder-Mead") # fit based on all data
# fitMLE$par
# [1] 0.9803655 0.1984600 0.7304328 0.5774809
fitMLE <- optim(c(1, .1, .1, 1), wiener_deviance_gonogo, dat=dat, method="Nelder-Mead") # fit with lower-boundary responses treated as NoGo (only Go RTs enter the likelihood)
fitMLE$par
# [1] 0.9444393 0.1980153 0.7139110 0.6735721
# Reasonable fit given that ground truth par was
# 1, 0.2, 0.75, 0.5
## Prepare data for Stan:
standata <- list(y = dat$q,
                 resp = dat$resp,
                 N = length(dat$resp),
                 RTbound = 0.1,
                 minRT = min(dat$q))
if (is.factor(standata$resp)){standata$resp <- ifelse(standata$resp == "upper", 1, 0)}
standata$y <- ifelse(standata$resp == 1, standata$y, 0) # set RTs on NoGo trials to 0
And now in Stan:
wiener_model_gonogo <- "
functions {
/* Returns the Wiener first passage time log density if resp == 1 (Go), i.e., when an RT is available;
 * otherwise integrates over RTs (NoGo) based on the solution in Blurton, Kesselmeier, & Gondan (2012, Journal of Mathematical Psychology),
 * see also the discussion under https://groups.google.com/g/hddm-users/c/3hlU6RnNcrw ,
 * and the implementation in function prob_ub in HDDM: https://github.com/hddm-devs/hddm/blob/master/src/pdf.pxi ,
 * which is called in function wiener_like_multi in HDDM: https://github.com/hddm-devs/hddm/blob/master/src/wfpt.pyx .
 * Args:
 *   y    : reaction time (in seconds)
 *   resp : response, either 1 (Go) or 0 (NoGo)
 *   alpha: boundary separation
 *   tau  : nondecision time
 *   beta : a-priori bias
 *   delta: drift rate
 * Returns:
 *   log probability density for RT/resp given parameters
 */
  real wiener_gonogo_lpdf(real y, real resp, real alpha, real tau, real beta, real delta) {
    if (resp == 1) { // Go: just use wiener_lpdf
      return wiener_lpdf(y | alpha, tau, beta, delta);
    } else if (delta == 0) { // NoGo and drift == 0
      return log(1 - beta + machine_precision()); // just the starting-point bias (1 minus because lower boundary)
    } else { // NoGo and drift != 0
      return log(1 - (exp(-2 * alpha * beta * delta) - 1) / (exp(-2 * alpha * delta) - 1) + machine_precision()); // analytical solution for the integration (1 - p because lower boundary)
    }
  }
}
data {
  int<lower=0> N;  // number of data points
  real resp[N];    // choices (1 = upper bound, 0 = lower bound)
  real y[N];       // RTs (in sec.)
  real minRT;      // minimum RT in the observed data
  real RTbound;    // lower bound on RT (e.g., 0.1 second)
}
parameters {
  real<lower=0, upper=5> alpha;          // boundary separation
  real<lower=RTbound, upper=minRT> tau;  // nondecision time
  real<lower=0, upper=1> beta;           // initial bias
  real delta;                            // drift rate
}
model {
  alpha ~ uniform(0, 5);
  tau ~ uniform(RTbound, minRT);
  beta ~ uniform(0, 1);
  delta ~ normal(0, 2);
  for (n in 1:N) {
    target += wiener_gonogo_lpdf(y[n] | resp[n], alpha, tau, beta, delta);
  }
}
generated quantities {
  // For log-likelihood calculation
  real log_lik;
  for (n in 1:N) {
    log_lik += wiener_gonogo_lpdf(y[n] | resp[n], alpha, tau, beta, delta);
  }
}
"
And now fitting:
fit <- stan(data = standata, model_code = wiener_model_gonogo, chains = 4, iter = 4000, warmup = 1000)
# takes about 15 min. to fit
# Gives the following warning:
# In validityMethod(object) :
# The following variables have undefined values: log_lik. Many subsequent functions will not work correctly.
print(fit)
Inference for Stan model: 3eba659d8f1a432623da6b89532cbcfb.
4 chains, each with iter=4000; warmup=1000; thin=1;
post-warmup draws per chain=3000, total post-warmup draws=12000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha 0.95 0.00 0.03 0.89 0.93 0.94 0.96 1.00 4644 1
tau 0.20 0.00 0.00 0.20 0.20 0.20 0.20 0.20 5234 1
beta 0.71 0.00 0.02 0.68 0.70 0.71 0.72 0.74 4096 1
delta 0.68 0.00 0.12 0.46 0.60 0.68 0.76 0.91 4820 1
log_lik NaN NA NA NA NA NA NA NA NaN NaN
lp__ 396.35 0.02 1.39 392.88 395.67 396.67 397.37 398.10 4632 1
Samples were drawn using NUTS(diag_e) at Fri Apr 16 00:54:02 2021.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
# Parameter values look the same as with MLE; but why is log_lik NaN?!?
Thanks in advance for any help!