Setting starting values on ndt intercept of shifted lognormal model

I’m trying to fit a shifted lognormal model to a large data set of response times. I’m having some difficulty with trade-offs between the ndt parameter and the scale parameter (mu): sometimes in one chain the ndt starts out out in an unreasonable part of the parameter space, and just stays there.

I’m seeing whether I can mitigate this with some reasonable priors, I’m less familiar with how to specify things in brms than I would be in stan or JAGS. I’m trying to specify a prior (which seems to work) and starting values. When I specify the starting values for b_ndt as a function (see code below) the stan errors indicate problems with the starting values:

# SAMPLING FOR MODEL '48d47669f619bbb26676a523eb3d8c6e' NOW (CHAIN 1).
# Chain 1: Rejecting initial value:
# Chain 1:   Error evaluating the log probability at the initial value.
# Chain 1: Exception: lognormal_lpdf: Random variable[1] is -163.201, but must be >= 0!  (in 'model4b2c6ff13fd9_48d47669f619bbb26676a523eb3d8c6e' at line 118)

I don’t get this, given that Y is lognormally distributed and must be greater than 0, so I figure it must be some issue with the generation of starting values yielding incompatible values (?). I’d think that since everything is defined on the log scale this would work, but I guess I’m missing something.

Code follows.

## Read in the data
library(dplyr)
library(brms)

data_url = "https://www.dropbox.com/s/94ccglb6kd1ifm0/data.Rda?dl=1"
data_file = tempfile(fileext = ".Rda")

download.file(url = data_url,
              destfile = data_file,
              mode="wb")

load(data_file)

## brms
formula <- bf(LiftOffLatency ~ 0 + Intercept + CueDirection*SentenceDirection*duration + CueDirection*SentenceDirection*min_freq + (1|ptid) + (0+SentenceDirection|ItemNumber),
              sigma ~ 0 + Intercept + (1|ptid),
              ndt ~  0 + Intercept + (1|ptid), family=shifted_lognormal())

priors <- c(
  set_prior("normal(7,1)", class = "b", coef = "Intercept", dpar = "ndt")
)

fit = brm(formula = formula,
          data = mono_data,
          prior = priors,
          chains = 1,
          seed = 394,
          control = list(
            #adapt_delta = .95,
            #max_treedepth = 15
          ), inits = function(){
            list(b_ndt = array(runif(1,6,8)))
          })

Please also provide the following information in addition to your question:

  • Operating System: Windows and OSX
  • brms Version: 2.13.3

Welcome in the Stan forum!

Did you have a look at the Stan code to check what link function brms uses for the lognormal?

(Brms generated Stan code for complex models can be hard to parse. If this is the case, it can help to start with a simpler model).

If it is the identity link function, as this discussion here Understanding shifted lognormal parameters and priors suggests, that could be part of the problem.

Also:
If I remember correctly, brms uses flat priors for regression coefficients other than the intercept if not specified otherwise. To get a better behaved model you could put priors on the other regression coefficients.
It’s also not necessary to use ‘0 + Intercept’. You can just use ‘1 +’ … and specify a prior for the intercept.

In addition to what Guido said, models with predicted ndt are hard to initialize since the ndt must be initialized so that it is smaller than the corresponding responses. Please note that, by default, ndt is modelled on the log-scale if predicted. In my experience, setting the initial values of the ndt intercept to some small(ish) values, e.g., -3 or -5 on the log scale should suffice. This could be done as follows:

inits <- list(list(b_ndt = as.array(-5)))

The length of the outer list must match the number of chains. Of course, the function approach you are doing could work as well.

5 Likes

Hi,

Did you get this to work? What was the solutions if so?

I tried Paul’s suggestion and had no luck.

hi,

Do I need an up-to-date brms for this to work? (I am current a few versions behind (2.13.5), as I didn’t want to break my current projects when updating to RTools4, etc).

I’m having very similar problems to Richard, … it feels like 2 out of 3 times, the chains run fine and I get lovely results. But other times, things just fall over!

You should be fine with your version although I would appreciate you being more specific about your version number. The problem is simply with and ndt that is predicted in a distributional regression and I do not have a good solution for this problem yet.

Sure - how do I find the more specific version number information? What information is required beyond v2.13.5? Sorry for my ignorance!

I am sorry, I misread your message. I thought you said that you are a few versions behind 2.13.5 but now that I reread your message I see that you are on 2.13.5. Please disregard my former post.

no worries, easy done… (you do an excellent job in support… if i ever meet you in real life, I clearly owe you a beer (or equivalent of your choice!).

Thinking about my problems, I wonder if the issue is the cauchy distributions I am using for priors for the random intercepts… these have a fat tail right? So perhaps there are a load of cases in which a large value for ndt is generated via ndt ~ (1 | pid) and a low value from the rt ~ … + (1| pid).

Just thinking out loud here!

My current priors and model code:

myp ← c(
prior_string(“normal(-1.4, 0.2)”, class = “b”, coef = intercepts),
prior_string(“normal(0, 0.2)”, class = “b”, coef = slopes),
prior_string(“normal(-1, 0.5)”, class = “Intercept”, dpar = “ndt” ),
prior_string(“cauchy(0, 0.4)”, class = “sigma”),
prior_string(“cauchy(0, 0.1)”, class = “sd”),
prior_string(“cauchy(0, 0.1)”, class = “sd”, dpar = “ndt”))

m ← brm(
bf(
rt ~ 0 + d_feature + d_feature:log(N_T+1) + (1|p_id),
ndt ~ 1 + (1|p_id)),
family = shifted_lognormal(),
data = sample_frac(d,0.1),
prior = myp,
chains = 1,
iter = 3000,
inits = function(){
list(b_ndt = array(-5))
},
control =list(adapt_delta = 0.9)
)

Actually, while writing this out, this actually all worked! (In my latest run, I decreased the two sd cauchy priors to cauchy(0, 0.1). So perhaps my much was right, or perhaps it was a lucky coincidence. [It still rejected one initial start value, but then was fine…]

I’ve just set this running on my whole dataset (rather than only 10%). This time, it is having to reject a lot of start values to get started.

I feel like I’m very close to getting this working… it works sometimes, and when it does, the model fit is clearly a lot better than the other models we’ve been trying. But it’s not yet robust enough for the paper that we’re writing!

image

Great. With regard to the starting values, I recomment that you set starting values for the Intercept, which is not part of the “b_ndt” vector. I think using

list(Intercept_ndt = array(-5))

should be correct, but please check in the generated Stan code (via make_stancode) if Intercept_ndt is indeed the correct intercept name in your brms version.

Thanks!

Do you mean

inits = list(list(Intercept_ndt = array(-5)))

though? While Intercept_ndt looks correct:

parameters {
vector[K] b; // population-level effects
real<lower=0> sigma; // residual SD
real Intercept_ndt; // temporary intercept for centered predictors
vector<lower=0>[M_1] sd_1; // group-level standard deviations
vector[N_1] z_1[M_1]; // standardized group-level effects
vector<lower=0>[M_2] sd_2; // group-level standard deviations
vector[N_2] z_2[M_2]; // standardized group-level effects
}

We get some nasty error messages:

myp ← c(
prior_string(“normal(-1.4, 0.2)”, class = “b”, coef = intercepts),
prior_string(“normal(0, 0.2)”, class = “b”, coef = slopes),
prior_string(“normal(-1, 0.5)”, class = “Intercept”, dpar = “ndt” ),
prior_string(“cauchy(0, 0.4)”, class = “sigma”),
prior_string(“cauchy(0, 0.1)”, class = “sd”),
prior_string(“cauchy(0, 0.1)”, class = “sd”, dpar = “ndt”))

m ← brm(
bf(
rt ~ 0 + d_feature + d_feature:log(N_T+1) + (1|p_id),
ndt ~ 1 + (1|p_id)),
family = shifted_lognormal(),
data = d,
prior = myp,
chains = 1,
iter = 3000,
inits = list(list(Intercept_ndt = array(-5))),
control =list(adapt_delta = 0.9)
)

SAMPLING FOR MODEL ‘10349aea9f518cbea357ca757468a402’ NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=Intercept_ndt; dims declared=(); dims found=(1)
[1] “Error in sampler$call_sampler(args_list[[i]]) : "
[2] " mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=Intercept_ndt; dims declared=(); dims found=(1)”
[1] “error occurred during calling the sampler; sampling not done”

Sorry, it should just be list(Intercept_ndt = -5) , that is without array(). Whether you need another list() wrapper depends on whether you specify inits via list if lists() on per chain or via a function.

1 Like

Thanks! (I assume the ndt = 5 should have a minus)

This is now running, and there were no rejected values. So fingers crossed this all works reliably.

My colleague is running the same model, (although with 4 chains) and is getting a few rejected values. Is this something that is to be expected, and can probably be ignored? Or should she try some other values? (I think she tried -10, and that gave 1 rejected value over her four chains).

Do you have a paper out on fitting these distributions that you’d like cited? It’s the least we can do for your help!

You can ignore the reject values. Right now, I do not have a paper to be cited specifically except for the standard brms papers of course.