Setting starting values on ndt intercept of shifted lognormal model

richarddmorey · August 9, 2020, 12:58pm

I’m trying to fit a shifted lognormal model to a large data set of response times. I’m having some difficulty with trade-offs between the ndt parameter and the scale parameter (mu): sometimes in one chain the ndt starts out out in an unreasonable part of the parameter space, and just stays there.

I’m seeing whether I can mitigate this with some reasonable priors, I’m less familiar with how to specify things in brms than I would be in stan or JAGS. I’m trying to specify a prior (which seems to work) and starting values. When I specify the starting values for b_ndt as a function (see code below) the stan errors indicate problems with the starting values:

# SAMPLING FOR MODEL '48d47669f619bbb26676a523eb3d8c6e' NOW (CHAIN 1).
# Chain 1: Rejecting initial value:
# Chain 1:   Error evaluating the log probability at the initial value.
# Chain 1: Exception: lognormal_lpdf: Random variable[1] is -163.201, but must be >= 0!  (in 'model4b2c6ff13fd9_48d47669f619bbb26676a523eb3d8c6e' at line 118)

I don’t get this, given that Y is lognormally distributed and must be greater than 0, so I figure it must be some issue with the generation of starting values yielding incompatible values (?). I’d think that since everything is defined on the log scale this would work, but I guess I’m missing something.

Code follows.

## Read in the data
library(dplyr)
library(brms)

data_url = "https://www.dropbox.com/s/94ccglb6kd1ifm0/data.Rda?dl=1"
data_file = tempfile(fileext = ".Rda")

download.file(url = data_url,
              destfile = data_file,
              mode="wb")

load(data_file)

## brms
formula <- bf(LiftOffLatency ~ 0 + Intercept + CueDirection*SentenceDirection*duration + CueDirection*SentenceDirection*min_freq + (1|ptid) + (0+SentenceDirection|ItemNumber),
              sigma ~ 0 + Intercept + (1|ptid),
              ndt ~  0 + Intercept + (1|ptid), family=shifted_lognormal())

priors <- c(
  set_prior("normal(7,1)", class = "b", coef = "Intercept", dpar = "ndt")
)

fit = brm(formula = formula,
          data = mono_data,
          prior = priors,
          chains = 1,
          seed = 394,
          control = list(
            #adapt_delta = .95,
            #max_treedepth = 15
          ), inits = function(){
            list(b_ndt = array(runif(1,6,8)))
          })

Please also provide the following information in addition to your question:

Operating System: Windows and OSX
brms Version: 2.13.3

Guido_Biele · August 9, 2020, 6:36pm

Welcome in the Stan forum!

Did you have a look at the Stan code to check what link function brms uses for the lognormal?

(Brms generated Stan code for complex models can be hard to parse. If this is the case, it can help to start with a simpler model).

If it is the identity link function, as this discussion here Understanding shifted lognormal parameters and priors suggests, that could be part of the problem.

Also:
If I remember correctly, brms uses flat priors for regression coefficients other than the intercept if not specified otherwise. To get a better behaved model you could put priors on the other regression coefficients.
It’s also not necessary to use ‘0 + Intercept’. You can just use ‘1 +’ … and specify a prior for the intercept.

paul.buerkner · August 10, 2020, 7:19am

In addition to what Guido said, models with predicted ndt are hard to initialize since the ndt must be initialized so that it is smaller than the corresponding responses. Please note that, by default, ndt is modelled on the log-scale if predicted. In my experience, setting the initial values of the ndt intercept to some small(ish) values, e.g., -3 or -5 on the log scale should suffice. This could be done as follows:

inits <- list(list(b_ndt = as.array(-5)))

The length of the outer list must match the number of chains. Of course, the function approach you are doing could work as well.

Ross_Kempner · November 21, 2020, 8:14pm

Hi,

Did you get this to work? What was the solutions if so?

I tried Paul’s suggestion and had no luck.

Alasdair_Clarke · November 24, 2020, 9:50am

hi,

Do I need an up-to-date brms for this to work? (I am current a few versions behind (2.13.5), as I didn’t want to break my current projects when updating to RTools4, etc).

I’m having very similar problems to Richard, … it feels like 2 out of 3 times, the chains run fine and I get lovely results. But other times, things just fall over!

paul.buerkner · November 24, 2020, 9:52am

You should be fine with your version although I would appreciate you being more specific about your version number. The problem is simply with and ndt that is predicted in a distributional regression and I do not have a good solution for this problem yet.

Alasdair_Clarke · November 24, 2020, 9:57am

Sure - how do I find the more specific version number information? What information is required beyond v2.13.5? Sorry for my ignorance!

paul.buerkner · November 24, 2020, 10:04am

I am sorry, I misread your message. I thought you said that you are a few versions behind 2.13.5 but now that I reread your message I see that you are on 2.13.5. Please disregard my former post.

Alasdair_Clarke · November 24, 2020, 10:18am

no worries, easy done… (you do an excellent job in support… if i ever meet you in real life, I clearly owe you a beer (or equivalent of your choice!).

Thinking about my problems, I wonder if the issue is the cauchy distributions I am using for priors for the random intercepts… these have a fat tail right? So perhaps there are a load of cases in which a large value for ndt is generated via ndt ~ (1 | pid) and a low value from the rt ~ … + (1| pid).

Just thinking out loud here!

My current priors and model code:

myp ← c(
prior_string(“normal(-1.4, 0.2)”, class = “b”, coef = intercepts),
prior_string(“normal(0, 0.2)”, class = “b”, coef = slopes),
prior_string(“normal(-1, 0.5)”, class = “Intercept”, dpar = “ndt” ),
prior_string(“cauchy(0, 0.4)”, class = “sigma”),
prior_string(“cauchy(0, 0.1)”, class = “sd”),
prior_string(“cauchy(0, 0.1)”, class = “sd”, dpar = “ndt”))

m ← brm(
bf(
rt ~ 0 + d_feature + d_feature:log(N_T+1) + (1|p_id),
ndt ~ 1 + (1|p_id)),
family = shifted_lognormal(),
data = sample_frac(d,0.1),
prior = myp,
chains = 1,
iter = 3000,
inits = function(){
list(b_ndt = array(-5))
},
control =list(adapt_delta = 0.9)
)

Actually, while writing this out, this actually all worked! (In my latest run, I decreased the two sd cauchy priors to cauchy(0, 0.1). So perhaps my much was right, or perhaps it was a lucky coincidence. [It still rejected one initial start value, but then was fine…]

I’ve just set this running on my whole dataset (rather than only 10%). This time, it is having to reject a lot of start values to get started.

I feel like I’m very close to getting this working… it works sometimes, and when it does, the model fit is clearly a lot better than the other models we’ve been trying. But it’s not yet robust enough for the paper that we’re writing!

paul.buerkner · November 24, 2020, 10:43am

Great. With regard to the starting values, I recomment that you set starting values for the Intercept, which is not part of the “b_ndt” vector. I think using

list(Intercept_ndt = array(-5))

should be correct, but please check in the generated Stan code (via make_stancode) if Intercept_ndt is indeed the correct intercept name in your brms version.

Alasdair_Clarke · November 24, 2020, 10:56am

Thanks!

Do you mean

inits = list(list(Intercept_ndt = array(-5)))

though? While Intercept_ndt looks correct:

parameters {
vector[K] b; // population-level effects
real<lower=0> sigma; // residual SD
real Intercept_ndt; // temporary intercept for centered predictors
vector<lower=0>[M_1] sd_1; // group-level standard deviations
vector[N_1] z_1[M_1]; // standardized group-level effects
vector<lower=0>[M_2] sd_2; // group-level standard deviations
vector[N_2] z_2[M_2]; // standardized group-level effects
}

We get some nasty error messages:

myp ← c(
prior_string(“normal(-1.4, 0.2)”, class = “b”, coef = intercepts),
prior_string(“normal(0, 0.2)”, class = “b”, coef = slopes),
prior_string(“normal(-1, 0.5)”, class = “Intercept”, dpar = “ndt” ),
prior_string(“cauchy(0, 0.4)”, class = “sigma”),
prior_string(“cauchy(0, 0.1)”, class = “sd”),
prior_string(“cauchy(0, 0.1)”, class = “sd”, dpar = “ndt”))

m ← brm(
bf(
rt ~ 0 + d_feature + d_feature:log(N_T+1) + (1|p_id),
ndt ~ 1 + (1|p_id)),
family = shifted_lognormal(),
data = d,
prior = myp,
chains = 1,
iter = 3000,
inits = list(list(Intercept_ndt = array(-5))),
control =list(adapt_delta = 0.9)
)

SAMPLING FOR MODEL ‘10349aea9f518cbea357ca757468a402’ NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=Intercept_ndt; dims declared=(); dims found=(1)
[1] “Error in sampler$call_sampler(args_list[[i]]) : "
[2] " mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=Intercept_ndt; dims declared=(); dims found=(1)”
[1] “error occurred during calling the sampler; sampling not done”

paul.buerkner · November 24, 2020, 11:02am

Sorry, it should just be list(Intercept_ndt = -5) , that is without array(). Whether you need another list() wrapper depends on whether you specify inits via list if lists() on per chain or via a function.

Alasdair_Clarke · November 24, 2020, 11:26am

Thanks! (I assume the ndt = 5 should have a minus)

This is now running, and there were no rejected values. So fingers crossed this all works reliably.

My colleague is running the same model, (although with 4 chains) and is getting a few rejected values. Is this something that is to be expected, and can probably be ignored? Or should she try some other values? (I think she tried -10, and that gave 1 rejected value over her four chains).

Do you have a paper out on fitting these distributions that you’d like cited? It’s the least we can do for your help!

paul.buerkner · November 24, 2020, 11:55am

You can ignore the reject values. Right now, I do not have a paper to be cited specifically except for the standard brms papers of course.

Topic		Replies	Views
Shifted_lognormal model: prior for ndt leads to rejected inital values brms fitting-issues , priors	4	582	January 4, 2021
Shifted log normal distribution with varying shifts by participant brms	12	3354	November 23, 2020
Priors for shifted lognormal model brms	5	1567	January 14, 2022
Error when using brms and shifted lognormal model Modeling rstan , brms	10	647	September 14, 2023
Setting priors in brms using shifted_lognormal family brms brms	8	948	June 11, 2022

Setting starting values on ndt intercept of shifted lognormal model

Related topics