Combining a hurdle_lognormal with lognormal in a mixture model

johkaa · June 14, 2024, 2:08pm

We would like to fit a model that combines a hurdle model with a finite mixture model. We want to model the probability of a zero and, for the non-zero values, assume that the observations come from two separate processes. Is it possible to combine a hurdle model with a finite mixture model in brms?

We are familiar with the mixture() function, where you can model two Gaussian components and their mixing proportion. Can this be extended to a model that also includes the hurdle part? Naively, we would think that this should work by using mixture(hurdle_lognormal, lognormal) as the family.

We tried the code below and got this error:
Error: The parameter 'hu' is not a valid distributional or non-linear parameter. Did you forget to set 'nl = TRUE'?

We tried setting nl = TRUE, but it didn’t help.

library(brms)

mix <- mixture(hurdle_lognormal, lognormal)

hurdle3 <-
  brm(bf(IA_DWELL_TIME ~ 1,
         mu1 ~ 1,
         mu2 ~ 1,
         theta2 ~ 1,
         hu ~ 1),
      data = df,
      family = mix)

Would be grateful if you had any ideas!

Ax3man · June 14, 2024, 8:40pm

When I run this in brms v2.21.0:

mix <- mixture(hurdle_lognormal,lognormal)

I get:

Error: Some of the families are not allowed in mixture models.

Christopher-Peterson · June 15, 2024, 2:35am

The defined mixture doesn’t really make sense to me.
The lognormal distribution only has support for data with positive values. The hurdle-lognormal modification allows the presence of zeroes by specifying a separate process for the zero and non-zero components. I have a hard time conceiving of data that would require both the hurdle-lognormal and the lognormal.

Are you trying to run a mixture of two lognormals for the non-zero component? I’m not certain if you’ll be able to do that out of the box in brms, though it should be doable in Stan.

johkaa · June 15, 2024, 4:18am

The data are reading times of words within a sentence. Sometimes readers skip words, resulting in a reading time of 0. The conditional reading time (i.e. the reading time if the word is NOT skipped) seems to have two peaks, suggesting that two processes drive those observations (reflected as faster and slower reading times). And as with all reaction time measures, the distribution is right-skewed (thus, lognormal).

So yes, we are trying to combine a hurdle for the skipping (0 = skipped) with a mixture of two lognormals for the non-zero component.
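To make the intended data-generating process concrete, here is a minimal base-R simulation sketch. All parameter values (hu, theta, mu, sigma) and the sample size are made up for illustration and are not estimates from our data.

```r
# Simulate reading times: skipped words are exact zeros (the hurdle part),
# fixated words come from a two-component lognormal mixture.
# All parameter values are illustrative assumptions, not estimates.
set.seed(1)

n     <- 5000          # number of simulated words
hu    <- 0.2           # probability that a word is skipped (reading time = 0)
theta <- 0.6           # probability that a fixated word belongs to the fast process
mu    <- c(5.0, 6.0)   # log-scale means of the fast and slow processes
sigma <- c(0.3, 0.3)   # log-scale standard deviations

# Which words are skipped, and which process generates each fixated word
skipped   <- rbinom(n, size = 1, prob = hu) == 1
component <- ifelse(rbinom(n, size = 1, prob = theta) == 1, 1L, 2L)

# Zero if skipped, otherwise a draw from the chosen lognormal component
y <- ifelse(skipped, 0,
            rlnorm(n, meanlog = mu[component], sdlog = sigma[component]))

# Observed share of zeros; should be close to hu
mean(y == 0)
```

A histogram of log(y[y > 0]) from this simulation should show the two peaks we see in the conditional reading times.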

Traditionally, the word skipping has been split into a separate variable of skipping probability, and the conditional reading time into another variable, and these are then modeled separately. Would be cool to include them in the same model.

Any ideas of how this could be done would be very welcome!

aakhmetz · June 15, 2024, 8:25am

Why not use a hurdle mixture model then? I think it would be equivalent.

Christopher-Peterson · June 16, 2024, 3:20am

I agree that a mixture of two hurdle-lognormal models should work. Are you fitting any predictors to the zero component? If not, you can use set_prior() to constrain the second hurdle parameter to be equal to the first (using the constant() prior); this should make the results mathematically equivalent to what you want.
If you are fitting predictors to the zero component, you should still be able to use a constant constraint, but I don’t remember the details of how to do it; in any case, a bit of experimentation should reveal the right syntax.
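To spell out why the two formulations should coincide, here is a short sketch of the densities. The notation is my own assumption: hu_k is the hurdle probability of component k, θ the mixing proportion, and LN(y | μ_k, σ_k) the lognormal density. A mixture of two hurdle-lognormals has density

$$
p(y) = \theta \left[ hu_1 \, \mathbb{1}(y = 0) + (1 - hu_1) \, \mathrm{LN}(y \mid \mu_1, \sigma_1) \, \mathbb{1}(y > 0) \right]
     + (1 - \theta) \left[ hu_2 \, \mathbb{1}(y = 0) + (1 - hu_2) \, \mathrm{LN}(y \mid \mu_2, \sigma_2) \, \mathbb{1}(y > 0) \right]
$$

Setting hu_1 = hu_2 = hu and collecting terms gives

$$
p(y) = hu \, \mathbb{1}(y = 0)
     + (1 - hu) \left[ \theta \, \mathrm{LN}(y \mid \mu_1, \sigma_1) + (1 - \theta) \, \mathrm{LN}(y \mid \mu_2, \sigma_2) \right] \mathbb{1}(y > 0)
$$

which is a single hurdle on top of a two-component lognormal mixture, i.e. the model described above.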
