Finite mixtures of other-than-gaussian distributions

Hi there

This might be a really simple question, but I’m quite new in brms/Stan, and I haven’t been able to figure out how to solve it or find related help in the forums.

Here’s what I need to do. It’s actually two different problems, but related to the same issue: finite mixtures of other-than-gaussian distributions.

In the first case, I need to adjust wind intensity data (in knots) to a singly-truncated-from-below (STFB) normal-Weibull mixture. The STFB normal term can accommodate null wind speeds. Here are the first 100 values of a much larger dataset

wi <- c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 12, 12, 12, 3, 3, 3, 3, 3, 3, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 4, 4, 4)

In the second case, I need to adjust wind direction data (in radians) to a mixture of von Mises distributions (I’ll try different numbers of terms, typically, 2 or 3, since I know it is not yet possible to estimate the number of mixture components from the data). Here are the first 100 values

wd <- c(200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 180, 250, 210, 210, 210, 210, 210, 210, 210, 210, 210, 210, 210, 280, 280, 280, 280, 280, 280, 280, 280, 280, 280, 280, 260, 260, 260, 270, 270, 270, 270, 270, 270, 270, 270, 270, 270, 270, 270, 270, 270, 270, 270, 280, 280, 280, 280, 280, 280, 280, 280, 280, 260, 260, 260, 260, 260, 260, 260)*pi/180 – pi # Data need to be within (-pi, pi)

Both models are quite standard in meteorology. Neither variable depends on other variables. In both cases, I have tried to adapt the examples in the mixture() and pp_mixture.brmsfit() functions in the brms package documentation (which are specified for a mixture of gaussians) to no avail. Here’s the example in mixture()

set.seed(1234)
dat <- data.frame(
y = c(rnorm(200), rnorm(100, 6)),
x = rnorm(300),
z = sample(0:1, 300, TRUE)
)
## fit a simple normal mixture model
mix <- mixture(gaussian, gaussian)
prior <- c(
prior(normal(0, 7), Intercept, dpar = mu1),
prior(normal(5, 7), Intercept, dpar = mu2)
)
fit1 <- brm(bf(y ~ x + z), dat, family = mix,
prior = prior, chains = 2)

but I haven’t been able to make it work on my problem. I include my questions for both problems as comments to the code

dat <- data.frame(
y = wi, # or alternatively y = wd for the second problem
## (1) Do I need to specify x and/or z as in the previous example? 
# x = ??,
# z = ?? 
)

### fit an STFB-normal-Weibull mixture model
## (2) Is there any way I can specify a truncated-at-zero-gaussian as one of the terms in the mixture? I’ve read in the package documentation that “with the exception of categorical, ordinal, and mixture families, the response distribution can be truncated using the trunc function in the addition part,” but I'm not sure if I can truncate one of the mixture terms
mix <- mixture(gaussian, weibull) # ??

## (3) Besides, Stan complains that Family 'mixture(gaussian, weibull)' requires response greater than 0. Is there any way I can circumvent that apart from the obvious trick of adding a small positive epsilon to the data to avoid 0’s?

## (4) Main question. How can I specify the priors in this case, especially for the Weibull term? Do I need to use class = ‘b’ instead? Do I need to specify a priori both for alpha and sigma (the Weibull parameters?
# prior <- c(
# prior(prior = ??, class = ??, coef = ??, … ??), 
# prior(prior = ??, class = ??, coef = ??, … ??)
# )

### ALTERNATIVELY, for a mixture of von Mises distributions
mix <- mixture(von_mises(), von_mises()) # ??

## (5) Again, the same main question, how can I specify the priors in this case for the mu and kappa parameters of the von Mises distribution?
# prior <- c(
# prior(prior = ??, class = ??, coef = ??, … ??), 
# prior(prior = ??, class = ??, coef = ??, … ??)
# )

## (6) What shall I specify in the right-hand term?
# fit <- brm(bf(y ~ ??), dat, family = mix, prior = prior, chains = 2)

I would greatly appreciate any hint. Thank you very much in advance.

  • Windows 10
  • brms Version: 2.13.3

Hi,
sorry, it looks like your question fell through. I am not sure I understand everything you ask for, but will try to provide at least partial answers:

That completely depends on what you are trying to do. If you don’t need any covariates, you don’t need to include them in the data, neither in the right-hand term of the formula. If you are not familiar with linear models in general, it might be useful to check some introductory material on the topic (Statistical Rethinking has IMHO a nice one and videos from lectures are freely available).

If you are just trying to fit the parameters of a single distribution without any covariate, you can have y ~ 1

I am not sure brms has enough flexibility to specify the distribution exactly as you need it. I think you might be better served by implementing the distribution yourself see - Define Custom Response Distributions with brms for details on how to do that (but it is a bit more technically challenging)

You can use get_prior to see all the available parameters you can set prior for. If the output of get_prior is not clear, please ask for clarifications.

Best of luck with your model!