Model distribution selection for response variable with peaks

Afreen_Khalid · June 5, 2020, 4:46pm

Hi all, I have a response variable that is non-normal and has several peaks. Its scale runs from $0 to $1, in steps of $0.01. I first tried a Gaussian model, but the pp_check plots look quite bad. Based on a previous post here (Gaussian vs. skew-normal model selection), I also tried a beta-binomial model after converting my response variable from dollars to cents by multiplying it by 100.

[Image: beta-binomial pp_check plot]
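A side note on the dollars-to-cents step: multiplying by 100 in floating point can land slightly below the intended whole number, so truncating with as.integer() alone may shift some values down by one cent. Below is a minimal sketch of a safer conversion, assuming the response is stored as a numeric vector of dollar amounts. The vector `response` is made up for illustration and is not the poster's data.

```r
# Hypothetical dollar amounts on the 0-1 scale in steps of 0.01
response <- seq(0, 1, by = 0.01)

# Round before converting to integer, so that a product stored slightly
# below a whole number is not truncated to the next lower cent value.
response_cents <- as.integer(round(response * 100))

# Every value should be a whole number of cents between 0 and 100
stopifnot(all(response_cents >= 0L), all(response_cents <= 100L))
```

The resulting integer column is what a count-type family such as the beta-binomial expects as its outcome.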

Here is the syntax for the two models:

brm(ResponseCents | vint(100) ~ block * f_transition * f_manner + (1 + block || id), family = beta_binomial2, data = aggrdata, chains = 4, cores = n_cores - 1, iter = 4000, warmup = 2000, control = list(adapt_delta = 0.9999, max_treedepth = 15), stanvars = stanvars, inits = "0")

ResponseModel1 <- lme4::lmer(response ~ block * f_transition * f_manner + (1 + block || id), data = aggrdata, control = lme4::lmerControl(optimizer = "Nelder_Mead", optCtrl = list(maxfun = 200000)))

What other types of distributions might fit better?

julianquandt · June 8, 2020, 11:25am

I think @Guido_Biele (I hope it is ok if I tag you) recently worked on a project where they used a gamma-mixture model for multi-modal data? Maybe that could work. It might also help to mention your independent variables. Is there e.g. a grouping variable that could cause the multi-modality and could be modelled?
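To illustrate the point about a grouping variable, here is a small simulated sketch. All names and numbers (`group`, `group_means`, the standard deviation) are invented for illustration. It shows a response that is multi-modal marginally but looks like ordinary Gaussian noise once the grouping variable is in the model.

```r
set.seed(1)

# Hypothetical data: three conditions with clearly different mean responses
n_per_group <- 200
group <- factor(rep(c("a", "b", "c"), each = n_per_group))
group_means <- c(a = 0.2, b = 0.5, c = 0.8)
y <- rnorm(length(group), mean = group_means[as.character(group)], sd = 0.05)

# Marginally, y has one peak per group. Conditioning on the grouping
# variable removes the peaks: the residuals are unimodal around zero.
fit <- lm(y ~ group)

# The residual spread is close to the within-group sd and far below
# the marginal spread of y
spread <- c(marginal_sd = sd(y), residual_sd = sd(resid(fit)))
print(spread)
```

Comparing hist(y) with hist(resid(fit)) makes the same point visually. If a predictor like this exists in the real data, a standard likelihood may already fit well.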

Guido_Biele · June 8, 2020, 11:54am

Hi,
Yes, I implemented a mixture of gammas, but this wasn’t with brms and required coding the model directly in Stan. The model I used also wasn’t a regression model.

If you want to run a regression, the search term is “mixture regression”. I am not aware of Bayesian implementations, nor do I know how well non-Gaussian likelihoods are supported by the packages that implement it.
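For readers unfamiliar with the term, the following is a rough sketch of what a mixture regression does, using simulated data and a hand-written EM loop in base R. It is a maximum-likelihood illustration with two Gaussian components, not a Bayesian model and not the gamma mixture mentioned above. The data, starting values, and variable names are all assumptions made for the example.

```r
set.seed(42)

# Hypothetical data: two latent groups that follow different regression lines
n <- 400
x <- runif(n)
z <- rbinom(n, 1, 0.5)  # latent component label, unobserved in real data
y <- ifelse(z == 1, 0.75 - 0.2 * x, 0.25 + 0.2 * x) + rnorm(n, sd = 0.05)

# Initial responsibilities for component 1 from a crude split on y
r <- ifelse(y > median(y), 0.9, 0.1)

for (iter in 1:200) {
  # M-step: weighted least squares for each component
  fit1 <- lm(y ~ x, weights = r)
  fit2 <- lm(y ~ x, weights = 1 - r)
  s1 <- sqrt(sum(r * (y - fitted(fit1))^2) / sum(r))
  s2 <- sqrt(sum((1 - r) * (y - fitted(fit2))^2) / sum(1 - r))
  p1 <- mean(r)  # mixing proportion of component 1

  # E-step: posterior probability that each point belongs to component 1
  d1 <- p1 * dnorm(y, mean = fitted(fit1), sd = s1)
  d2 <- (1 - p1) * dnorm(y, mean = fitted(fit2), sd = s2)
  r_new <- as.vector(d1 / (d1 + d2))

  # Stop once the responsibilities no longer change
  if (max(abs(r_new - r)) < 1e-8) break
  r <- r_new
}

# Estimated intercept and slope per component
est <- rbind(component1 = coef(fit1), component2 = coef(fit2))
print(est)
```

Each component gets its own regression line, and every observation is softly assigned to the components through the responsibilities `r`. A Bayesian version would put priors on these quantities and sample them rather than iterate to a point estimate.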

Finally, a standard regression could work if you have variables that predict the outcome well. That’s hard to say from your description. (I would recommend always describing how the data were generated, i.e. what the measurements are and which predictors exist or were manipulated …)
