Bernoulli variables of multivariate brms model poorly estimated

#1

Greetings all,

I am working with a multivariate model of 4 responses each with a different distribution and am generally interested in how these responses depend upon two variables (one continuous and one factor) and their interaction. I also have a group-level intercept in each formula and would like to incorporate the correlations across the response variables for groups. I have constructed the model as follows in brms:

groform  = bf(bwgr ~ b0 + b1 * (tlc10 - switch) * step(switch - tlc10) +
                b2 * (tlc10 - switch) * step(tlc10 - switch),
              b0 ~ 1 + treat + (1 | V | Channel),
              b1 + b2 ~ 1 + treat, b3 ~ 1,
              nlf(switch ~ inv_logit(b3) * 8 - 3.05), nl = TRUE) + gaussian()
survform = bf(Surv ~ tlc * treat + (1 | V | Channel)) + bernoulli()
fecform  = bf(fec ~ tlc10 * treat + (1 | V | Channel),
              hu ~ tlc10 * treat + (1 | V | Channel)) + hurdle_poisson()
osform   = bf(off.size ~ tlc10 * treat + (1 | V | Channel)) + skew_normal()

mvitals = brm(groform + survform + fecform + osform + set_rescor(FALSE),
              data = data, chains = 12, iter = 2000, warmup = 1000,
              sample_prior = TRUE, save_all_pars = TRUE,
              control = list(adapt_delta = 0.995, max_treedepth = 15),
              prior = mvpriors)

The model runs without throwing any warnings and the immediate diagnostics look OK. However, the posteriors for the Surv Bernoulli response and for the hurdle component of the fec response look wrong. The posterior distribution for the intercept in each of these components centers on extremely small values (< -5 on the logit scale), which do not match the data: the mean of Surv is ~0.92 and 52% of fec observations take on a non-zero value. This ill-behaved posterior also appears when I remove the response correlations across groups, which should make the fit equivalent to 4 univariate models. I have tried placing more informative priors on these intercept terms to see whether that would help, but the issue persisted. The posterior distributions do, however, look reasonable and match expectations when I fit the univariate models independently in separate calls to brm. What might be the source of this divergence, and how might I improve sampling in the multivariate model?
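For reference, a rough base-R sanity check of the mismatch using the marginal rates quoted above (this ignores predictors and group effects, so it is only an order-of-magnitude guide):

```r
# Rough sanity check of the intercept scale implied by the marginal rates
# quoted above; not a substitute for the actual model-based estimates.
print(qlogis(0.92))  # logit of the Surv mean, ~2.44
print(qlogis(0.48))  # logit of the proportion of zeros in fec (the hu part), ~-0.08
print(plogis(-5))    # success probability implied by an intercept of -5, ~0.0067
```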

Thank you for any help you can provide!

  • Operating System: macOS 10.14.4
  • brms Version: 2.8.0

#2

Possibly a stupid question: could it be that your continuous predictor only takes on values far away from zero? That would explain the extreme intercept and would not indicate a problem. It may also be that I don’t fully understand your problem, though.

#3

Hi Paul,

Thanks for responding. I’ve centered the continuous predictor at the mean prior to analysis. I’ve also attempted re-scaling from mm to cm - the continuous predictor is a length - but get roughly equivalent results either way. On the mm scale, values range from about -30 to 50 mm (so -3 to 5 in cm). The cm scaling seems to work reasonably well in the univariate models, in the sense that I can use fairly generic priors for all of the population-level parameters.
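As a minimal sketch of the centering and mm-to-cm rescaling described here (the tlc10 column name follows the thread; the values are toy data, not the real measurements):

```r
# Sketch: center the length predictor at its mean and rescale mm -> cm.
# 'tlc10' is the column name used in the thread; these values are toy data.
data <- data.frame(tlc10 = c(-30, 0, 20, 50))       # lengths in mm
data$tlc10 <- (data$tlc10 - mean(data$tlc10)) / 10  # center at the mean, convert to cm
print(data$tlc10)
```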

I’ve also played around with simplifying the multivariate formula, by removing one or two formulas to see if I can get more sensible results. In some combinations this seems to help. For example, groform + fecform leads to a well-behaved posterior, but groform + survform does not.

I’m still stumped on this one, so any help on breaking through the impasse is most appreciated!

#4

I have no specific idea, to be honest. My best guess is that one of the models has a problem that then propagates to the other models, but I suspect this is not a new idea for you. The non-linear model groform seems to be the one that is theoretically hardest to fit. The hurdle part of fecform could also be a problem if there are not a lot of zeros in the data (at least not for some Channels).

#5

Ah, that makes sense. I hadn’t considered the relative number of zeros per Channel in the Surv or fec variables. There are likely some problematic Channels for these variables with few or no zeros present. Alas, this problem does not suggest any ready-made solutions.
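One way to check for this, as a base-R sketch (column names follow the thread; the data frame here is toy data standing in for the real one):

```r
# Sketch: per-Channel summaries to find groups where the Bernoulli (Surv) or
# hurdle (fec) components are nearly degenerate. Toy data for illustration;
# 'Channel', 'Surv', and 'fec' are the column names used in the thread.
data <- data.frame(
  Channel = rep(c("A", "B"), each = 4),
  Surv    = c(1, 1, 1, 1,   1, 0, 1, 1),
  fec     = c(0, 3, 2, 5,   4, 4, 6, 1)
)
print(with(data, tapply(fec == 0, Channel, mean)))  # proportion of zeros per Channel
print(with(data, tapply(Surv, Channel, mean)))      # survival rate per Channel
# A proportion of exactly 0 or 1 means that Channel provides no information
# about the corresponding logit-scale intercept on its own.
```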

Thank you for the insight!