Modeling Multivariate Outcome as a Simplex of Proportions in brms

Hello all,

I’m working with some survey results where we have summarized proportions of responses to a particular item (essentially: for, neutral, and against), and a set of causally implicated predictors. I’ve taken the approach below to model the proportions as separate multivariate outcomes of the same set of predictors using beta regressions, but I assume brms (and, by extension, Stan) could do a better job if it knew the three proportions together form a simplex. Is there a way to specify this in brms? If not, what would the structure of the Stan model need to look like?

There are additional complexities, shown in the code below: the responses and some predictors are measured with error, and the data have a hierarchical structure. The goal is to use the fitted model for prediction, given new inputs.

library(brms)

# Outcome A, plus the measurement-error submodels for the two noisy predictors
modelA = bf(PropA | mi(PropA_se) ~
              mi(predictorone) + mi(predictortwo) + predictorthree +
              (1 + mi(predictorone) + mi(predictortwo) + predictorthree | stratum),
            family=Beta()) +
  bf(predictorone | mi(predictorone_se) ~ 1, family=gaussian()) +
  bf(predictortwo | mi(predictortwo_se) ~ 1, family=gaussian())
# Outcomes B and C use the same predictors and group-level structure
modelB = bf(PropB | mi(PropB_se) ~
              mi(predictorone) + mi(predictortwo) + predictorthree +
              (1 + mi(predictorone) + mi(predictortwo) + predictorthree | stratum),
            family=Beta())
modelC = bf(PropC | mi(PropC_se) ~
              mi(predictorone) + mi(predictortwo) + predictorthree +
              (1 + mi(predictorone) + mi(predictortwo) + predictorthree | stratum),
            family=Beta())

# Fit all submodels jointly, without residual correlations between responses
multiFit = brm(mvbf(modelA, modelB, modelC, rescor=FALSE),
               data=data,
               prior=c(prior(normal(0,1), class="b")),
               init="0",
               warmup=5000, iter=15000,
               cores=parallel::detectCores(), chains=4,
               control=list(adapt_delta=0.95, max_treedepth=15),
               backend="cmdstanr",
               seed=seed)  # seed is assumed to be an integer set elsewhere

I appreciate any help or speculation on solutions. Thank you!

I’ve discovered this is an example of Dirichlet regression, which is now implemented in brms. I should be able to figure this out now given other resources available.
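For anyone landing here later, a minimal sketch of how a Dirichlet response can be passed to brms. The data are simulated stand-ins and the names `dat`, `props`, and `x` are made up for illustration, not taken from my real model. The key step is storing the three proportions as one matrix column whose rows sum to 1. As far as I understand, brms treats the first category as the reference by default. The brms part is guarded so the snippet still runs where brms is not installed.

```r
# Simulated stand-in data: three proportions per row that sum to 1.
set.seed(1)
n <- 50
x <- rnorm(n)
raw <- matrix(rgamma(n * 3, shape = 2), ncol = 3)
props <- raw / rowSums(raw)              # each row is now a simplex
colnames(props) <- c("PropA", "PropB", "PropC")

dat <- data.frame(x = x)
dat$props <- props                       # matrix column holding the simplex response

if (requireNamespace("brms", quietly = TRUE)) {
  library(brms)
  # The whole matrix is the response; one linear predictor per non-reference category.
  f <- bf(props ~ x, family = dirichlet())
  # fit <- brm(f, data = dat, chains = 4, cores = 4)   # uncomment to actually fit
}
```

Group-level terms such as `(1 + x | stratum)` should be addable on the right-hand side in the usual way, though I have not checked how they interact with the `mi()` predictor terms.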

The only issue seems to be that it doesn’t support measurement error in the outcome variables. Any workarounds, or advice on how best to incorporate this uncertainty into the posterior, would be helpful.

I’m glad you found the Dirichlet. The measurement error step is going to be harder. I don’t know of any tools that would let you set up your model using the simple formula syntax of brms. My suggestion would be to use brms::make_stancode to generate most of the Stan code for you, then add measurement error for your response by hand. When I’ve used this strategy, I’ve found it most helpful to start with a set of simpler models that each capture one feature I want to implement. Here, that might be a single-predictor Gaussian regression with measurement error and a single-predictor Beta (or Dirichlet) regression without it. Layer on the group-level slopes and intercepts once the core of the model is working. The forums are here to help if/when you get stuck on particular implementation steps.
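As a rough sketch of that first step, something like the following would generate Stan code for the two simple building blocks. The toy data and the names `toy`, `y`, `y_se`, `props`, and `x` are hypothetical. I am assuming `make_stancode` accepts these formulas as written in your brms version, and the brms calls are guarded so the data setup runs either way.

```r
# Toy data (hypothetical): one predictor, a noisy Gaussian response,
# and a three-part simplex response.
set.seed(2)
n <- 40
toy <- data.frame(x = rnorm(n))
toy$y <- 0.5 * toy$x + rnorm(n, sd = 0.3)
toy$y_se <- runif(n, 0.05, 0.2)          # known standard errors of y
raw <- matrix(rgamma(n * 3, shape = 2), ncol = 3)
colnames(raw) <- c("A", "B", "C")
toy$props <- raw / rowSums(raw)          # rows sum to 1

if (requireNamespace("brms", quietly = TRUE)) {
  library(brms)
  # Feature 1: measurement error on a Gaussian response.
  code_me <- make_stancode(y | mi(y_se) ~ x, data = toy, family = gaussian())
  # Feature 2: Dirichlet response without measurement error.
  code_dir <- make_stancode(props ~ x, data = toy, family = dirichlet())
  # Inspect the generated code to see how each feature is expressed in Stan.
  cat(code_me)
}
```

Comparing the two generated programs side by side shows how brms represents the latent "true" response in the Gaussian case, which is the piece that would need to be adapted by hand for the Dirichlet model.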
