Joint Modeling: Propagating uncertainty into predictors

Hi, I am wondering whether it is possible to fit a joint model in which one response variable is first modeled to estimate its latent value and associated uncertainty, and then this latent variable is used as a predictor in a second model, with both components estimated simultaneously so that uncertainty is properly propagated.

I am aware of mi/me/se, but was wondering whether this kind of joint approach is possible in brms?

@paul.buerkner

Thanks.

What is the difference between what you want and what you get with mi/me/se?

From my understanding when you model an uncertain or latent quantity and use that latent quantity inside a second likelihood within a single joint model, all parameters are estimated simultaneously. This means the uncertainty in the latent predictor is naturally propagated into the response through the posterior, since the latent predictor itself is a sampled parameter.

But are you saying that if I fit the first univariate model, compute the linear predictor for each individual under each posterior draw (e.g., 4000 draws), take the mean and SD for each individual across those draws (applying the inverse link function if needed), and then supply those estimates to me/mi in the second univariate model, this is effectively the same propagation of uncertainty?
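In code, the two-step procedure I have in mind would look roughly like this (variable names are hypothetical, and me() assumes normally distributed measurement error):

```r
library(brms)

# Stage 1: model the predictor on its own
fit1 <- brm(x ~ cov1 + (1 | group), data = dat)

# Per-individual posterior draws of the (inverse-link) mean
draws <- posterior_epred(fit1)  # draws x observations matrix

# Summarize each individual's latent value by its mean and SD
dat$x_hat <- colMeans(draws)
dat$x_sd  <- apply(draws, 2, sd)

# Stage 2: feed the summaries into me() in the second model
fit2 <- brm(y ~ me(x_hat, x_sd) + cov2, data = dat)
```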

Is this equivalent to the joint method?

-Hunter

I’m saying that mi/me/se is doing the first thing that you want already.

I know that both approaches propagate uncertainty, but they do so differently. Despite this difference, I am wondering whether the results are equivalent to those from a full joint model.

My understanding is that the me() function models the predictor as a latent true value, using the observed value and its standard deviation. This assumes that the measurement error follows a normal distribution.

I also read that imputing via brm_multiple() is one of the closest practical approaches to joint modeling. In this approach, you first estimate the posterior distribution of the predictor in a separate model. Then, for each posterior draw, you generate predicted values of the predictor for each observation, creating multiple imputed datasets. The regression model is then fit across these datasets using brm_multiple(), which combines the results and propagates uncertainty in the predictor into the final parameter estimates.
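In rough code, that imputation workflow might look like this (hypothetical variable names; a subset of draws keeps the number of fits manageable):

```r
library(brms)

# Stage 1: model the predictor
fit1 <- brm(x ~ cov1 + (1 | group), data = dat)

# One imputed dataset per posterior draw of the predictor
draws <- posterior_epred(fit1, ndraws = 20)
imputed <- lapply(seq_len(nrow(draws)), function(i) {
  d <- dat
  d$x_imp <- draws[i, ]
  d
})

# Stage 2: fit the outcome model across the imputed datasets
fit2 <- brm_multiple(y ~ x_imp + cov2, data = imputed)
```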

So my main questions are:

1.) Is it possible to do joint modeling in brms?

2.) Is imputing via brm_multiple() the best option for propagating uncertainty, besides joint modeling?

-Hunter

I see. The critical detail that I had missed is that you want to model the true latent value via its own brmsformula that is more sophisticated than a covariate-free normal distribution.

Do I now understand correctly that you want to treat the linear predictor of a regression as the latent value, so that the linear predictor of the first regression is used as the covariate in the second (rather than assuming that the true latent value is distributed around the linear predictor with some error)?

That is, unlike via the mi/me approach, you do not want to treat the latents as random in the model, but rather as a deterministic function of the fitted parameters of the first regression?

Yes, that is closer to what I meant.

I want the latent quantity to be defined by its own regression model, specifically the linear predictor (or possibly the inverse-link mean) from the first model, which is then used as a covariate in the second model.

My question is whether this can be implemented as a fully joint model in brms, where both likelihoods inform the shared parameters, or whether the two-stage posterior-draw plus imputation approach via brm_multiple() is the closest practical approximation.

-Hunter

I’m not 100% sure if this is straightforward to achieve in brms or not. I have two candidate approaches in mind.

Approach 1: Play around with a combination of the multivariate model syntax (Estimating Multivariate Models with brms) and the nonlinear formula syntax (Estimating Non-Linear Models with brms). My suggestion is to see if you can define the linear predictor as a “nonlinear parameter” using the nonlinear formula syntax, and then define both sides of the multivariate formula using that nonlinear parameter (yes, I know that both regressions are actually linear, but the nonlinear syntax might still be what you need).

Approach 2: This approach only works if the response distributions in the two regressions are of the same family, but it has the advantage of not requiring you to depend on mvbind.

  • Stack the two dataframes on top of each other, with an extra indicator covariate that simply codes whether the response is of the first model type or the second.
  • Construct the shared nonlinear parameter giving the linear predictor of interest. Ensure that the relevant covariates are appropriately replicated in both dataframes, which guarantees that the nonlinear parameter will take the same value in the relevant pairs of rows.
  • Construct a nonlinear formula that gives the correct linear predictors for both regressions as a function of the value of the nonlinear parameter, incorporating other nonlinear parameters as needed.
  • Finally, if the response family is one with a scale parameter, use distributional regression on the scale parameter to allow for different fitted scales for the response distribution in the two regressions (Estimating Distributional Models with brms).

I’m pretty sure either of these can be made to work, though some fiddling might be required. I’ll also note, just for what it’s worth, that the model you want would be quite straightforward in raw Stan.

I didn’t read all replies but perhaps the re() term on the brms3 branch on github can do what you have in mind. With re() terms you can use random effects of one model component as predictors in another model component.

Do you think a two-step approach like the following would provide a reasonable approximation to a fully joint model?

First, take a single posterior draw from the first model and compute the corresponding mean prediction for each observation under that draw. Then use those predicted values as imputed inputs in the second model. Repeat this procedure for each posterior draw and combine the resulting fits using brm_multiple() (I believe this is what is done in packages like bisonR).

Or would you suggest just using Stan?

Thanks.

Hi Paul,

Ok, I will try this, but as I asked @jsocolar, do you think that the two-step approach is a close approximation?

Thanks.

We will have a paper on this soon. Short answer: If REs are precisely enough estimated, then yes. If the link in the second model is identity (i.e. a linear model), likely. If neither of the two is satisfied, you will likely benefit from a one-step approach. If I remember, I will post the paper here once we have a preprint.

@paul.buerkner my understanding of the OP is that the goal is not to use a random effect as a predictor in the other component, but rather to use the linear predictor from one component as a predictor in the other component. There may or may not be a random component of the linear predictor; the OP isn’t clear. But in either case it’s important to bring through the fixed effects alongside any random component. Is that achievable in brms3 without hacking around with nonlinear syntax?

Ah, I see. Yeah, this is not yet conveniently possible but I have an issue for that on github (couldn’t find it immediately, but it’s there). I hope that the final brms 3 will handle this case (but not yet).

Hi,

Yes, I want to use the linear predictor from the first model as the predictor in the second model.

For the scenario of my question the first and second model both have random effects in them. But I don’t want to use the estimated random effects from the first model as a predictor in the second model.

So hopefully now that this is clear: is estimating the linear predictor for each draw of the first model’s posterior and then using brm_multiple() to propagate the uncertainty a good approximation of the joint model posterior?

@paul.buerkner

-Hunter

That’s a complex question for which I don’t have a general answer. Your approach is definitely reasonable, even if it does not necessarily give the perfect answer.

Ok, so would you recommend running this model in brms, then running the joint version of it directly in Stan, and seeing how they compare?

Because @jsocolar said this joint model is pretty straightforward in Stan.

Also, if the joint model is straightforward in Stan, then why isn’t it in brms? Is there some kind of conversion/compiling issue?

-Hunter

If you can code it up in Stan directly, that’s great! No reason for it not being in brms other than time and priorities.

Ok, great just was checking to see if there were any weird things to know about.

Thanks @paul.buerkner & @jsocolar really appreciate your help.

FWIW, here is what I think should be the model you want in brms, via “approach 2” above, assuming that both regressions have the same response family. I haven’t tested this, and it may require tweaking.

EDIT: I did test it, at least as far as generating the stancode, having an LLM check it for correctness, and making sure it compiles and runs. There was a small problem in the first edition of this post that is fixed in this edit.

Below:

  • alpha is the linear predictor from the first regression (excluding the random effects)
  • alpha1 is the random effects from the first regression
  • beta0 is the linear predictor in the second regression, including random effects, but excluding the effect of the linear predictor
  • beta1 is the coefficient applied in the second regression to the linear predictor from the first.
  • v is an indicator variable supplied as a covariate in the data, taking value 0 if a row corresponds to a response from the first regression and 1 if the row corresponds to a response from the second regression.
  • cov1_x are covariates relevant in the first regression, and cov2_x are covariates relevant in the second regression.
  • group1 and group2 are stand-ins for all of the grouping variables used in random effects in the first and second regressions, respectively. They can contain extra “dummy” categories for responses that come from the other regression.
  • sigma is a second distributional parameter in the regression family (not relevant if family is Poisson, binomial, or some other family without a second parameter).
bf(
  y ~
    0 +
    (1 - v) * (alpha + alpha1) +
    v * (beta0 + beta1*alpha),
  alpha ~ cov1_1 + cov1_2,
  alpha1 ~ 0 + (1 | group1),
  beta0 ~ 1 + cov2_1 + cov2_2 + (1 | group2),
  beta1 ~ 1,
  sigma ~ v,
  nl = TRUE
)
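For what it’s worth, fitting this formula might look something like the following sketch (d1 and d2 are hypothetical data frames for the two regressions, and the nonlinear parameters will typically need explicit priors):

```r
library(brms)

# Stack the two datasets with the indicator v
d1$v <- 0
d2$v <- 1
stacked <- rbind(d1, d2)  # columns must align across the two data frames

fit <- brm(
  formula = f,  # the bf() object above, assigned to f
  data = stacked,
  prior = c(
    prior(normal(0, 1), nlpar = "alpha"),
    prior(normal(0, 1), nlpar = "beta0"),
    prior(normal(0, 1), nlpar = "beta1")
  )
)
```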

If the regressions are of different families, you could probably achieve something similar with nonlinear syntax while replacing v with an mvbind (approach 1 above), but I’ve never experimented with combining nonlinear syntax and mvbind myself.
