Predicting the variance, taking the mean into account

I am modelling the variance among individuals for a trait by sex, across a large number of categories. It is very similar to this example:

library(brms)
library(palmerpenguins)

penguin <- brm(
  bf(
    body_mass_g ~ 0 + sex + (0 + sex | i | year),
    sigma ~ 0 + sex + (0 + sex | i | year)
  ),
  family = 'lognormal', data = penguins, cores = 4
)

(This model doesn’t fit well, but my actual model fits fine.)

Both the mean and the standard deviation depend on sex, and they can vary across the categories (years in this case). In my real data, the mean and sigma are strongly related, and this correlation is captured by the | i | term.

Now, I want to use this model to get a posterior prediction of sigma for a previously unseen category. However, I want this prediction to be conditional on the mean, since the mean strongly affects the expectation for sigma. But when I run this code, the mean-variance relationship is not taken into account: the result is the same whether or not I include the mean:

posterior_predict(
  penguin, 
  newdata = data.frame(sex = 'male', year = 2024, body_mass_g = 1e20), 
  allow_new_levels = TRUE,
  dpar = 'sigma'
) |> posterior_summary()

The mean-variance correlation is basically zero in this toy example, but in my actual model it is very clearly not. The prediction is not using the response variable, which I think does make sense in general. But this leaves my question:

Is it possible to make use of the mean when predicting sigma?

body_mass_g is the outcome variable… I’m not sure how you are “condition[ing] on the mean” in your predict code here. In your model, I think what you have done is simply correlate mu[n] and sigma[n] via the varying intercepts and slopes (use the stancode() function on your fitted model, or the make_stancode() function on the example above). So when you use posterior_predict(), you are, I assume, making predictions from parameters that are correlated; you would need to dig into the details of the posterior_predict() function in brms to confirm. In any case, I would not expect the results of posterior_predict() to change when you include body_mass_g, the outcome variable, in the call.


Right, I think I understand the model, and I can see why posterior_predict() behaves this way.

I am just trying to figure out if I can make a prediction for the sigma of a group, and make use of the fact that I have an estimate of mu for that group, given that the model includes their correlation among groups. It seems to me that I can make a better prediction for sigma if I know mu.

Aren’t you already getting the predictive benefit of “knowing mu” for each group, in the sense that the posterior fit has already taken the correlation between mu and sigma into account, since you included that in the model?

Or, to put it another way, if you don’t model sigma at all, or model sigma without an induced correlation with mu, do you get worse predictions than with the model that includes correlations?


Since I’m interested in sigma (not mu), I only tried the second. And yes, the model is much better when including their correlation (at least the uncertainty around sigma for any particular group is smaller). So in that sense the model is useful.

But I am trying to describe what the expected sigma (and the associated uncertainty) is for an unseen group. In my real data the groups are genes, and the outcome is expression. So I want to describe what the expected variance in expression is for a gene. But because this depends strongly on the average gene expression, I would like to describe what the expected expression variance would be, given a certain average expression.

If I just ask for a prediction of sigma for an unseen gene, this is very uncertain, since the range of mean expression is very wide.

Hope that makes some sense…

What about using the predicted mean gene expression as a predictor of sigma? (I think this can be done in brms, but might require a nonlinear formula.) I’m not sure if this is kosher, but just thinking out loud.

Use posterior_linpred(... , re_formula = ~ 0, dpar = "sigma") to get the posterior linpred for sigma without the random effect component. Then add in the sigma margin of the multivariate random effect conditional on the known mu value (note that the distribution for a subset of variables from a multivariate normal, conditional on known values for another subset of variables, is itself a multivariate normal distribution; this page gives the bivariate case succinctly).
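For the bivariate case, that conditioning step has a simple closed form. A minimal numerical sketch of just that arithmetic (in Python/numpy for illustration; the sd, correlation, and offset values below are made up, and in practice would come from the model’s posterior draws of the group-level sd and cor parameters):

```python
import numpy as np

# Hypothetical posterior values for the group-level (random-effect)
# standard deviations of the mu and sigma offsets, and their correlation.
sd_mu, sd_sigma, rho = 0.30, 0.20, 0.85

# Known (estimated) mu offset for the new group, on the linear-predictor scale.
mu_offset = 0.45

# Bivariate-normal conditioning (both offsets have mean 0):
#   E[sigma_offset | mu_offset]  = rho * (sd_sigma / sd_mu) * mu_offset
#   SD[sigma_offset | mu_offset] = sd_sigma * sqrt(1 - rho^2)
cond_mean = rho * (sd_sigma / sd_mu) * mu_offset
cond_sd = sd_sigma * np.sqrt(1.0 - rho**2)

print(cond_mean, cond_sd)  # conditional mean and sd of the sigma offset
```

Adding this conditional sigma offset to the fixed-effect linear predictor from posterior_linpred() gives the group-specific sigma on the link scale. Doing the calculation per posterior draw, rather than once with posterior means, also propagates the uncertainty in the sd and correlation parameters.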


@Ax3man, can you please post the code you ended up using to implement the approach suggested by jsocolar?