Scaling the variance using brms

I often fit a model of the following form in Stan:

y_i \sim N(\mu_i, \sigma_i) \\ \mu_i = \alpha + \beta x_i \\ \sigma_i = \sigma/w_i

Here, w_i is passed in as data. Is there any way to specify this model in brms? I’m looking into “distributional models” but I don’t see a good way of specifying the standard deviation to be \sigma/w_i.

Related: is there a way to fit the “8-schools” example using brms? It seems like similar syntax could be used to pass in the standard deviation for each observation.

I just discovered the resp_se() function, which can be used to pass in a known standard error (ala 8 schools). For example:

brm(outcome | se(my_se) ~ 1, data = dat)

However it’s not clear to me if this syntax can be used to pass in w_i and have the standard error be \sigma/w_i. Setting sigma=TRUE does not do it.

1 Like

resp_se(x, sigma = T) gives you a model with residual standard deviation equal to the root sum of squares of sigma and x_i. If I understand correctly, you want is a model where the residual standard deviation is sigma times x_i. You can achieve this with brms via distributional regression (see vignette here Estimating Distributional Models with brms). By default, distributional regression uses a log link, so you would want to model log(sigma) as an intercept m plus -log(w_i), so that when you exponentiate you’ll get m/(w_i). Now you just need to pass w_i as a covariate and constrain its coefficient in the distributional regression to be -1, which you can do via the prior argument; see here

Thanks @jsocolar, that solution seems perfect! I didn’t realize that the distributional models default to a log link for the variance.

However, I was hoping for a more flexible solution, because I have a follow-up question: what if I want to define \sigma_i = \sqrt{\sigma^2/w_i + \tau^2} ?

Unfortunately I don’t think there’s a neat way to do this using the distributional approach. Unless there’s a way to specify the model for sigma as nonlinear?

A log link for the sd, I think.

It looks to me like you want the standard deviation to be the root-sum-of-squares of some constant term and a residual sigma. The residual sigma has the form given in the distributional regression solution, and the root-sum-of-squares part has the form of the resp_se() solution. I think it should be possible to combine these to get what you want, though I can’t promise that it works as intended.

Maybe @paul.buerkner can chime in about this, which if possible could be a more general and less error-prone way to get what you want.

1 Like

Yes, of course.

I think I see what you’re saying regarding combining the distributional regression with resp_se(). Something like this:

bf(Y | se(1, sigma = TRUE) ~ 1, sigma ~ log(w))

I need to study the resulting Stan code for awhile but on first glance it looks right. Either way, it would be nice if @paul.buerkner could chime in to confirm.

Thanks for your help!

Just realized that you don’t need resp_se. Since tau is constant, you just need to add an observation-level random effect with fixed sd tau.

Ah, that makes sense (and it’s really the reason that I want to include the \tau term). I guess the best way to include this effect to create on obs variable (1:n) and including (1|obs) in my code?

FYI my last suggestion wasn’t quite right, it implemented \sigma_i = \sqrt{1+\sigma^2/w_i}.

If tau is data, then just change bf(Y | se(1, sigma = TRUE) ~ 1, sigma ~ log(w)) to bf(Y | se(tau, sigma = TRUE) ~ 1, sigma ~ log(w)). If tau is a parameter to be estimated, then the observation-level random effect should do it. In effect you’ll be estimating two residuals, one that’s structured by w and one that’s unstructured. This might lead to lack of identifiability and very poor computation in a way that a nonlinear formula on sigma would not be prone to. So worth waiting to see what Paul says.

Your initial solution looks sensible to me from what I am understanding. I would have proposed the same.

This is exactly what I want to estimate: one part of the residual that is structured by w and the other part that is unstructured. I’ve estimated these models before using Stan but never knew how to do it in brms before. Thanks again for your help!

Let’s assume a simple model

y_{i}\sim N(\mu_i, ~\hat{\sigma}_i^2),\\ \mu_i\sim N(0, \tau^2),\\ i=1,2,..,n,

where \hat{\sigma}_i (labeled as SE in the data frame) is the measurement error for the i-th observation.

Which ones are correct among the following brms implementations? And what are their differences?

(1) brm(bf(y | se(SE, sigma = TRUE) ~ 1, sigma ~ 1), ...)
(2) brm(bf(y | se(SE, sigma = TRUE) ~ 1, sigma ~ 1+(1|obs)), ...)
(3) brm(y | se(SE, sigma = FALSE) ~ 1 + (1|obs), ...)
(4) brm(y | se(SE, sigma = TRUE) ~ 1, ...)

Hi @jsocolar @paul.buerkner ,

I’m resurrecting an old thread. Going back to the question in the original post (ignoring the \tau component), I have implemented the following:

brm(bf(Y ~ x, sigma ~ log(w), data = dat,
    prior = prior(constant(-1), class = “b”, dpar = “sigma”))

I think this should set up the model we talked about, where the residual for observation i is \sigma/w_i. Thanks again for this solution.

My new question is on how to set the prior for \sigma. For example, I would like this prior to be half normal. When I look at the default priors (using get_priors), I see that there is a default prior of prior(t(3, 0, 0.25), class = "intercept", dpar = "sigma"). My understanding is that this would be the prior for the intercept in the sigma model, which given the log link, means that the default prior on \sigma is exp(student-t). Complicating matters even further, when I look at the Stan code, it looks like it is centering all the log weights, which changes the interpretation of the intercept.

Is my understanding correct? If so, given the above, is there an easy way to set a prior directly on the \sigma parameter that is in my model?