# Retrieve sigma on original scale when sigma is a function of a predictor: Sigma ~ x - 1

I fit a model that has heteroskedasticity in brms where sigma is a function of a predictor. But when I generate data and then try to retrieve the parameter values, I’m not succeeding. This may be more of a statistical question than a technical one, but I’m not sure. Here is a very simple example to illustrate the problem:

R:

``````
library(brms)

n <- 100
x <- seq(1, 100, length.out = n)
my_sigma_x <- 1.2
sigma <- my_sigma_x * x
y <- rnorm(n, 0, sigma)
simple <- data.frame(x, y)

form <- bf(y ~ 1,
           sigma ~ x - 1)

m_simple <- brm(form, chains = 2, cores = 2, data = simple)
summary(m_simple)
``````

``````
 Family: gaussian
  Links: mu = identity; sigma = log
Formula: y ~ 1
         sigma ~ x - 1
   Data: simple (Number of observations: 100)
Samples: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 2000

Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    -0.25      0.48    -1.17     0.65        852 1.00
sigma_x       0.10      0.00     0.10     0.11       2000 1.00
``````

I expected sigma_x to be 1.2; instead it is 0.10. For other values of my_sigma_x, sigma_x is also off.

Below are some values of my_sigma_x that I tested and the corresponding sigma_x estimates from Stan.

``````
my_sigma_x   <- c(.0001, .001, .05, .1, .3, .6, .9, 1.2, 1.5, 2.5, 5)
stan_sigma_x <- c(-0.056, -0.03, .017, .02617, .05202, .076358, .0879, .10, .116, .14, .19)
d <- data.frame(my_sigma_x, stan_sigma_x)
``````
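For reference, the plot described below can be reproduced with base R (a sketch using the data frame `d` above):

```r
# stan_sigma_x against sqrt(my_sigma_x); roughly linear for the
# positive values of stan_sigma_x, as described below
plot(sqrt(d$my_sigma_x), d$stan_sigma_x,
     xlab = "sqrt(my_sigma_x)", ylab = "stan_sigma_x")
```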

Here’s a plot of this data: a square root transformation of my_sigma_x makes the relationship roughly linear for positive values of stan_sigma_x, but it is not one-to-one. I fit an OLS model to find the ratio between sqrt(my_sigma_x) and stan_sigma_x:

`summary(lm(stan_sigma_x ~ sqrt(my_sigma_x), data = d[3:11,]))`

``````
Call:
lm(formula = stan_sigma_x ~ sqrt(my_sigma_x), data = d[3:11, ])

Residuals:
       Min         1Q     Median         3Q        Max
-0.0064202 -0.0049437  0.0009736  0.0023291  0.0066590

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)      0.003885   0.003362   1.156    0.286
sqrt(my_sigma_x) 0.086104   0.002893  29.759 1.25e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.005219 on 7 degrees of freedom
Multiple R-squared:  0.9922,	Adjusted R-squared:  0.991
F-statistic: 885.6 on 1 and 7 DF,  p-value: 1.247e-08
``````

This describes the relationship between my_sigma_x and stan_sigma_x very well. The sqrt makes sense if variance and standard deviation are being confused somewhere, but what is the meaning of the coefficient on sqrt(my_sigma_x)?

What is going on here? Why is my_sigma_x not equal to sigma_x? Where is my misunderstanding?

• Operating System: mac os 10.13.6 High Sierra
• brms Version: 2.3.1

If you check the stan code with `make_stancode(form, data = simple)`, you’ll see that brms exponentiates sigma before returning it when you fit it. If you try exp(0.1), you’ll get a value of 1.11, which is pretty close to your value of 1.2.

Thanks for the suggestion, mathesong. I noticed the exp(sigma) in the code too (I think that’s to prevent sampling too close to 0?), but I don’t think that’s the solution. exp(0.1) is close to 1.2 by luck. For other values, the difference is substantial: exp(0.19) is not close to 5.
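One way to see the mismatch (a sketch, using the estimate of 0.10 from the summary above): under `sigma ~ x - 1` with brms’s log link, the fitted model implies sigma_i = exp(b * x_i), while the data were simulated with sigma_i = my_sigma_x * x_i, which is a different functional form:

```r
x <- seq(1, 100, length.out = 100)

# sigma implied by the fitted model under the log link (b_sigma_x ~ 0.10)
implied_sigma <- exp(0.10 * x)

# sigma used to simulate the data (my_sigma_x = 1.2)
true_sigma <- 1.2 * x

# One curve is exponential in x, the other linear, so no single
# transformation of the reported coefficient can recover my_sigma_x
round(cbind(x, implied_sigma, true_sigma)[c(1, 50, 100), ], 2)
```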

Here is the generated Stan code (with blank blocks removed).

``````
// generated with brms 2.3.1
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K_sigma;  // number of population-level effects
  matrix[N, K_sigma] X_sigma;  // population-level design matrix
  int prior_only;  // should the likelihood be ignored?
}
parameters {
  real temp_Intercept;  // temporary intercept
  vector[K_sigma] b_sigma;  // population-level effects
}
model {
  vector[N] mu = rep_vector(0, N) + temp_Intercept;
  vector[N] sigma = X_sigma * b_sigma;
  for (n in 1:N) {
    sigma[n] = exp(sigma[n]);
  }
  // priors including all constants
  target += student_t_lpdf(temp_Intercept | 3, 0, 10);
  // likelihood including all constants
  if (!prior_only) {
    target += normal_lpdf(Y | mu, sigma);
  }
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = temp_Intercept;
}
``````

It looks like sigma is exponentiated and then used as an argument to normal_lpdf.

If I change the code from:

`target += normal_lpdf(Y | mu, sigma);`

to:

`target += normal_lpdf(Y | mu, log(sigma));`

then run the model (via rstan):

`m2 <- stan("simple.stan", data = standata(m_simple))`

the sigma_x (here called “b_sigma”) value is correctly recovered:

``````
Inference for Stan model: simple.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

                  mean se_mean   sd    2.5%     25%     50%     75%   97.5%
temp_Intercept    1.14    0.02 0.91   -0.60    0.52    1.14    1.75    2.94
b_sigma           1.18    0.00 0.09    1.03    1.12    1.18    1.24    1.36
b_Intercept       1.14    0.02 0.91   -0.60    0.52    1.14    1.75    2.94
lp__           -524.96    0.02 0.99 -527.58 -525.37 -524.65 -524.24 -523.98

               n_eff Rhat
temp_Intercept  2863    1
b_sigma         2857    1
b_Intercept     2863    1
lp__            1630    1

Samples were drawn using NUTS(diag_e) at Thu Aug  2 10:49:13 2018.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
``````

I think this is a bug. Should I post it to https://github.com/paul-buerkner/brms?

This also seems to be a problem in example 6 of `?brm`: `sigma_xb` should be 2, but is about 0.8.

Changing to `target += normal_lpdf(Y | mu, log(sigma));` solves the problem in this simple example, but in the more complicated non-linear model that prompted this question, it causes problems when there is more than one `y` for a given value of `x`.

The chains fail to initialize because the scale parameter is not greater than 0. I’m trying to use the `init` argument to set initial values greater than 0 for sigma, but have not yet had success.

That’s intentional. To make sure that sigma is strictly positive regardless of what the linear predictor looks like, we have to apply the exponential function (or, equivalently, the log link). This is also documented in `?brmsfamily`.

You may change the applied link function via the family argument, for instance `brmsfamily("gaussian", link_sigma = "identity")`.
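Refitting the example above with that family could look like this (a sketch; note that with `sigma ~ x - 1` and an identity link the linear predictor for sigma can go negative, so initialization may fail):

```r
# identity link on sigma: the linear predictor is used as the
# standard deviation directly, without exponentiation
m_identity <- brm(form,
                  family = brmsfamily("gaussian", link_sigma = "identity"),
                  chains = 2, cores = 2, data = simple)
```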

Needing sigma to be strictly positive makes sense to me. Thanks for pointing me to brmsfamily, I didn’t realize you could set the link function for sigma.

Switching the link function for sigma to identity causes “Rejecting initial value” errors because the values are less than 0 (as you pointed out would happen, and which is the reason the default link for sigma is “log”).

I suppose my question is really, “How can I get sigma back to the original scale, after I fit a model with `brmsfamily("gaussian", link_sigma = "log")`?”

Extract the posterior samples via `posterior_samples`, transform them via `exp`, and then summarize via `posterior_summary`. But keep in mind that a linear effect on the log scale becomes a multiplicative effect after back-transforming to the identity scale.
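Applied to the example above, that could look like the following sketch (assuming the slope is named `b_sigma_x` in the posterior; check with `parnames(m_simple)`):

```r
# draws of the sigma slope on the log scale
draws <- posterior_samples(m_simple, pars = "b_sigma_x")[, 1]

# sigma on the original (standard deviation) scale for a chosen
# predictor value, e.g. x = 50: sigma(50) = exp(b_sigma_x * 50)
posterior_summary(exp(draws * 50))
```

Because the effect is multiplicative on the original scale, the back-transformed summary depends on the value of `x` you plug in; there is no single number comparable to `my_sigma_x`.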


Perfect. Thank you very much for your help. brms is an incredible package.