Retrieving sigma coefficients when sigma comes from a function

Hello,

I would like to retrieve the sigma parameters using a distributional model with brms. The data is generated as:

b0 = 0.8
b1 = 0.2
a = 0.5
h2 = 0.4
h = sqrt(h2)
e = sqrt(1-h^2)

E = rnorm(1000, 0, 1)
N = length(E)
g = rnorm(N,0,1)

sigma = sqrt((b0*e + b1*e*E)^2)
y = a*E + b0*h*g + b1*h*E*g + rnorm(N,0,sigma)
dat = data.frame(y, g, E)

print(paste0("b0*e = ", round(b0*e, 3), " and b1*e = ", round(b1*e, 3))) 
"b0*e = 0.62 and b1*e = 0.155"

This is the model I am running but it has the problem of negative values for the scale parameter:

f = bf(y ~ g + E + g * E, sigma ~ 1 + E)
fit = brm(f, data = dat, family = brmsfamily("gaussian", link_sigma = "identity"), chains = 1)
summary(fit)

If I try the non-linear approach the values are right but still got the scale issue:

f = bf(y ~ g + E + g * E) + 
    nlf(sigma ~ sqrt(vest)^2) + 
    lf(vest ~ 1  + E)

fit = brm(f, data = dat, 
    family = brmsfamily("gaussian", link_sigma = "identity"), chains = 1, 
      prior = c(prior(student_t(7, 0, 8), class='b', coef = 'E', nlpar = 'vest')))

SAMPLING FOR MODEL '8714549d61cef2c62abe24dcb0310027' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Scale parameter[1] is nan, but must be > 0!  (in 'model7d783aa92632_8714549d61cef2c62abe24dcb0310027' at line 42)
...

 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ g + E + g * E 
         sigma ~ sqrt(vest)^2
         vest ~ 1 + E
   Data: dat (Number of observations: 1000) 
Samples: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 1000

Population-Level Effects: 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept          0.00      0.02    -0.04     0.04 1.00      932      749
vest_Intercept     0.61      0.01     0.58     0.64 1.00      785      746
vest_E             0.16      0.01     0.14     0.17 1.00      813      546
g                  0.53      0.02     0.49     0.56 1.00      722      615
E                  0.50      0.01     0.47     0.52 1.00      844      765
g:E                0.13      0.01     0.11     0.16 1.00      711      813

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

For sure I am making a silly mistake. Any suggestions on how to specify the model to recover b0*e and b1*e for sigma?

Thank you in advance!

1 Like

The model above would be equivalent to the Stan code below, still I don’t know how to avoid the square root of negative values in order to get the coefficients defining sigma.

// generated with brms 2.15.0
functions {
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int<lower=1> K_sigma;  // number of population-level effects
  matrix[N, K_sigma] X_sigma;  // population-level design matrix
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  int Kc_sigma = K_sigma - 1;
  matrix[N, Kc_sigma] Xc_sigma;  // centered version of X_sigma without an intercept
  vector[Kc_sigma] means_X_sigma;  // column means of X_sigma before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
  for (i in 2:K_sigma) {
    means_X_sigma[i - 1] = mean(X_sigma[, i]);
    Xc_sigma[, i - 1] = X_sigma[, i] - means_X_sigma[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  vector[Kc_sigma] b_sigma;  // population-level effects
  real Intercept_sigma;  // temporary intercept for centered predictors
}
transformed parameters {
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = Intercept + Xc * b;
    // initialize linear predictor term
    vector[N] sigma = Intercept_sigma + Xc_sigma * b_sigma;
    for (n in 1:N) {
      // inverse function
      sigma[n] = sqrt(sigma[n])^2;
    }
    target += normal_lpdf(Y | mu, sigma);
  }
  // priors including constants
  target += student_t_lpdf(Intercept | 3, 0, 2.5);
  target += student_t_lpdf(Intercept_sigma | 3, 0, 2.5);
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  // actual population-level intercept
  real b_sigma_Intercept = Intercept_sigma - dot_product(means_X_sigma, b_sigma);
}
1 Like

I’ve only skimmed your Stan code, but it appears to me that you want

sigma[n] = sqrt(sigma[n]^2);

instead of

sigma[n] = sqrt(sigma[n])^2

Even if I’ve got that right, it may not fix your problem, but it might be a start.

Kent

Thanks for your reply, @kholsinger !

The model below using brms gives me a good approximation of the coefficients that define sigma (vest):

f1 = bf(y ~ g + E + g * E) + 
    nlf(sigma ~ (sqrt(vest))^2) + 
    lf(vest ~  1 + E)


Population-Level Effects: 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept          0.04      0.02    -0.00     0.08 1.00     7694     6459
vest_Intercept     0.60      0.01     0.58     0.63 1.00     7736     5898
vest_E             0.15      0.01     0.13     0.17 1.00     6958     6162
g                  0.49      0.02     0.45     0.53 1.00     7650     5821
E                  0.53      0.02     0.50     0.56 1.00     7592     5897
g:E                0.13      0.02     0.09     0.16 1.00     7499     6472

But I get these messages:

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Scale parameter[1] is nan, but must be > 0!  (in 'model65de1eb63ec1_a2b8f3797cffb15953b3bf0a1f419319' at line 42)

If I use:

f2 = bf(y ~ g + E + g * E) + 
    nlf(sigma ~ sqrt(vest^2)) + 
    lf(vest ~  1 + E)

Population-Level Effects: 
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept         -0.74      0.34    -1.12    -0.39 3.04        2       11
vest_Intercept     1.24      2.00    -0.34     6.25 3.04        2       11
vest_E            -3.25      3.82   -13.38    -0.49 3.04        2       11
g                  0.11      1.55    -1.44     1.66 3.04        2       11
E                 -0.15      0.52    -0.67     0.37 3.03        2       11
g:E                1.69      0.26     1.43     1.96 3.04        2       11

The coefficients are way off. As far as I understand, the sigma function should include the inverse function.

Thank you again!
Sebastian

Hi,
I think the main problem is probably this:

This is equivalent to taking the absolute value of vest, hower, absolute value is a problem for Stan as it makes the posterior not smooth (the derivative is not continuous).

I understand you use it to be able to work with the “identity” link, but what is the reason you don’t want to use the default “log” link (which is used precisely because it avoids those issue)? Maybe understanding the motivation will help us find a better solution for you.

Alternatively you can use the “softplus” link (Rectifier (neural networks) - Wikipedia), which behaves almost like identity for positive values of the linear predictor.

Best of luck with your model!