Modelling sigma

di4oi4 · October 26, 2020, 11:10am

Is the following a correct approach for sigma modelling?

Let’s assume we have a Y variable named hours in lognormal scale. We would like to know how these hours changed in time (variable named year)

Data

set.seed(0)
pi ← 0
mu_log ← 2
sigma_log ← 0.99
N = 1000
hours = (1 - rbinom(N, 1, prob = pi)) * rlnorm(N, mu_log, sigma_log)
year = seq(1,10, 1)
df = data.frame(hours=hours, year=year)

Model

m = brm(bf(hours ~ year, sigma ~ year), data = df, family = lognormal())

Summary

Family: lognormal
Links: mu = identity; sigma = log
Formula: hours ~ year
sigma ~ year
Data: df (Number of observations: 1000)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           2.06      0.07     1.92     2.19 1.00     3742     2379
sigma_Intercept    -0.01      0.05    -0.10     0.09 1.00     5050     3068
year               -0.01      0.01    -0.03     0.01 1.00     4144     2783
sigma_year         -0.00      0.01    -0.02     0.02 1.00     5213     3302

Temporal trend for mean hours

m %>% conditional_effects()

Temporal trend for the standard deviation of hours?

m %>% conditional_effects(dpar = “sigma”)

Can we say that the standard deviation of hours did not change during the observed years, meaning that variation remained unchanged? And am I correct that these hours on the sigma plot is on the original scale of Y variable(hours)?

martinmodrak · November 11, 2020, 11:24am

Sorry for taking long to answer - your question is relevant and well written.

That would be a bit of a simplification. Eyeballing from the plots the largest change the data are roughly consistent with (i.e. within the 95% credible interval) is sigma shifting from 1.07 to 0.93 or the other way around. Whether this is small enough to be negligible in practice really depends on your application. Also it is possible that the model is a bad fit for the data and sigma could for example actually oscillate in a sine-wave but as sine wave has no overall trend, fitting a linear model would give you “no change”. (obviously your simulated data have sigma as constant, so this concern is only valid for some real data I presume you want to analyze). Posterior predictive checks let you check if this is indeed an issue (e.g. using the stat_grouped check with stat = sd and grouped by year, see pp_check.brmsfit for details) .

That is IMHO incorrect - the plot AFAIK shows the sigma parameter for the lognormal family, which is the standard deviation of the log. (note that the estimate for sigma matches the sigma_log you used to simulate the data)

Best of luck with your model!

di4oi4 · November 29, 2020, 1:03pm

Thank you for noting that! Seems that different functions do not automatically convert this back to its original scale (conditional_effects(), add_fitted_draws()). Which formula should I use to get sigma in original scale?

martinmodrak · November 29, 2020, 2:54pm

There is AFAIK no built-in way to do this, but it should be relatively easy to do by directly processing he fitted samples. It is also a bit unclear on what should “sigma on the original scale” actually mean (which is IMHO why it is not reported by default). You could either take s1 = exp(sigma) which would mean that the 95% interval around mean mu[i] is mu[i] / s1^1.96 - mu[i] * s1^1.96. Or you culd be interested in the standard deviation of the response, but that is not constant and depends on mean (see wiki https://en.wikipedia.org/wiki/Log-normal_distribution for the formula).

Does that make sense?

di4oi4 · November 29, 2020, 3:22pm

Thank you very much Martin!

Considering the model:
hours ~ year, sigma ~ year, data = df, family = lognormal

Do I understand correctly that this model allows examining the standard deviation of these hours in time; however, the model’s sigma estimates do not equal to corresponding estimates on original data (e.g. sd(df$hours) calculated for each year)? Meaning that these two estimates are not exactly the same.

Which formula would you use to convert the sigma values of fitted posterior to the scale of hours variable?

Fitted posterior looks like this:

m %>% posterior_samples()

I tried the following. Is this correct?

newdata = df %>% data_grid(year)

m %>% add_fitted_draws(newdata = newdata, re_formula = NULL, scale = "linear", dpar = "sigma", value = "mu")  %>% mutate(sd_in_original_scale = exp((mu)+0.5*((exp(sigma))^2))*(sqrt(exp((exp(sigma))^2)-1)))

The mean of sd_in_original_scale is quite similar to sd(df$hours): 15.03 and 15.4, respectively.

StaffanBetner · November 29, 2020, 8:27pm

You could model the mean and the standard deviation of the lognormal distribution with some reparametrization. To do that in brms you have to specify a custom family. I submitted an issue to the brms github repo a while back doing exactly that. https://github.com/paul-buerkner/brms/issues/1027

Here is the code, which is slightly rewritten to take into account that one might want to model the standard deviation as well:

custom_family(name = "lognormal7", dpars = c("mu", "sigma"), 
              links = c("log", "log"), 
              lb = c(0, 0), type = "real") ->
  lognormal7


stan_funs <- "
  real lognormal7_lpdf(real y, real mu, real sigma) {
    return lognormal_lpdf(y | log(mu/sqrt(1+sigma^2/mu^2)), sqrt(log(1+sigma^2/mu^2)));
  }
  real lognormal7_rng(real mu, real sigma) {
    return lognormal_rng(log(mu/sqrt(1+sigma^2/mu^2)), sqrt(log(1+sigma^2/mu^2)));
  }
"


stanvars <- stanvar(scode = stan_funs, block = "functions")

log_lik_lognormal7 <- function(i, prep) {
  mu <- prep$dpars$mu[, i]
  if(NCOL(prep$dpars$sigma)==1){sigma <- prep$dpars$sigma}else
  {sigma <- prep$dpars$sigma[, i]}   ## [, i] if sigma is modelled, without otherwise
  y <- prep$data$Y[i]
  lognormal7_lpdf(y, mu, sigma)
}


posterior_predict_lognormal7 <- function(i, prep, ...) {
  mu <- prep$dpars$mu[, i]
  if(NCOL(prep$dpars$sigma)==1){sigma <- prep$dpars$sigma}else
  {sigma <- prep$dpars$sigma[, i]}   ## [, i] if sigma is modelled, without otherwise
  lognormal7_rng(mu, sigma)
}

posterior_epred_lognormal7 <- function(prep) {
  mu <- prep$dpars$mu
  return(mu)
}

and how to use it:

brm(data = mtcars, formula = bf(mpg ~ wt), 
    family = lognormal7, stanvars = stanvars) -> model_test1

brm(data = mtcars, formula = bf(mpg ~ wt, sigma ~ wt), 
    family = lognormal7, stanvars = stanvars) -> model_test2

expose_functions(model_test1, vectorize = TRUE) # exposes the Stan functions to R
model_test1 %>% conditional_effects()
model_test2 %>% conditional_effects()

model_test1 %>% conditional_effects(method = "posterior_predict")
model_test2 %>% conditional_effects(method = "posterior_predict")

loo(model_test1, model_test2)

di4oi4 · November 30, 2020, 1:08pm

Thank you very much for providing one possible solution!

Some models run with the following errors on my own data and some don’t. What may be the problem?

Model structures are similar to this:

m = brm(bf(outcome ~ p1+ p2+ p3+ p4+ p5+ p6+ p7+ (1 | region), sigma ~ p1+ p2+ p3+ p4+ p5+ p6+ p7+ (1 | region)), family = lognormal7, chains = 3, cores = 3, data = data, stanvars = stanvars)

Errors

Click the Refresh button to see progress of the chains
starting worker pid=5980 on localhost:11689 at 14:59:49.828
starting worker pid=3820 on localhost:11689 at 14:59:50.206
starting worker pid=14536 on localhost:11689 at 14:59:50.548

SAMPLING FOR MODEL '68e2a36fcf46bf5a362f610e63a9307f' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: Exception: lognormal_lpdf: Scale parameter is 0, but must be > 0!  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 5)
  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 98)

Chain 3: Rejecting initial value:
Chain 3:   Error evaluating the log probability at the initial value.
Chain 3: Exception: Exception: lognormal_lpdf: Scale parameter is 0, but must be > 0!  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 5)
  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 98)

Chain 3: Rejecting initial value:
Chain 3:   Error evaluating the log probability at the initial value.
Chain 3: Exception: Exception: lognormal_lpdf: Location parameter is -inf, but must be finite!  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 5)
  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 98)

Chain 3: Rejecting initial value:
Chain 3:   Error evaluating the log probability at the initial value.
Chain 3: Exception: Exception: lognormal_lpdf: Scale parameter is 0, but must be > 0!  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 5)
  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 98)

Chain 2: Rejecting initial value:
Chain 2:   Error evaluating the log probability at the initial value.
Chain 2: Exception: Exception: lognormal_lpdf: Scale parameter is 0, but must be > 0!  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 5)
  (in 'modelb206470786e_68e2a36fcf46bf5a362f610e63a9307f' at line 98)

Chain 2: 
Chain 2: Initialization between (-2, 2) failed after 100 attempts. 
Chain 2:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[2] "In addition: Warning messages:"                                        
[3] "1: package 'rstan' was built under R version 3.6.2 "                   
[4] "2: package 'StanHeaders' was built under R version 3.6.2 "             
[5] "3: package 'ggplot2' was built under R version 3.6.2 "                 
[1] "error occurred during calling the sampler; sampling not done"

StaffanBetner · November 30, 2020, 1:20pm

I have encountered problems like that as well. Maybe some weak priors could help? I don’t know. I will try to make some simple model that fails and see if I can solve it.

di4oi4 · November 30, 2020, 3:29pm

Specifying inits = “0” within brm() function would be a solution?

Would it be possible to make the reparametrization in a way that it would give a median instead of a mean?

StaffanBetner · November 30, 2020, 8:34pm

Might work, haven’t tried yet! Did it work for you? I think the error is a bit weird since both \sigma and \mu on the natural scale is confined to be positive with the log link.

di4oi4 · December 1, 2020, 10:20am

Yeah, this seemed to work yesterday and today also. No errors with two different laptops. Just brm(…, inits = “0”)

Your provided solution is very good, thank you again! However, I have played for hours and hours with R to figure our how to manually convert brms default lognormal model estimates to their natural scale. Unfortunately, I have failed hundreds of times :)

According to the ProbOnto reference, your model does the following re-parametrization: LN7 > LN1

Then the default brms lognormal family should be LN1? However, when using the formula LN1 > LN7 for transforming the model posterior of default brms family, I do not get similar results to your model. My approach seems to be fundamentally wrong… Could you give some hints?

	original data	manually transformed	your re-parametrisation
mean	11.9	7.3	11.8
sd	15	0.085	15.4

Manual transformation was as follows

m = brm(bf(hours ~ 1, sigma ~ 1), data = df, family = lognormal())

df6 = m %>% posterior_samples()

mu = mean(df6$b_Intercept)

sigma = mean(df6$b_sigma_Intercept)

#LN1 -> LN7
mean = exp(mu+1/2*sigma^2)

#LN1 -> LN7
sd = exp(mu + 0.5*sigma^2)*sqrt(exp(sigma^2)-1)

StaffanBetner · December 1, 2020, 10:56am

Hint: Yes, it is.

Jokes aside, there are two errors that I see.

Since the sigma is modeled with a log link (sigma ~ 1) you need to inverse that link for the posterior values of the sigma intercept in this case. If sigma ~ 1 is omitted then the sigma will be provided on its proper scale. If you want to model it you have to get the posterior fitted values for both mu and sigma.
The other error (which have much smaller impact in this specific case) is that you calculate the mean of the posterior mu_log and sigma_log (LN1 parametrization) and then calculate the inverse of the “link” (exp(\mu+\sigma^2/2), when you should do it the other way around, i.e. the mean of the posterior values of exp(\mu+\sigma^2/2). This can be summarized in the expression E(g(X)) \ne g(E(X)). See my example code below:

pacman::p_load(tidyverse, magrittr, janitor, brms) # or just library(...) every package

mu = 1
sigma = 1
(mu_log = log(mu/sqrt(1+sigma^2/mu^2)))
(sigma_log = sqrt(log(1+sigma^2/mu^2)))

rnorm(1000000, mean = mu_log, sd = sigma_log) %>% exp() %>% mean() # roughly 1, as expected
rnorm(1000000, mean = mu_log, sd = sigma_log) %>% exp() %>% sd() # roughly 1, as expected

dataset = data.frame(y = rnorm(10000, 
                               mean = mu_log, 
                               sd = sigma_log) %>% 
                       exp())

brm(formula = y ~ 1, data = dataset, family = lognormal()) -> model1

summary(model1)

model1 %>% posterior_samples() %>% as_tibble() %>% clean_names() %>% select(sigma:intercept) -> posteriorvalues
mu_log_posterior = posteriorvalues %>% pull(intercept) # no exp, since it is on the identity scale
sigma_log_posterior = posteriorvalues %>% pull(sigma) # on LN1's natural scale, if "sigma ~ 1" then %>% exp() on the end

# Then transform the posterior values to the natural parameters
mu_natural_posterior = exp(mu_log_posterior+sigma_log_posterior^2/2)
mu_natural_posterior %>% summary

sigma_natural_posterior = sqrt((exp(sigma_log_posterior^2)-1)*exp(2*mu_log_posterior+sigma_log_posterior^2))
sigma_natural_posterior %>% summary

# Compare, not very different, but not equal
exp(mean(mu_log_posterior)+mean(sigma_log_posterior)^2/2)
mean(exp(mu_log_posterior+sigma_log_posterior^2/2))

I am also glad to hear that this worked! I tried some models that was probably too difficult to estimate even with heavy priors and initialization at zero, but got lots of diverging transitions. Unsure whether they were true or false though.

di4oi4 · December 1, 2020, 1:15pm

Works like charm! Thank you!

Why these formulae do not match exactly, meaning the part …exp(2*mu…? Matching these formlae gave different results. Or these things are not as straight forward as I expect.

And when I have a model with sigma ~1, should I exp() its posterior values also for other transformations? E.g. LN1 > LN4 to get coefficient of variation.

StaffanBetner · December 1, 2020, 2:14pm

They are equal. \sqrt{exp(2\mu+\sigma^2)}=e^{(2\mu+\sigma^2)^\frac{1}{2}}=e^{\frac{1}{2}(2\mu+\sigma^2)}=e^{\mu+\frac{\sigma^2}{2}}

Depends on the link-function. exp() is the “anti-log”, so it is suitable for parameters with log-links. I suppose log link might be suitable for coefficient of variation as well? But I don’t know.

Topic		Replies	Views
Retrieving sigma coefficients when sigma comes from a function Modeling specification , brms	9	1025	April 28, 2021
Retrieve sigma on original scale when sigma is a function of a predictor: Sigma ~ x - 1 brms	8	2593	August 2, 2018
Transform sigma values (log) to original scale General covariance , brms	1	75	February 19, 2025
Modelling sigma brms	2	276	February 28, 2024
Checking interpretation of sd parameter brms interpret-results	15	2742	July 11, 2019

Modelling sigma

Related topics