Fit a robust regression with mixture of normals -- sampling errors

Hi,
I am trying to fit a robust regression using code from https://jrnold.github.io/bayesian_notes/robust-regression.html, however I am getting sampling errors below. Is this an error with my installation or the model please. Thank you.

# SAMPLING FOR MODEL '92be42a71d12c73fccc7fa1ae9581a37' NOW (CHAIN 1).
# Chain 1: Rejecting initial value:
#   Chain 1:   Error evaluating the log probability at the initial value.
# Chain 1: Exception: gamma_lpdf: Shape parameter is -0.39896, but must be > 0!  (in 'model20e226d00e0a_92be42a71d12c73fccc7fa1ae9581a37' at line 35)
# 
# Chain 1: Rejecting initial value:
#   Chain 1:   Error evaluating the log probability at the initial value.
# Chain 1: Exception: gamma_lpdf: Shape parameter is -0.188494, but must be > 0!  (in 'model20e226d00e0a_92be42a71d12c73fccc7fa1ae9581a37' at line 35)
# 
# Chain 1: Rejecting initial value:
#   Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
# Chain 1:   Stan can't start sampling from this initial value.
# Chain 1: Rejecting initial value:
# Chain 1:   Error evaluating the log probability at the initial value.
# Chain 1: Exception: exponential_lpdf: Random variable is -0.239801, but must be >= 0!  (in 'model20e226d00e0a_92be42a71d12c73fccc7fa1ae9581a37' at line 32)
# 
...
...
...

This is code to reproduce

txt=
'
// Linear Model with Student-t Errors
data {
  int N;
  vector[N] y;
  int K;
  matrix [N, K] X;
  real scale_alpha;
  vector[K] scale_beta;
  real loc_sigma;
}
parameters {
  real alpha;
  vector[K] beta;
  real sigma;
  vector[N] inv_lambda2;
  real nu;
}
transformed parameters {
  vector[N] mu;
  vector[N] omega;
  for (n in 1:N) {
    omega[n] = sigma / sqrt(inv_lambda2[n]);
  }
  mu = alpha + X * beta;
}
model {
  real half_nu;
  alpha ~ normal(0.0, scale_alpha);
  beta ~ normal(0.0, scale_beta);
  sigma ~ exponential(loc_sigma);
  nu ~ gamma(2, 0.1);
  half_nu = 0.5 * nu;
  inv_lambda2 ~ gamma(half_nu, half_nu);
  y ~ normal(mu, sigma);
}
'
stan(model_code=txt, 
data = list(y=mtcars$mpg, X=cbind(mtcars$wt, mtcars$disp), 
                 K=2, N=nrow(mtcars),
                 scale_alpha=10, scale_beta=c(10, 10), loc_sigma=1), 
                 chains=1, seed=1)

My sessionInfo()

R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8   
 [6] LC_MESSAGES=en_GB.UTF-8    LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rstan_2.19.2       ggplot2_3.2.1      StanHeaders_2.19.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2         rstudioapi_0.10    magrittr_1.5       tidyselect_0.2.5   munsell_0.5.0      colorspace_1.4-1   R6_2.4.0          
 [8] rlang_0.4.0        dplyr_0.8.3        tools_3.6.2        parallel_3.6.2     pkgbuild_1.0.5     grid_3.6.2         gtable_0.3.0      
[15] loo_2.1.0          cli_1.1.0          withr_2.1.2        matrixStats_0.55.0 lazyeval_0.2.2     assertthat_0.2.1   tibble_2.1.3      
[22] crayon_1.3.4       processx_3.4.1     gridExtra_2.3      purrr_0.3.2        callr_3.3.1        codetools_0.2-16   ps_1.3.0          
[29] inline_0.3.15      glue_1.3.1         compiler_3.6.2     pillar_1.4.2       prettyunits_1.0.2  scales_1.0.0       stats4_3.6.2      
[36] pkgconfig_2.0.2   

Hi David! Welcome to the Stan forums!

Try changing these lines:

to:

real<lower=0> sigma; 
vector<lower=0>[N] inv_lambda2; 
real<lower=0> nu;

Cheers!
Max

1 Like

Great, thanks Max, that allowed samples to be drawn. May I ask a quick follow up please; given that nu and inv_lambda2 are drawn from a gamma distribution, and sigma from the exponential, all which have support >= 0 , why are the parameter limits also required?

Hey!

I’m not an expert on HMC, so the following is based more on intuition, so take it with a pinch of salt. ;)

Stan’s initial proposals for parameter values usually come from the (-2,2) interval. That means that initially proposals for, e.g. sigma, could be negative and when evaluated in the model block, this leads to \log(0), i.e. minus infinity, evaluations. This can also be seen in the error messages that you posted. Constraining the parameters (in this case, log-transform) will ensure, that the (unconstrained) initial values will usually evaluate to something valid (not necessarily something reasonable though). The declaration of the constraint also deals with the Jacobian correction, which is necessary in this case and quietly adds it to the target.

Remember that Stan is not “drawing” from the distributions in the model block, unlike a Gibbs sampler for example, but rather evaluates a target function (usually log-likelihood, log-priors, Jacobian corrections, minus some constants usually).

I hope this makes sense.

1 Like

I need to read up as my thinking is still in the “drawing” samples. Again, thanks Max. I do appreciate your time. I

To be fair, I think most of the time it’s fine to think of “drawing” from priors. It’s just cases like this where you see the difference.

Cheers! :)