'Pedantic Mode' is ready for your testing and feedback

I was just working on a model and wrote:

real<lower = 0.0, upper = 0.0> f;

(lower and upper were the same value). I think this should throw a warning.

Also:

real<lower = 0.0, upper = -1.0> f;

This should probably just be an error, pedantic mode or not.


The parameter f was declared but does not participate in the model.

I’d reword this as “The parameter f was declared but is not used in the model.”


The parameter sigma has 0 priors.

I’d reword this as “The parameter sigma has no prior.” But the current wording also covers the case where you give sigma two priors, so I dunno, maybe just keeping it simple is better.


Warning:
  The parameter sigma has 2 priors.
Warning at '/tmp/RtmpMOiqOH/model-3e237eb0f24e.stan', line 11, column 2 to column 23:
  The parameter sigma is on the left-hand side of more than one twiddle
  statement.

Do these need to be two separate warnings?


For the code:

functions {
  real func(real b) {
    if(b > 0.0) {
      return(1.0);
    } else {
      return(0.0);
    }
  }
}
data {
  int N;
  real x[N];
}
parameters {
  real<lower = 0.0> sigma;
  real<lower = 0.0> f;
}
model {
  real mu;
  x ~ normal(mu, func(sigma));
}

I don’t get a warning about the if statement on b, even though b is the parameter sigma at the call site (this comes up in ODE stuff a lot).


The model here has a few things going on.

Here’s a copy-paste model:

data {
  int<lower=0> N;      // number of data items
  int<lower=0> K;      // number of predictors
  int<lower=0> J;      // number of classrooms
  int<lower=0> S;      // number of schools
  vector[N] y;         // outcome vector
  vector[K] x[N];      // predictor matrix
  int classroom[N];    // numeric classroom identifier
  int school[N];       // numeric school identifier
}

parameters {
  real alpha;                   // intercept
  vector[K] beta;               // coefficients for predictors
  real<lower=0> sigma;          // error sd
  vector[J] b_raw;              // classroom random effects
  real<lower=0> tau;            // classroom sd
  vector[S] c_raw;              // school random effects
  real<lower=0> nu;             // school sd
}

transformed parameters {
  vector[J] b = b_raw * tau;
  vector[S] c = c_raw * nu;
}

model {
  real yHat[N];
  for(i in 1:N){
    yHat[i] = alpha + dot_product(x[i], beta) + b[classroom[i]] + c[school[i]];
    tau ~ normal(0,1);
    nu ~ normal(0,1);
    sigma ~ normal(0,1);
  }
  y ~ normal(yHat, sigma);       // likelihood
  b_raw ~ normal(0, 1);
  c_raw ~ normal(0, 1);
  beta ~ normal(0,1); 
}

Warnings are:

Compiling Stan program...
Warning:
  The parameter alpha has 0 priors.
Warning:
  The parameter b has 0 priors.
Warning:
  The parameter c has 0 priors.
Warning:
  The parameter nu has 0 priors.
Warning:
  The parameter sigma has 0 priors.
Warning:
  The parameter tau has 0 priors.
Warning at '/tmp/RtmpMOiqOH/model-3e233ba4a84f.stan', line 37, column 13 to column 17:
  The variable yHat may not have been initialized before its use.

tau, nu, and sigma all have priors if N > 0 (the statements sit inside the loop). I guess the check would need to assume the loops run at least once?

b and c are transformed parameters, so I don’t think we should assume by default that they need priors.
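
For reference, here is a sketch of the model block with the three priors pulled out of the loop, meant as a drop-in replacement for the model block above. It has not been run through pedantic mode. Note that it changes the posterior, since the original applies each of those priors N times. Swapping the yHat array for a vector[N] is a choice made for this sketch and not part of the original model.

```stan
model {
  vector[N] yHat;  // sketch only: vector instead of the original real array

  // Linear predictor, filled in element by element.
  for (i in 1:N) {
    yHat[i] = alpha + dot_product(x[i], beta) + b[classroom[i]] + c[school[i]];
  }

  // Priors stated once, outside the loop.
  tau ~ normal(0, 1);
  nu ~ normal(0, 1);
  sigma ~ normal(0, 1);
  b_raw ~ normal(0, 1);
  c_raw ~ normal(0, 1);
  beta ~ normal(0, 1);

  y ~ normal(yHat, sigma);  // likelihood
}
```

alpha still has no prior here, so that warning would remain legitimate.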


It is suggested to replace lkj_corr with lkj_corr_cholesky, the Cholesky
  factor variant. lkj_corr tends to run slower, consume more memory, and has
  higher risk of numerical errors.

I think the first sentence should be “Reparameterizing the model to use lkj_corr_cholesky instead of lkj_corr can help the model run faster with less risk of numerical errors”. Somehow it should emphasize that the model itself has to change, not just the distribution name.
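
To illustrate why the message should mention reparameterizing, here is a minimal sketch of what a Cholesky version tends to look like. The data names (N, K, y) and the priors on mu and sigma are made up for the example. The parameter changes type from corr_matrix to cholesky_factor_corr and the likelihood switches to multi_normal_cholesky, so it is more than a find-and-replace.

```stan
data {
  int<lower=1> N;   // hypothetical: number of observations
  int<lower=2> K;   // hypothetical: dimension of each observation
  matrix[N, K] y;   // hypothetical: one observation per row
}
parameters {
  vector[K] mu;
  vector<lower=0>[K] sigma;
  cholesky_factor_corr[K] L_Omega;  // replaces corr_matrix[K] Omega
}
model {
  mu ~ normal(0, 5);
  sigma ~ normal(0, 1);
  L_Omega ~ lkj_corr_cholesky(2);   // replaces Omega ~ lkj_corr(2)

  for (n in 1:N) {
    // y[n] is a row vector, so transpose it to match mu.
    y[n]' ~ multi_normal_cholesky(mu, diag_pre_multiply(sigma, L_Omega));
  }
}
generated quantities {
  // Recover the correlation matrix if it is needed downstream.
  matrix[K, K] Omega = multiply_lower_tri_self_transpose(L_Omega);
}
```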


A distribution argument 0.001 is less than 0.1 or more than 10 in
  magnitude. This suggests that you might have parameters in your model that
  have not been scale to roughly order 1. We suggest rescaling using a
  multiplier; see section 22.12 of the manual for an example.

I’d make the thresholds more extreme: lower the small one and raise the large one, to at least 0.01 and 100.
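
For context, this is roughly the kind of rescaling the message is pointing at, written as a hand-rolled multiplier rather than the example from the manual. It is a sketch under assumptions: y_scale is a hypothetical data input giving the rough magnitude of y (say 0.001), and the unit-scale priors are placeholders.

```stan
data {
  int<lower=1> N;
  vector[N] y;
  real<lower=0> y_scale;     // hypothetical: rough magnitude of y, e.g. 0.001
}
parameters {
  real mu_raw;               // location on a roughly order-1 scale
  real<lower=0> sigma_raw;   // scale on a roughly order-1 scale
}
transformed parameters {
  // Multiply back up to the scale of the data.
  real mu = y_scale * mu_raw;
  real<lower=0> sigma = y_scale * sigma_raw;
}
model {
  mu_raw ~ normal(0, 1);
  sigma_raw ~ normal(0, 1);  // half-normal because of the lower bound
  y ~ normal(mu, sigma);
}
```

The sampler only sees mu_raw and sigma_raw, and no distribution argument is a very small or very large literal.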


Looking at the model here (with some stuff commented out to avoid syntax errors):

functions {
  real smvn_lpdf(vector as, vector mus, matrix covmats, vector omega, real delta, 
                 matrix all_Xs, matrix all_X2, vector all_P) { 
    
    // Define additional local variables
    vector[256] IMP; 
    real IV; 
    real logK; 
    real logW;  
    real lprob;  
    
    IMP = 1/(2*delta)*all_X2*omega - 1/delta*all_P;
    IV = log_sum_exp(1/delta*all_Xs*as + IMP);
    logK = log_sum_exp(IMP + 1/delta*all_Xs*mus 
                       + 1/(2*delta^2)*diagonal(all_Xs*covmats*all_Xs));
    logW = IV - logK;
    
    // calculate the log likelihood value
    lprob = logW + multi_normal_lpdf(as | mus, covmats);
    
    // return the log likelihood value
    return lprob;
  }
}  

data {
  int<lower=1> J; 
  int<lower=1> R; 
  int<lower=1> S[R]; 
  int<lower=3> C; 
  int<lower=1> W; 
  int<lower=1> N; 
  vector<lower=1,upper=C>[N] Y;  
  matrix[N*3, J-1] Xs;  
  matrix[N*3, W] X2; 
  vector[N*3] P;  
  matrix[256, J-1] all_Xs; 
  matrix[256, W] all_X2; 
  vector[256] all_P; 
}

parameters {
  vector[J-1] mus;
  vector<lower=0>[J-1] sigmas;
  corr_matrix[J-1] Thetas;
  vector[W-1] omegas;
  real<lower = 0> delta;
}

transformed parameters {
  vector[J] omega = append_row(omegas, -sum(omegas));
  cov_matrix[J-1] covmats = quad_form_diag(Thetas, sigmas);  
}

model {
  int yr;
  int xr;
  yr = 1;
  xr = 1;
  
  // hyperpriors
  mus ~ normal(0, 1000);
  sigmas ~ cauchy(0, 2.5); 
  Thetas ~ lkj_corr(2);
  
  // priors
  omegas ~ normal(0, 1000);
  delta ~ gamma(0.001, 1000);
  
  // likelihood
  for (r in 1:R) { // for each individual
    int ypos;
    int xpos;
    vector[J-1] as;
    vector[J-1] alphas;

    ypos = yr;
    xpos = xr;
    as ~ multi_normal(mus, covmats);
    //target += smvn_lpdf(alphas | mus, covmats, omega, delta, all_Xs, all_X2, all_P);
    
    for (s in 1:S[r]) { // for each choice task
    //Y[ypos] ~ categorical_logit((block(Xs, xpos, 1, 2, 6)*alphas+block(X2, xpos, 1, 2, 27)*omega-1/delta*segment(P, xpos, C))'); 
    ypos = ypos + 1;
    xpos = xpos + C;
    }
    yr = yr + S[r];
    xr = xr + S[r]*C;
  }
}

For the snippet in the model block:

vector[J-1] as;
...
as ~ multi_normal(mus, covmats);

The warning is:

Warning at '/tmp/RtmpMOiqOH/model-3e231c50f7f3.stan', line 80, column 24 to column 30:
  The variable alphas may not have been initialized before its use.

I like that this is a warning (it wasn’t caught in the original thread!) but maybe the message could be more specific?

There are also these warnings which I think aren’t right:

Warning:
  The parameter Thetas has 0 priors.
Warning:
  The parameter covmats has 0 priors.
Warning:
  The parameter sigmas has 0 priors.

Thetas and sigmas have priors, and covmats is a transformed parameter, so it shouldn’t need one.


Enough for now. Will probably try some more later. This is very handy! I found at least one bug (an unconstrained standard deviation) in a model while searching the forums for examples.

I wonder about the aggressive warnings on variables initialized in loops though.

There were a lot of cases where code like this:

real a[N];
for(i in 1:N) {
  a[i] = whatever(i);
}

threw warnings that a might not be initialized.

Edit: And there were lots of these. Maybe these warnings should be less aggressive?
