Eight schools model generates three stanc warnings

The canonical eight schools model generates three stanc warnings:

Warning:
  The parameter mu has no priors.
Warning:
  The parameter tau has no priors.
Warning:
  The parameter theta was declared but was not used in the density
  calculation.

This model is used in the RStan and PyStan documentation. It’s the first model which new users encounter.

I think it would be better if the first model users encounter did not generate warnings.

Here’s the model, for reference:

// saved as schools.stan
data {
  int<lower=0> J;         // number of schools 
  real y[J];              // estimated treatment effects
  real<lower=0> sigma[J]; // standard error of effect estimates 
}
parameters {
  real mu;                // population treatment effect
  real<lower=0> tau;      // standard deviation in treatment effects
  vector[J] eta;          // unscaled deviation from mu by school
}
transformed parameters {
  vector[J] theta = mu + tau * eta;        // school treatment effects
}
model {
  target += normal_lpdf(eta | 0, 1);       // prior log-density
  target += normal_lpdf(y | theta, sigma); // log-likelihood
}

That warning isn’t even correct.

To clarify, these are pedantic-mode warnings, which are printed only if --warn-pedantic is used.

And as @bgoodri says, the theta one is definitely not correct. @ariddell, would you mind reporting that one on the stanc3 GitHub? Though maybe @rybern would know what is up with this case.

I should have mentioned this. PyStan 3 has --warn-pedantic on by default. Would be great to find a solution to this issue which doesn’t require turning it off (in general, or for this specific model).

Aside from correcting the theta message, I don’t understand. Are you saying you would like a separate, less pedantic mode for PyStan that didn’t pedantically warn about lack of priors? What are the parts of pedantic mode that you do want?

I’m looking into the theta issue, it’s definitely a bug.

I did not know that --warn-pedantic mode was turned on by default. I actually didn’t think that was ever going to be the intention. It would be very easy to split up the warnings and have some of them on by default, but I don’t think it’s a good idea to have them all on all the time.

Edit:
PR submitted.

Even if pedantic mode were intended to be turned on by default, I still wouldn’t do it yet. I still consider it experimental because it really hasn’t been stress tested, and I’m sure there are still bugs. I’d feel better if I got more feedback/bug reports.

I think it is unfortunate that the canonical model generates more than zero messages which bear the label “warning”.

If there’s nothing to be done about this, then there’s nothing to be done.

Agreed.

+1, I think pedantic mode was always intended to be, well, pedantic and very conservative in what it warns about. We planned on putting in messages that might not even indicate something is wrong, but would give the user something to consider.

Low confidence but it seems like we should update the example model to add explicit priors for mu and tau - I don’t think our current best practices would suggest using implicit default uniform priors, and I have seen later editions of the model with explicit priors on mu and tau floating around. @betanalpha, @Bob_Carpenter, others - thoughts on this?

Separately, I believe we were all thinking that pedantic mode would not be the default in any interface because it is intended to be so conservative.

I’m going to disagree slightly with @ariddell, but only because of the nature of the “pedantic mode”. The pedantic mode warnings were not a set of critical warnings agreed to by the entire community but rather an aggregation of many ideas that included not just statistical advice but also style advice and other ideas that really shouldn’t be forced upon all users. Unfortunately it’s just too much for a new user to be able to understand which warnings are critical and which are more suggestive and react accordingly, and because of that I think a default pedantic-mode will be more trouble than it’s worth.

That said, the test “eight schools” model should definitely be updated to include priors on the population parameters \mu and \tau. This test model has always lagged behind best practices – for example it was stuck on a centered parameterization for waaaaaay too long – and we shouldn’t hesitate to update it.
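
For concreteness, here is a rough sketch of what such an update might look like. The specific priors (normal(0, 5) on mu, half-Cauchy(0, 5) on tau) and the file name are illustrative assumptions, not choices agreed on in this thread. The data are declared as vectors rather than arrays only so that the sketch does not depend on which array-declaration syntax a given Stan version accepts.

```stan
// schools_with_priors.stan (hypothetical file name)
data {
  int<lower=0> J;            // number of schools
  vector[J] y;               // estimated treatment effects
  vector<lower=0>[J] sigma;  // standard errors of the effect estimates
}
parameters {
  real mu;                   // population treatment effect
  real<lower=0> tau;         // standard deviation in treatment effects
  vector[J] eta;             // unscaled deviation from mu by school
}
transformed parameters {
  vector[J] theta = mu + tau * eta;  // school treatment effects
}
model {
  mu ~ normal(0, 5);         // assumed prior; the scale is illustrative
  tau ~ cauchy(0, 5);        // assumed prior; half-Cauchy because tau >= 0
  eta ~ normal(0, 1);        // non-centered parameterization
  y ~ normal(theta, sigma);  // likelihood
}
```

With explicit statements for mu and tau in the model block, the two “has no priors” messages should have nothing left to complain about, though I have not checked this against every stanc3 version.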

I have the same bug:

Warnings from stanc:
Warning:
  The parameter sigma_n has no priors.
Warning:
  The parameter sigma_o has no priors.

with the following model:

data {
  int<lower=0> N;   // number of data items
  matrix[N, N] Z;   // predictor matrix
  int y[N];      // outcome vector
  real mu_n;
  real mu_o;
  
}


parameters {
  real<lower=0> sigma_n;  // error scale
  real<lower=0> sigma_o;  // cov error scale
  vector[N] p_tilde;
}

transformed parameters {
    matrix[N, N] L_K;
    matrix[N, N] K = square(sigma_o)*Z;
    vector[N] p; 
    real sq_sigma = square(sigma_n);

    // diagonal elements
    for (n in 1:N)
        K[n, n] = K[n, n] + sq_sigma + 1e-8;

    L_K = cholesky_decompose(K);
    p = L_K * p_tilde;     
}

model {  
    sigma_n ~ normal(0,mu_n/0.8);
    sigma_o ~ normal(0,mu_o/0.8);
    p_tilde ~ normal(0,1);
    
    y ~ bernoulli_logit(p);
  
}

generated quantities{
    real sig_sqr_g = square(sigma_o);
    real sig_sqr_n = square(sigma_n);

}

pystan 3.0.0b4
pystan-jupyter 0.1b3
Linux

Only an idea: if this is really the case, I do agree that a “warning” notification is too strong. Why not simply rename them “hints”?

I thought --warn-pedantic was a “killer feature” of stanc3. I assumed it was going to be on by default (after exiting an experimental phase). The paradigmatic example of --warn-pedantic for me was telling the user that they had attached a beta distribution to a variable which was not constrained to be between 0 and 1.
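
To make that beta example concrete, here is a minimal sketch of the pattern I have in mind. The parameter names and the beta(2, 2) shape values are made up for illustration, and the exact wording of any message stanc3 prints for it is not shown here because I have not verified it.

```stan
parameters {
  real p_unbounded;              // no bounds declared
  real<lower=0, upper=1> p;      // bounds match the support of beta
}
model {
  p_unbounded ~ beta(2, 2);      // the kind of mismatch a distribution-usage check is meant to catch
  p ~ beta(2, 2);                // declared bounds and distribution support agree
}
```

The first statement compiles, but the sampler can propose values outside [0, 1], which is exactly the sort of mistake a compile-time hint is useful for.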

Reported on the Github issue tracker. Thanks.

Hmm. What do y’all think about, for now, updating the example model, while PyStan keeps pedantic mode turned on if desired? We can revisit this if we get more examples of messages that are too pedantic and don’t actually indicate we should update the model in question. Then we could either move the broadly acceptable ones into the default path or create another --extra-pedantry mode for these?

The issue here is that the distributions assigned to sigma_n and sigma_o depend on mu_n and mu_o, which are data variables. Factors that depend on data variables get classified as likelihood terms rather than prior terms.

The core issue is that mu_n and mu_o are being used as hyperparameters rather than data variables, but we don’t have a convenient way of specifying hyperparameters other than labeling them as data variables.

This is a limitation that I’m aware of but can’t really fix. One workaround is to not treat non-container types as data variables. I’m already doing this for ints because they’re so often size variables.

I think there are some warnings in pedantic mode that really aren’t suited to being on by default, and some that might be. The distribution usage warning that @ariddell mentioned could be on all the time, but e.g. the prior warning can give unintuitive results (see @Nadav’s example above).

My vote would be to do what you’re suggesting and split up the warnings into default and non-default (and maybe also an interface to turn them on and off individually), but in the meantime turn it off by default.

Thanks. Hardcoding those hyper-parameters can work for now.
In pystan2 this would mean I would have to recompile the model for every hyper-parameter. I guess that will be the case here as well.
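
For the record, the hardcoding I mean would look roughly like the sketch below. It is a stripped-down stand-in for my model, not the model itself: the normal likelihood and the literal 1.25 (what mu_n / 0.8 gives for a hypothetical mu_n = 1) are assumptions for illustration only.

```stan
data {
  int<lower=0> N;               // number of observations
  vector[N] y;                  // stand-in outcome for this sketch
}
parameters {
  real<lower=0> sigma_n;        // error scale
}
model {
  // Literal scale instead of mu_n / 0.8: the statement no longer depends
  // on a data variable, so by the explanation above it should be
  // classified as a prior term rather than a likelihood term.
  sigma_n ~ normal(0, 1.25);
  y ~ normal(0, sigma_n);       // simplified likelihood
}
```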

Also, why does --warn-pedantic stop compilation instead of just providing a warning?

Thanks!

--warn-pedantic does not stop compilation. Or it shouldn’t. If it does, it may be a bug against pystan 3 and not against stanc3.

It does. Where can I report it?

PyStan 3 issues should be reported here: https://github.com/stan-dev/pystan-next/issues

Thanks!