Stan says "no priors" in model

Stan doesn’t see any priors here, but I clearly do have a beta prior on theta…

data {
  int<lower=0> N; // number of observations
  int<lower=0, upper=100000> k[N]; // observed number of successes
  int<lower=0, upper=100000> n[N]; // observed number of trials  
  real<lower=0, upper=10> alpha; // hyperparameter for beta prior
  real<lower=0, upper=10> beta; // hyperparameter for beta prior
}

parameters {
  real<lower=0, upper=1> theta; // probability of success
}

model {
  // priors
  theta ~ beta(alpha, beta);
  
  // likelihood
  for (i in 1:N){
      k[i] ~ binomial(n[i], theta);
  }
  
}
"""

And yet I get this warning: "Warning: The parameter theta has no priors. This means either no prior is provided, or the prior(s) depend on data variables. In the later case, this may be a false positive."


This warning is produced by "pedantic mode", which is known to have false positives. In particular, it doesn't count a ~ statement that involves variables from the data block as a prior, even though such a statement clearly can be one. You can safely ignore this warning.
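For example, a prior with literal hyperparameters should be recognized, so a line like

theta ~ beta(2, 2);  // literals only: pedantic mode counts this as a prior

should produce no warning (the 2, 2 here is just an illustration), whereas theta ~ beta(alpha, beta) with alpha and beta coming from the data block trips the heuristic.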

Are you using PyStan? I believe that is the only interface which enables pedantic mode by default.

Indeed I am using PyStan. I ignored this warning and plowed ahead, getting this error:

RuntimeError: Exception during call to services function:
ValueError("Initialization failed. Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: binomial_lpmf: Successes variable is 1199, but must be in the interval [0, 1051]
(in '/tmp/httpstan_jd0xv6_k/model_i7bmbwl7.stan', line 20, column 6 to column 35)
Rejecting initial value: ...")

traceback:
File "/home/ec2-user/anaconda3/envs/python3/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
File "/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/httpstan/services_stub.py", line 185, in call
    future.result()
File "/home/ec2-user/anaconda3/envs/python3/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)

Stan seems to believe that the successes variable maxes out at 1051, even though I declared an upper bound of 100,000, so I'm a little confused.

The reason I started with the build warning is that I thought it was upstream of this sampling error.
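(For reference, the warning shows up when I call stan.build, while the error above comes from posterior.sample, so the two come from different stages. A minimal sketch of my calls, assuming PyStan 3, with program_code and data as placeholders for the model string and data dict:

import stan

posterior = stan.build(program_code, data=data)  # pedantic-mode warnings are printed here
fit = posterior.sample(num_chains=4)             # initialization/rejection errors surface here
)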

I don’t see where you’ve defined this in the model. You state that n and k are both at most 100,000, but that doesn’t imply any relationship between k and n.

Is the error due to a relationship between k and n? I interpreted the error as simply a problem with k, one that the 100k upper limit should cover.

And I’ve validated that k[i] < n[i] for all i in observations.

The error you’re getting is thrown when the first argument to binomial_lpmf is greater than the second, by this line: math/binomial_lpmf.hpp at 6cd15b88d16dbace9ee4ce9d27997901159c44e7 · stan-dev/math · GitHub. In the vectorized call, each k[i] is checked against the interval [0, n[i]], which is why the message reports 1051 (a value of n) rather than your declared upper bound of 100,000. So it seems like an issue with the input data rather than the model code.

You can encode the constraint k[i] <= n[i] directly in the model by changing your data block:

  int<lower=0, upper=100000> n[N]; // observed number of trials
  int<lower=0, upper=n> k[N]; // observed number of successes
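Putting it together, the revised data block might look like this (a sketch; note that n must be declared before k so the bound is in scope):

data {
  int<lower=0> N;                   // number of observations
  real<lower=0, upper=10> alpha;    // hyperparameter for beta prior
  real<lower=0, upper=10> beta;     // hyperparameter for beta prior
  int<lower=0, upper=100000> n[N];  // observed number of trials
  int<lower=0, upper=n> k[N];       // observed number of successes
}

With this constraint in place, Stan should reject any input where k[i] > n[i] as soon as the data is read, with a message that points at the offending data value rather than at binomial_lpmf.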

Yeah, no rows were returned when I checked df[df['k'] > df['n']]. Data seems fine! Maybe using a Jupyter notebook on AWS is the issue. I haven’t had a problem before, but who knows?
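It might still be worth ruling out a mismatch between the DataFrame and what actually gets passed to Stan: the df check can pass while the arrays handed to stan.build are misaligned, e.g. built from a different sort or filter of the data. A sketch of asserting on the exact objects passed in (df is the DataFrame from this thread; the alpha/beta values are placeholders):

import numpy as np

k = df['k'].to_numpy()
n = df['n'].to_numpy()
assert len(k) == len(n)
assert np.all(k <= n), "some k[i] exceeds its n[i]"

data = {"N": len(k), "n": n, "k": k, "alpha": 2.0, "beta": 2.0}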

Can we just remove this rule? We very often have users who define their prior parameters in the data block, so it’s going to produce a massive number of false positives for all of those models.

I believe @rybern had a suggestion for how to improve it, but I can’t find it now. If I recall correctly, having some way of differentiating between prior and likelihood was important for many of the pedantic-mode analyses, and “does it touch data” was chosen as the criterion.

That’s right - Pedantic Mode guesses what’s a prior based on what touches variables in data, so it’ll get confused when data variables are actually hyperparameters like this. Ideally we’d have something like an annotation or separate block to distinguish ‘true’ data variables, but alas.

One option would be to mention this issue in the warning message.


The current text (… or the prior(s) depend on data variables. In the later case, this may be a false positive.) was an attempt at exactly that, but it might still come off too strongly as a warning.


Oh nice, I didn’t notice! It makes sense to me to keep the warning with this wording or similar. We could consider making it even softer by saying “The parameter theta may have no priors.”