Hello,
I’m trying to build up a GLM step by step, with a target variable that follows a quasi-Poisson distribution. I’m not sure whether this distribution is available in Stan, so I’m starting with the Poisson distribution. I’m getting the error below.
Semantic error in 'C:/Users/JORDAN.HOWELL.GITDIR/PycharmProjects/pythonProject/marine_bayes/model/marine_bayes.stan', line 20, column 4 to column 26:
-------------------------------------------------
18: a[n] = mu + credit_model_normalized[n]*credit_beta;
19: }
20: h_pp ~ poisson_log(a);
^
21: }
-------------------------------------------------
Ill-typed arguments to '~' statement. No distribution 'poisson_log' was found with the correct signature.
My model is as follows:
data {
  int<lower=0> N; // number policy
  real<lower=0> h_pp[N]; // hull pure premium
  vector[N] credit_model_normalized; // normalized credit
}
parameters {
  real<lower=0> mu;
  real credit_beta; // credit coefficient
}
model {
  mu ~ normal(0,3);
  credit_beta ~ normal(0,5);
  vector[N] a;
  for (n in 1:N) {
    a[n] = mu + credit_model_normalized[n]*credit_beta;
  }
  h_pp ~ poisson_log(a);
}
Is this because I’m using the wrong distribution, or is it another issue? If it’s a distribution issue, is it possible to run a Stan model with a quasi-Poisson distribution?
The issue is that poisson distributions only accept integer outcomes, and so aren’t compatible with the real-type outcome h_pp.
Thank you for that confirmation. This is an insurance loss data set where losses follow a quasi-Poisson distribution. Does Stan have a quasi-Poisson option?
Unfortunately not. Your closest option for overdispersed count data in Stan would be the negative binomial.
I tried the below from the docs…
data {
  int<lower=0> N; // number policy
  real<lower=0> h_pp[N]; // hull pure premium
  vector[N] credit_model_normalized; // normalized credit
}
parameters {
  real<lower=0> mu;
  real credit_beta; // credit coefficient
}
model {
  mu ~ normal(0,3);
  credit_beta ~ normal(0,5);
  vector[N] a;
  for (n in 1:N) {
    a[n] = mu + credit_model_normalized[n]*credit_beta;
  }
  h_pp ~ neg_binomial(a);
}
I get the following:
Ill-typed arguments to ‘~’ statement. No distribution ‘neg_binomial’ was found with the correct signature.
I’ve also tried neg_binomial_2 and neg_binomial_2_lpmf and get the same error. What is the correct command to use in the model block?
EDIT: @maxbiostat has edited this post for syntax highlighting.
As I mentioned earlier, methods for count data are not compatible with outcomes defined as real. In your data block, change the definition of h_pp to int:
int<lower=0> h_pp[N]; // hull pure premium
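For reference, here is a minimal sketch of the full model with that change applied. It assumes the outcome really is a count (for example, a number of claims rather than a dollar amount). It uses neg_binomial_2_log so the linear predictor sits on the log scale, and it adds an overdispersion parameter phi with an illustrative prior; neither phi nor its prior comes from the original model. Note that the negative-binomial distributions take two parameters, so a one-argument call such as neg_binomial(a) fails the signature check regardless of the outcome type.

```stan
data {
  int<lower=0> N;                     // number of policies
  int<lower=0> h_pp[N];               // outcome, now integer counts
  // (recent Stan versions write this as: array[N] int<lower=0> h_pp;)
  vector[N] credit_model_normalized;  // normalized credit
}
parameters {
  real mu;                            // intercept on the log scale (lower bound dropped here)
  real credit_beta;                   // credit coefficient
  real<lower=0> phi;                  // overdispersion (assumed, not in the original model)
}
model {
  // linear predictor on the log scale, vectorized instead of looping
  vector[N] a = mu + credit_model_normalized * credit_beta;
  mu ~ normal(0, 3);
  credit_beta ~ normal(0, 5);
  phi ~ exponential(1);               // illustrative prior only
  // mean exp(a), variance exp(a) + exp(a)^2 / phi
  h_pp ~ neg_binomial_2_log(a, phi);
}
```

Replacing the last statement with h_pp ~ poisson_log(a); and dropping phi gives the plain Poisson version of the same model.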
Ok. I see now that I miscommunicated. When modeling insurance losses, the losses can be modeled via a Poisson distribution but are real numbers (i.e. the dollar amount of the loss), so the actuarial literature, which works with frequentist models, says to use a quasi-Poisson to model the data.
A large part of that literature also says to use a Tweedie distribution, since there is a large number of zeros (people who have made no claims).
Is there anything like a Tweedie or other zero-inflated distribution that I can use that has the signature of a Poisson but accepts real numbers?
Ah I see what you mean, unfortunately not. A user has previously implemented their own tweedie distribution in Stan, which might be applicable in this case: Tweedie Likelihood (compound Poisson-gamma) in Stan
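To give a rough idea of what such a user-defined likelihood can look like, here is a minimal sketch of a compound Poisson-gamma density written as a Stan function. It is not the implementation from the linked post. The function name tweedie_cpg_lpdf, the truncation input M, and the priors on phi and p are all hypothetical choices made for illustration. The sum over the unobserved number of claims is infinite in principle, and this sketch truncates it at M terms, so M has to be large enough for the data at hand.

```stan
functions {
  // Log density of a compound Poisson-gamma (Tweedie with 1 < p < 2) outcome.
  // mu: mean, phi: dispersion, p: power, M: truncation of the sum over claim counts.
  real tweedie_cpg_lpdf(real y, real mu, real phi, real p, int M) {
    real lambda = pow(mu, 2 - p) / (phi * (2 - p));    // Poisson rate of the claim count
    real alpha = (2 - p) / (p - 1);                    // gamma shape per claim
    real beta = 1 / (phi * (p - 1) * pow(mu, p - 1));  // gamma rate per claim
    vector[M] lp;
    if (y == 0) {
      return -lambda;                                  // P(no claims) = exp(-lambda)
    }
    for (n in 1:M) {
      // n claims, whose total amount is gamma(n * alpha, beta)
      lp[n] = poisson_lpmf(n | lambda) + gamma_lpdf(y | n * alpha, beta);
    }
    return log_sum_exp(lp);
  }
}
data {
  int<lower=0> N;                     // number of policies
  vector<lower=0>[N] h_pp;            // hull pure premium, real-valued with exact zeros
  vector[N] credit_model_normalized;  // normalized credit
  int<lower=1> M;                     // truncation point (hypothetical tuning input)
}
parameters {
  real mu;                            // intercept on the log scale
  real credit_beta;                   // credit coefficient
  real<lower=0> phi;                  // dispersion
  real<lower=1, upper=2> p;           // Tweedie power
}
model {
  vector[N] a = mu + credit_model_normalized * credit_beta;
  mu ~ normal(0, 3);
  credit_beta ~ normal(0, 5);
  phi ~ exponential(1);               // illustrative prior only
  for (n in 1:N) {
    h_pp[n] ~ tweedie_cpg(exp(a[n]), phi, p, M);
  }
}
```

The mean of the implied distribution is lambda * alpha / beta, which works out to mu under this parameterization, so exp(a[n]) plays the same role as the mean in a log-link GLM.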