# Poisson likelihood with normal priors

Hi everyone,

I have been working on an example in Stan (code below), and I receive some warnings.

```stan
data {
  int<lower=0> J;
  real<lower=0> Y2_Npop[J];
  int<lower=0> Y2_ncases[J];
}
parameters {
  vector[J] theta1;
  real theta2_1;
  real<lower=0> theta2_2;
}
transformed parameters {
  real lambda[J];
  for (i in 1:J) {
    lambda[i] = (Y2_Npop[i] / 1000) * exp(theta2_1 + theta1[i] * theta2_2);
  }
}
model {
  target += normal_lpdf(theta2_1 | 0, 1);
  target += normal_lpdf(theta2_2 | 0, 1);
  for (i in 1:J) {
    target += poisson_lpmf(Y2_ncases[i] | lambda[i]);
  }
}
```



This is the message I receive:

```
Warning messages:
1: In system(paste(CXX, ARGS), ignore.stdout = TRUE, ignore.stderr = TRUE) :
2: There were 4000 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
3: Examine the pairs() plot to diagnose sampling problems
4: The largest R-hat is 2.59, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
5: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
6: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
```


I'm not sure if this is due to the data or if my Stan code is inefficient.

May I ask if there is a more efficient way to code this in Stan? Thank you!

Looking at this discussion, the first warning may be related to the installation.

Have you tried to fit the model with a simulated dataset? When I try a new model, my approach is to fit it with simulated data (with varying sample sizes). If Stan works properly there, it means the problem is not Stan itself.

P.S.: If I am correct, a specification like `y ~ dist()` is equivalent to `target += dist_lpdf(y | ...)` up to an additive constant (the sampling-statement form drops constant terms from the target). However, the former is more readable to me.
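For concreteness, here is a minimal sketch of the two equivalent specifications, using the `theta2_1` parameter from the model above (untested):

```stan
model {
  // Sampling-statement form: drops additive constants from the target.
  theta2_1 ~ normal(0, 1);

  // Explicit form: keeps the full normalized log density. Use one form
  // or the other, not both, or the term is counted twice.
  // target += normal_lpdf(theta2_1 | 0, 1);
}
```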

I don't think it is an installation issue, because it works fine with other models!

Yes, I have tried with a dataset and it works; it's just that the density of theta is exponentially big.

Hey there! I think you need a prior for theta1, and by the looks of it it should be a standard normal prior. Also, I think you can get rid of all the loops in your model code. I think this should work (but I did not test it):

```stan
data {
  int<lower=0> J;
  vector<lower=0>[J] Y2_Npop;
  int<lower=0> Y2_ncases[J];
}
generated data {
  // computations in this block are only performed once
  vector[J] offset = log(Y2_Npop / 1000); // taking the log to be able to pull the offset inside the exp() function
}
parameters {
  vector[J] theta1;
  real theta2_1;
  real<lower=0> theta2_2;
}
transformed parameters {
  vector[J] lambda = exp(offset + theta2_1 + theta1 * theta2_2); // you can pull the offset into the exp()
}
model {
  target += normal_lpdf(theta1 | 0, 1);
  target += normal_lpdf(theta2_1 | 0, 1);
  target += normal_lpdf(theta2_2 | 0, 1);
  target += poisson_lpmf(Y2_ncases | lambda);
}
```


Cheers,
Max

• Apparently, I don't think there is such a thing as `generated data` in Stan? (Correct me if I'm wrong.) But I have transformed the data in R itself, so that takes care of that.
• I have tried your code, and apparently the model does not contain samples.
• Sorry, my priors for theta2 are N(0, 1000) instead of N(0, 1). And as for theta1, the prior is Beta(1, 1), which is equivalent to a uniform prior, hence I did not put the prior in.
```stan
data {
  int<lower=0> J;
  vector[J] Y2_Npop;
  int<lower=0> Y2_ncases[J];
}
parameters {
  vector[J] theta1;
  real theta2_1;
  real theta2_2;
}
transformed parameters {
  vector[J] lambda = exp(Y2_Npop + theta2_1 + theta1 * theta2_2); // offset (already logged in R) pulled into the exp()
}
model {
  target += normal_lpdf(theta2_1 | 0, sqrt(1000));
  target += normal_lpdf(theta2_2 | 0, sqrt(1000));
  target += poisson_lpmf(Y2_ncases | lambda);
}
```


```
Stan model 'hpv_model2c' does not contain samples.
```

(the same message is printed four times)

Yes, you are right, I got that wrong! The block should have been called `transformed data`.

IMO the N(0, 1) priors make more sense, since you are working on the log scale. On the log scale, even N(0, 5) priors are really uninformative. Stan will probably also work with N(0, sqrt(1000)) priors, but it is really not recommended: a one-standard-deviation draw of ±31.6 on the log scale corresponds to multiplying the rate by a factor of up to exp(31.6) ≈ 5e13. Also, see here (although log-scale priors are not specifically mentioned).

Previously you had `theta1 * theta2_2` where `theta2_2` had a lower bound, so I thought you meant to model a random effect in a non-centered parametrization; that's why I suggested a standard normal prior for theta1. If you want to impose a uniform prior on the theta1 parameters, you have to constrain them by declaring them as `vector<lower=0, upper=1>[J] theta1;` in the parameters block (then you don't have to explicitly add the Beta(1, 1) prior).
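A minimal sketch of the two options described here, using the identifiers from the thread (untested):

```stan
parameters {
  // Option 1: non-centered random effect; pair with
  //   target += normal_lpdf(theta1 | 0, 1);
  // in the model block.
  // vector[J] theta1;

  // Option 2: implicit Uniform(0, 1) prior from the constraint alone;
  // no explicit beta(1, 1) statement is needed in the model block.
  vector<lower=0, upper=1>[J] theta1;
}
```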

Cheers,
Max

I meant: have you tried with a simulated dataset where you know the true values of the parameters? If, with a simulated dataset of large (enough) sample size, you still get a \theta that is "exponentially big", then I think \theta might not be identifiable.
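As a sketch of that check (the parameter values 0.5 and 0.3 below are my assumptions, not from the thread), one could simulate case counts from the model at known parameter values, then fit the model to the simulated counts and see whether those values are recovered:

```stan
// Simulation sketch (untested): draw fake case counts from the model at
// known parameter values, then fit the model above to y_sim and check
// whether theta2_1 and theta2_2 are recovered.
data {
  int<lower=0> J;
  vector<lower=0>[J] Y2_Npop;
}
generated quantities {
  int y_sim[J];
  for (i in 1:J) {
    real theta1_true = normal_rng(0, 1);  // assumed standard normal random effect
    real lambda_true = (Y2_Npop[i] / 1000)
                       * exp(0.5 + theta1_true * 0.3); // assumed theta2_1 = 0.5, theta2_2 = 0.3
    y_sim[i] = poisson_rng(lambda_true);
  }
}
```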

Hi Max,

I think this solved it: `vector<lower=0, upper=1>[J] theta1;`

Thank you so much!
