Mixture Bayesian Poisson Regression Model

Actually, with the latest Stan additions you might just be able to push your model into a feasible region with all the data points, using map_rect for the likelihood loop (with MPI and/or threading) or offloading the x * beta multiplication to the GPU (since 2.19, not yet in rstan). I fear, however, that this might be non-trivial to get working, as the documentation is not quite there yet.
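
For reference, a minimal map_rect sketch for a plain (non-mixture) Poisson regression likelihood; shard_lp is an illustrative name, and the sketch assumes the outcomes and flattened covariates have been packed into 2D arrays x_i and x_r (one row per shard) in transformed data:

functions {
  // log likelihood of one shard: phi packs alpha followed by beta,
  // x_i holds this shard's outcomes, x_r its covariates flattened row-major
  vector shard_lp(vector phi, vector theta, real[] x_r, int[] x_i) {
    int n = num_elements(x_i);
    int K = num_elements(phi) - 1;
    real alpha = phi[1];
    vector[K] beta = segment(phi, 2, K);
    real lp = 0;
    for (j in 1:n) {
      row_vector[K] xj = to_row_vector(x_r[((j - 1) * K + 1):(j * K)]);
      lp += poisson_log_lpmf(x_i[j] | alpha + xj * beta);
    }
    return [lp]';
  }
}

The model block would then contain, with theta declared as an array of empty per-shard parameter vectors:

target += sum(map_rect(shard_lp, append_row(alpha, beta), theta, x_r, x_i));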

I don’t understand how I can incorporate it.

poisson_lpmf(y[i] | mu[1]) becomes poisson_log_lpmf(y[i] | alpha + x*beta)

Wait, something is off I think. Should the original model not be

poisson_lpmf(y[i] | mu[1, i])

and the new one

poisson_log_lpmf(y[i] | alpha + x[i] * beta) ?

I am surprised that the original model worked given the mismatch in dimensions between y[i] and mu[1]. Are you recovering the parameters if you start from simulated data?

The original likelihood:

// Likelihood

  for (i in 1:N) 
      target += log_mix(pi[i],
                  poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
                  poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));

The dimensions of y[i] and mu[1] are similar, [N], and as you said,

poisson_lpmf(y[i] | mu[1]) becomes poisson_log_lpmf(y[i] | alpha + x*beta)

What about the log1m_exp(poisson_lpmf(0 | mu[1])) part?
Will it have the form log1m_exp(poisson_log_lpmf(0 | alpha + x*beta))?

That is not correct I think. y is of length N but y[i] is of length 1.

I understand your concern. What I wanted to say is that y and mu have the same dimension. But to my understanding, you are right about how poisson_lpmf(y[i] | mu[1]) was working.
I will now use

for (i in 1:N)
      target += log_mix(pi[i],
                  poisson_log_lpmf(y[i] | alpha + x*beta) - log1m_exp(poisson_log_lpmf(0 | alpha + x*beta)),
                  poisson_log_lpmf(y[i] | alpha + x*beta + ci) - log1m_exp(poisson_log_lpmf(0 | alpha + x*beta + ci)));

Therefore, will your concern have an effect on this analysis? I think the above expression is similar to what you stated before, poisson_lpmf(y[i] | mu[1, i]).

I also have one question. Is there any option that can express mu[1] < mu[2]?
FYI: in my case I used mu[2] = mu[1] + ci.

I think you are fine with this approach. For more complicated models, you could use the ordered type.
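
For example, a minimal sketch, assuming the locations are modeled on the log scale as in the poisson_log_lpmf version discussed below:

parameters {
  ordered[2] log_mu;  // the ordered type guarantees log_mu[1] < log_mu[2]
}

Your mu[2] = mu[1] + ci construction with ci > 0 achieves the same ordering, so it is fine for this model.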

I think what you need is poisson_lpmf(y[i] | alpha + x[i] * beta), otherwise you are using all of the data x for every single outcome y[i]. Maybe that is the goal in your analysis, but it implies that you have the same expected value for each y[i].

I don’t understand this statement.

Let me justify the model. It is a mixture of truncated Poissons (it was zero-truncated, but now I have changed it to one-truncated). Each observation has probability pi[1] of being in the first group and pi[2] of being in the second group. Then, after I get pi from the posterior, I will categorize each observation according to pi's value (e.g. below 0.6 it will be categorized in group 1, and so on).
As I mentioned above, the model has changed to 1-truncated. Therefore, can I write it like this?

 for (i in 1:N) 
  target += log_mix(pi[i],
                    poisson_lpmf(y[i]|mu[1,i])-log1m_exp(poisson_lpmf(0|mu[1,i]))-log1m_exp(poisson_lpmf(1|mu[1,i])),
                    poisson_lpmf(y[i]|mu[2,i])-log1m_exp(poisson_lpmf(0|mu[2,i]))-log1m_exp(poisson_lpmf(1|mu[2,i])));

I think this should work for well behaved data.
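
One caveat on the normalization: subtracting two separate log1m_exp terms divides by (1 - P(0)) * (1 - P(1)), while P(Y = y | Y > 1) calls for dividing by 1 - P(0) - P(1). A minimal sketch of the latter, using log_sum_exp on the quantities already in your loop:

for (i in 1:N) {
  real lnorm1 = log1m_exp(log_sum_exp(poisson_lpmf(0 | mu[1, i]),
                                      poisson_lpmf(1 | mu[1, i])));
  real lnorm2 = log1m_exp(log_sum_exp(poisson_lpmf(0 | mu[2, i]),
                                      poisson_lpmf(1 | mu[2, i])));
  target += log_mix(pi[i],
                    poisson_lpmf(y[i] | mu[1, i]) - lnorm1,
                    poisson_lpmf(y[i] | mu[2, i]) - lnorm2);
}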

I would also use poisson_log_lpmf instead of poisson_lpmf, tighter priors, and define mu in the model block. You could still run into the memory issues that Martin explained above.

Would you please show me how I can write it in my case?

data {
  int<lower=0> N;     // number of data points
  int<lower=1> y[N];  // observations
  matrix[N, 4] x;     // covariates
  vector[N] z;        // shape parameter
  real<lower=0> Az;   // shape parameter
}

parameters {
  real<lower=0> ci;                // positive offset added to mu[1] to get mu[2]
  real<lower=0> d;                 // scales the shape parameters of the prior on pi
  vector[4] beta;
  real alpha;
  vector<lower=0, upper=1>[N] pi;  // mixing proportions
}

model {
  vector[N] mu[2];    // log-scale locations of mixture components (local variables cannot have constraints)
  mu[1] = alpha + x*beta;
  mu[2] = mu[1] + ci;
  // Priors
  beta ~ normal(0, 2.5); // adapt to your specific knowledge
  alpha ~ normal(0,2.5);
  d ~ normal(0,5);    
  ci ~ normal(0, 2.5);
  pi ~ beta(z*d,Az*d); // a beta(100, 100) places practically all probability mass between .4 and .6
  
  // Likelihood
  for (i in 1:N) 
    target += log_mix(pi[i],
                      poisson_log_lpmf(y[i] | mu[1, i]) - log1m_exp(poisson_log_lpmf(0 | mu[1, i])),
                      poisson_log_lpmf(y[i] | mu[2, i]) - log1m_exp(poisson_log_lpmf(0 | mu[2, i])));
  
}

I haven’t tested it but this is a sensible implementation if I understand you correctly.

Thank you @stijn for your constructive comments. The code is working now. Let me show you some values of y and mu[1]. Do you think that the estimated Poisson rate parameter is healthy (it seems underestimated)?

     y  mu[1]
1  104   9.17
2   83  10.77
3   78   9.07
4   60   7.39
5   45   4.82
6   45   8.69
7   41   4.15
8    2   0.70
9    2   0.73
10   2   0.41
11   2   0.84
12   2   0.62
13   2   0.96
14   2   0.73
15   2   0.74
16   2   1.22

Therefore, would you please give me some suggestions about it?

Is it OK if I consider truncation at 1 (y > 1)?

for (i in 1:N) 
        target += log_mix(pi[i],
                    poisson_lpmf(y[i]|mu[1,i])-log1m_exp(poisson_lpmf(0|mu[1,i]))
                                              -log1m_exp(poisson_lpmf(1|mu[1,i])),
                    poisson_lpmf(y[i]|mu[2,i])-log1m_exp(poisson_lpmf(0|mu[2,i]))
                                              -log1m_exp(poisson_lpmf(1|mu[2,i])));

And please look at my concern about the estimated parameters here.

One thing to note:
if you have a simulator that can simulate new data for you exactly according to your model, you can easily check whether the Stan model recovers the parameters from simulated data. If not, your simulator and model do not match and at least one of them is wrong. This should help you debug your model by yourself.

Note that mu is on the log scale, so you should be comparing y to exp(mu[1]). With that, it actually seems that the estimates for rows 8 - 16 are roughly OK, but some of the initial rows seem grossly overestimated, e.g. exp(9.17) ≈ 9604.

In my previous post I was not speaking about using generated quantities for posterior predictive checks (although that is often useful as well); I meant writing a simulator to create fake/simulated data in R/Python/… where you know the values of alpha, beta, pi, etc. (because you simulated them from the priors).

In any case, this topic is taking more time than I am willing to invest in it, and I will therefore not interact here any more. It seems to me that your aims are currently higher than your skills with Stan - which is good and you should keep trying! - but it also means that you would benefit from more intensive mentoring than I am able to sustainably provide you for free on this forum. You may consider finding an official mentor, or reading up on statistics/model development in general and experimenting with simpler problems to improve your grasp on the foundations of Bayesian inference. My feeling (though possibly wrong) is that those foundations might be necessary for your current problem, and they would increase your independence in problem solving in general. I wish you the best of luck with further learning and hope my suggestions so far have helped you learn and grow.

I understand your feelings, and I thank you very much for what you Stan members have done so far. I have grasped a lot of things, and I will not raise more questions here.

Thank you, all of you.

One more note: my last post had a narrow scope - it spoke about this particular topic and about myself - I do not and cannot speak on behalf of other members of the community, who might feel otherwise. And from my perspective, you are certainly welcome to ask further questions on this forum as you move forward and discover specific obstacles.

Just to add to what Martin said: I think he’s right that this topic has run its course. I think it would be good if you could familiarize yourself a bit more with Stan and the language that people use to describe the different steps in the analysis process before doing your analysis. I tried to look for an example of a Poisson case study; there are some floating around on the internet, but I could not directly find one that does everything.

Ideally you want to be able to

  1. Simulate data from a simple Poisson distribution (outside of Stan, probably).
  2. Use Stan to estimate the parameters from your simulated data, and be able to diagnose bugs and warnings.
  3. Generate posterior predictions (you need the _rng functions in Stan to do this; see the sketch below).
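
For step 3, a minimal generated quantities sketch, assuming the model block posted earlier in this thread (y_rep is an illustrative name; the while loop is one way to respect the zero truncation):

generated quantities {
  int y_rep[N];  // posterior predictive draws
  for (i in 1:N) {
    // pick a mixture component with probability pi[i], then draw on the log scale
    real log_mu = alpha + x[i] * beta
                  + (bernoulli_rng(pi[i]) == 1 ? 0 : ci);
    y_rep[i] = poisson_log_rng(log_mu);
    while (y_rep[i] == 0)  // reject zeros to match the truncated likelihood
      y_rep[i] = poisson_log_rng(log_mu);
  }
}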

I hope this does not come across as too patronizing. I am happy to answer future separate questions if they pop up on Discourse. It’s just that debugging step by step on Discourse is not the most useful way of learning Stan.

OK, dear Stijn.
Thank you, I agree with your comment.