# Mixture Bayesian Poisson Regression Model

I don’t think I can really follow this. Could you post your current Stan code and describe where the data come from and what inferences you are trying to make, which questions are you trying to answer?

my current code is the following:

``````data {

int<lower=0> N;           // number of data points
int <lower=1> y[N];         // observations
matrix[N,4] x;               // Covariates
real<lower=0> mi[N];
}

transformed data {
real<lower=0> Ami;
Ami=mean(mi);
}

parameters {
real<lower=0> ci; // any Positive number added on the mu[1] to get mu[2]
vector[4] beta;
real alpha;
vector<lower=0, upper=1>[N] pi;  // mixing proportions; need to have pi's at observation level

}

transformed parameters{
vector<lower=0>[N] mu[2];    // locations of mixture components; need to have mu's for each observation
mu[1] = exp(alpha + x*beta);
mu[2] = exp((alpha + x*beta)+ci);

}

model {

// Priors
beta ~ normal(0, 1e1);
alpha ~ normal(0,1e1);
ci ~ normal(0, 1e1);
pi ~ beta(mi,Ami);     // how can we change to Dirichlet distribution using mi and mci?
// Likelihood

for (i in 1:N)
target += log_mix(pi,
poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));

}
``````

When I run the above code, I got this error

``````SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:
log_mix(vector, real, real)
Available argument signatures for log_mix:
log_mix(real, real, real)
log_mix(real[], real[])
log_mix(real[], vector)
log_mix(real[], row vector)
log_mix(real[], vector[])
log_mix(real[], row vector[])
log_mix(vector, real[])
log_mix(vector, vector)
log_mix(vector, row vector)
log_mix(vector, vector[])
log_mix(vector, row vector[])
log_mix(row vector, real[])
log_mix(row vector, vector)
log_mix(row vector, row vector)
log_mix(row vector, vector[])
log_mix(row vector, row vector[])

error in 'modeld8755c72_ff542624806624524b7b715fc8665283' at line 56, column 82
-------------------------------------------------
54:       target += log_mix(pi,
55:                   poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
56:                   poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));
^
57:
-------------------------------------------------
Error in stanc(file = file, model_code = model_code, model_name = model_name,  :
failed to parse Stan model 'ff542624806624524b7b715fc8665283' due to the above error.
In file.remove(tf) :
cannot remove file 'C:\Users\521-04\AppData\Local\Temp\Rtmpc35uis\filed82a886c34', reason 'Permission denied'
``````

I understand the problem here that the log_mix is real and it will not consider vector mixing proportion. But I need the mixing proportion at observation level and how can i handle this.

FYI: When I change the mixing proportion to `real<lower=0, upper=1> pi` it was working properly and can get the `N` amount of `mu[1] and mu[2]` , but could not get the mixing proportion at observation level.

You probably want

``````target += log_mix(pi[i], ...
``````
2 Likes

Please look the following stan code. I have checked this code with small data set and it is ok. But, when I used my complete data (around 400,000 cases) it will not start the estimation yet (it’s the 5th days). Therefore, would you please give any comment or idea how can I make it fast.

``````data {
int<lower=0> N;           // number of data points
int <lower=1> y[N];         // observations
matrix[N,4] x;               // Covariates

}

parameters {
real<lower=0> ci;                         // any Positive number added on the mu[1] to get mu[2]
real<lower=0> d;                        // will have uniform distribution that balances the shape parameters
vector[4] beta;
real alpha;
vector<lower=0, upper=1>[N] pi;           // mixing proportions
}

transformed parameters{
vector<lower=0>[N] mu[2];    // locations of mixture components
mu[1] = exp(alpha + x*beta);
mu[2] = exp((alpha + x*beta)+ci);

}

model {

// Priors
beta ~ normal(0, 1e3);
alpha ~ normal(0,1e3);
d ~ uniform(0,10000);
ci ~ normal(0, 1e3);

// Likelihood
for (i in 1:N)
target += log_mix(pi[i],
poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));

}
``````

It keeps this message for the last five days

``````SAMPLING FOR MODEL 'af252afcc7248409611d59094236d80a' NOW (CHAIN 1).
``````

In no particular order:

• When you have `d ~ uniform(0, 10000)`, you have to declare `real<lower=0, upper = 10000> d;` The model statement is then no longer necessary.
• In general, the Stan team discourages uniform priors. For `d` you are probably better of with a gamma or exponential style prior.
• Your priors are too wide. Especially alpha, beta, and ci. `exp(400)` overflows and `exp(-400)` underflows. Scaling of Log-likelihood value in Hamiltonian Monte Carlo
• Irrespective of the numerical issues, I would try to use more informative priors. What is the scale, you alpha, beta, and ci to be on? Can you standardize the x’s to make the betas on a scale closer to normal ~ (0, 1)?
• You can work on the log scale with `poisson_log_lpmf` which should help with numerical stability because you avoid the mu = exp() step.
• I don’t understand the poisson distribution you are using but I wonder whether there are no ways to simplify the distributions in log_mix. mu[1] and mu[2] only differ by term ci. The poisson is also relatively easy to think about at the log scale.

400 000 cases is probably just too much for Stan. While there might be some improvements to speed available, I don’t think you are likely to get Stan to work with that. The problem is not primarily the number of data points (although that is big), but the number of parameters. In a small test I did elsewhere sampling from 100 000 i.i.d. normal, which is probably the easiest model you can think of, takes 40 minutes on my PC. Your model is both more than 4 times larger and waaay more complex. A simple hierarchical model with 25 000 parameters took me about 1.5 hours, and the same model with 100 000 parameters didn’t complete in 16 hours, so that’s just to provide some context.

In my experience it is however OK to split such large datasets randomly into smaller chunks that are manageable for Stan, you just need to check that the estimates of the shared parameters are similar across the chunks (which they will be unless there is some problem with your model).

Some stuff you can do to increase speed:

• Use `poisson_log_lpmf` or `poisson_log_glm_lpmf` to avoid the exponentiation
• Put tighter priors on everything (especially the `uniform` prior on `d` might cause some trouble), I would be surprised if anything wider than `normal(0, 5)` is actually needed for anything but the intercept, provided your predictors are normalized (which is a good idea anyway).
• Check that you have enough memory - once you sart swapping to disk, you performance is dead. Note that RStan will allocate memory to store all the posterior draws from your model (which I would guess to be like 18GB/chain in your case).You can reduce that by 66% by moving `mu` from `transformed parameters` to `model`. You can use CmdStan to avoid storing draws in memory at all. You may also consider running fewer chains in parallel.
• Store the result of `alpha + x*beta`, and reuse it for `mu[2]` (or preferrably use `poisson_log_lpmf` and avoid storing `mu` at all).
• Provide good inits
2 Likes

Actually with the latest Stan additions, you just might be able to push your model to feasible region with all the datapoints, using `map_rect` for the likelihood loop (with MPI and/or threading) or offloading the `x * beta` multiplication to GPU (since 2.19, not yet in rstan). I however fear that this might be non-trivial to get working as the documentation is not quite there yet.

1 Like

I don’t understand how can I considered it.

`poisson_lpmf(y[i] | mu[1])` becomes `poisson_log_lpmf(y[i] | alpha + x*beta)`

1 Like

Wait, something is off I think. Should the original model not be

`poisson_lpmf(y[i] | mu[1, i])`

and the new one

`poisson_log_lpmf(y[i] | alpha + x[i] * beta)` ?

I am surprised that the original model worked given the mismatch in dimensions between `y[i]` and `mu[1]`. Are you recovering the parameters if you start from simulated data?

1 Like

the original likelihood

``````// Likelihood

for (i in 1:N)
target += log_mix(pi[i],
poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));
``````

the dimension of `y[i] and mu[1]` is similalr, `[N]` and as you said the

`poisson_lpmf(y[i] | mu[1])` becomes `poisson_log_lpmf(y[i] | alpha + x*beta)`

what about `log1m_exp(poisson_lpmf(0 | mu[1])` part?
Will it have `log1m_exp(poisson_log_lpmf(0 | alpha + x*beta)` form?

That is not correct I think. `y` is of length N but `y[i]` is of length 1.

I understand you concern. I want to say `y` and `mu` have the same dimension. But in my understand, you are write that how `poisson_lpmf(y[i] | mu[1])` was working.
now will use

``````for (i in 1:N)
target += log_mix(pi[i],
poisson_log_lpmf(y[i] | alpha + x*beta) - log1m_exp(poisson_log_lpmf(0 | alpha + x*beta)),
poisson_log_lpmf(y[i] | alpha + x*beta+ci) - log1m_exp(poisson_log_lpmf(0 | alpha + x*beta+ci)));
``````

therefore, does your concern will have an effect in this analysis? I think the above expression is similar to what you stated before, `poisson_lpmf(y[i] | mu[1, i])` .

I have also one question. Is there any options that can express `mu[1] < mu[2]`?
FYI- In my case I used `mu[2]=mu[1]+ci`.

I think you are fine with this approach. For more complicated models, you could use the `ordered` type.

I think what you need is `(poisson_lpmf(y[i] | alpha + x[i] * beta)` otherwise you are using all data `x` for every single outcome `y[i]`. Maybe that is the goal in your analysis but it implies that you have the same expected value for each `y[i]`.

I don’t understand this statement.

Let me justify the model. It is the mixture of truncated Poisson (it was zero truncated, but now I changed it to one truncated). Each observation will have probability `pi[1]` to be in the first group and `pi[2]` in the second group. Then, after I got `pi` from the posterior,I will categorize each observation according to `pi's` value (eg- below 0.6,will categorize in group 1 and so on).
As I mentioned above the model is change to 1 truncated. Therefore can i write it like this?

`````` for (i in 1:N)
target += log_mix(pi[i],
poisson_lpmf(y[i]|mu[1,i])-log1m_exp(poisson_lpmf(0|mu[1,i]))-log1m_exp(poisson_lpmf(1|mu[1,i])),
poisson_lpmf(y[i]|mu[2,i])-log1m_exp(poisson_lpmf(0|mu[2,i]))-log1m_exp(poisson_lpmf(1|mu[2,i])));
``````

I think this should work for well behaved data.

I would also use poisson_log_lpmf instead of poisson_lpmf, tighter priors, and define mu in the model block. You could still run into the memory issues that Martin explained above.

would you please show me how can I write in my case?

``````data {
int<lower=0> N;           // number of data points
int <lower=1> y[N];         // observations
matrix[N,4] x;               // Covariates
vector[N] z;                //shape parameter
real<lower=0> Az;           //shape parameter
}

parameters {
real<lower=0> ci;                         // any Positive number added on the mu[1] to get mu[2]
real<lower=0> d;                        // will have uniform distribution that balances the shape parameters
vector[4] beta;
real alpha;
vector<lower=0, upper=1>[N] pi;           // mixing proportions
}

transformed parameters{
}

model {
vector<lower=0>[N] mu[2];    // locations of mixture components
mu[1] = alpha + x*beta;
mu[2] = mu[1] + ci;
// Priors
alpha ~ normal(0,2.5);
d ~ normal(0,5);
ci ~ normal(0, 2.5);
pi ~ beta(z*d,Az*d); // a beta(100, 100) places practically all probability mass between .4 and .6

// Likelihood
for (i in 1:N)
target += log_mix(pi[i],
poisson_log_lpmf(y[i] | mu[1, i]) - log1m_exp(poisson_log_lpmf(0 | mu[1, i])),
poisson_log_lpmf(y[i] | mu[2, i]) - log1m_exp(poisson_log_lpmf(0 | mu[2, i])));

}
``````

I haven’t tested it but this is a sensible implementation if I understand you correctly.

2 Likes

Thank you @stijn for your constructive comments. The command is working now. Let me show you some values of `Y` and `mu[1]`. Do you think that the estimated Poisson rate parameter is health (seems underestimated?)?

``````     y      mu[1]
1  104      9.17
2   83      10.77
3   78      9.07
4   60      7.39
5   45      4.82
6   45      8.69
7   41      4.15
8    2      0.70
9    2      0.73
10   2      0.41
11   2      0.84
12   2      0.62
13   2      0.96
14   2      0.73
15   2      0.74
16   2      1.22
``````

Therefore, would you please give me some suggestions over it.

Is it ok if I consider truncated at 1, `(y>1)`

``````for (i in 1:N)
target += log_mix(pi[i],
poisson_lpmf(y[i]|mu[1,i])-log1m_exp(poisson_lpmf(0|mu[1,i]))
-log1m_exp(poisson_lpmf(1|mu[1,i])),
poisson_lpmf(y[i]|mu[2,i])-log1m_exp(poisson_lpmf(0|mu[2,i]))
-log1m_exp(poisson_lpmf(1|mu[2,i])));
``````