# Prior posterior distribution for Zero inflated Poisson model with Random Effects

I am trying to derive prior predictive distributions for a zero inflated poisson model with random effects.

``````data {
int<lower=0> N;
int<lower=0> y[N];
int<lower=1> N_cov;
matrix[N,N_cov] x;
int<lower=1> J;         // number of groups
int g[N];               //cluster indicator
int run_estimation; // 0 for prior predictive, 1 for estimation
}
parameters {
real<lower=0, upper=1> theta;
real alpha;
vector[N_cov] beta;
vector[J] a_raw;
real<lower=0> sigma;
}
transformed parameters{
vector[J] a;
a = a_raw * sigma;
}
model {

alpha~normal(0,2);
beta~normal(0,2);

sigma~cauchy(0,2);
a_raw~normal(0,1);

if(run_estimation==1){
for (n in 1:N) {
if (y[n] == 0)
target += log_sum_exp(bernoulli_lpmf(1 | theta),
bernoulli_lpmf(0 | theta)
+ poisson_log_lpmf(y[n] |a[g[n]]+ alpha + x[n]*beta));
else
target += bernoulli_lpmf(0 | theta)
+ poisson_log_lpmf(y[n] | a[g[n]]+ alpha + x[n]*beta);
}
}
}
generated quantities{

int ypred[N];
int zero;

for(n in 1:N){
zero=bernoulli_rng(theta);

ypred[n]=(1-zero)*poisson_log_rng(a[g[n]]+ alpha +x[n]*beta);
}
}
``````

When I run the model with run_evaluation=0 to get prior predictive distributions I get the following
exception

Exception: Log rate parameter is 24.2495, but must be less than 20.79

together with these warnings

The same happens if I reduce the prior variances.

When I fit the model in estimation mode (run_estimation=1), the data seem to have enough information to prevent huge coefficients and I don’t get the exception. I still get the warnings though.

This problem gets even worse with a more complicated model with mixtures of zero inflated Poissons.
There the probability of each component turns to be NA given the high values of coefficients in the poisson model.

Any suggestion?

Did you try replacing the Cauchy prior with something tighter? Like a normal or something? Cauchy’s have heavy tails, and so it’s super conceivable that it would blow up a log scale.

For instance, sampling 20, cauchy(0, 2)s in R:

``````> rcauchy(20, 0, 2)
[1]  2.88322466  1.23568309  6.54969674  1.72653796 -0.63361340 -0.21271572 -0.08195728 -0.18053311  2.48106495
[10] -0.81135430 15.45362119 -0.05190334 -0.31957886  0.42332266  1.15329784 -1.60977675 16.99981285 -0.14050915
[19]  3.56674963 -2.23761791
``````
``````poisson_log_lpmf(y[n] | a[g[n]]+ alpha + x[n]*beta)
``````

Does it make sense to have your parameters on an unconstrained scale and then constrain them with a link function? Like unconstrain a, alpha, and beta, and then do:

``````poisson_log_lpmf(y[n] | exp(a[g[n]]+ alpha + x[n]*beta))
``````

Might be easier to keep things under control that way.

I did try a normal prior and things blow up anyway.

I see you point about constraining the parameters but I actually
don’t understand your use of both the log
with poisson_log_lpmf and the exponential
on the linear predictor…

I thought the log option is already taking care of exponentiating the linear predictor?

but I actually don’t understand your use of both the log

I made a mistake :D. Good catch. Should be poisson_lpmf.

Ah ok.

Yes I tried that too. Replacing poisson_log_lpmf with poisson_lpmf
and exponentiate the linear predictor myself.
In this case I don’t get the exception but the log rate can still be really large
and in the mixture of 3 zero inflated poissons the mixture components blow up
and I get NAs

In this case I don’t get the exception but the log rate can still be really large

Is this with the data in place or with the prior predictive stuff?

mixture of 3 zero inflated Poissons the mixture components blow up

I’m assuming this is one zero term + three Poissons mixed together?

And are you fitting generated data? If not, maybe start there. It’ll help diagnose sampling problems with the model without worrying about how good a fit the model will actually provide to the data.

Also it might be good to post the full mixture model. Maybe there’s something weird with that.

I am still trying to figure out
what is happening when I sample from the prior predictive.
Here is where I get huge values.

I haven’t fitted generated data yet. I should do it. Thanks.

I’m assuming this is one zero term + three Poissons mixed together?

It is actually a mixture of 4 (sorry four not three) zero inflated poissons not a zero inflation of a poisson mixture.
I have 4 latent classes each with a zero inflated poisson outcome.

Here is the code. I hope it’s not too messy.

``````data {
int<lower=1> N;              // sample size
int<lower=1> N_cov;          // number of covariates
int<lower=1> N_type;         // number of types
int<lower=1> J;         // number of groups

int<lower=1,upper=3> z[N];   // treatment assigned
int<lower=0> y[N];   // outcomes
matrix[N,N_cov] x;          // covariates
int g[N];    // group indicator
int run_estimation;

}
parameters {

// OUTCOME MODEL PARAMETERS
real alpha_rc1;  //intercept
real alpha_rc2;  //intercept
real alpha_rc3;  //intercept
real alpha_pc1;  //intercept
real alpha_pc2;  //intercept
real alpha_pc3;  //intercept
real alpha_at1;  //intercept
//  real<lower=alpha_at1> alpha_at2;  //intercept
//  real<lower=alpha_at2> alpha_at3;  //intercept
real alpha_nt1;  //intercept
real alpha_nt2;  //intercept
real alpha_nt3;  //intercept

real<lower=0, upper=1> phi_rc1; //
real<lower=0, upper=1> phi_rc2; //
real<lower=0, upper=1> phi_rc3; //
real<lower=0, upper=1> phi_pc1; //
real<lower=0, upper=1> phi_pc2; //
real<lower=0, upper=1> phi_pc3; //
real<lower=0, upper=1> phi_at1; //
//  real<lower=0, upper=1> phi_at2; //
//  real<lower=0, upper=1> phi_at3; //
real<lower=0, upper=1> phi_nt1; //
real<lower=0, upper=1> phi_nt2; //
real<lower=0, upper=1> phi_nt3; //

vector[N_cov] beta_pc;  //slopes (same for all assignment levels)
vector[N_cov] beta_at;  //slopes (same for all assignment levels)
vector[N_cov] beta_nt;  //slopes (same for all assignment levels)
vector[N_cov] beta_rc;  //slopes (same for all assignment levels)

real a[J];
real<lower=0,upper=10> sigma;

// COMPLIANCE MODEL PARAMETERS
real gamma[N_type-1];                   // intercepts for compliance models
vector[N_cov] delta[N_type-1];      // coefficients for compliance models
}
transformed parameters {

vector[N] mu_rc1;    //the linear predictor
vector[N] mu_rc2;    //the linear predictor
vector[N] mu_rc3;    //the linear predictor
vector[N] mu_pc1;    //the linear predictor
vector[N] mu_pc2;    //the linear predictor
vector[N] mu_pc3;    //the linear predictor
vector[N] mu_at1;    //the linear predictor
vector[N] mu_at2;    //the linear predictor
vector[N] mu_at3;    //the linear predictor
vector[N] mu_nt1;    //the linear predictor
vector[N] mu_nt2;    //the linear predictor
vector[N] mu_nt3;    //the linear predictor

for(n in 1:N){
mu_rc1[n] = exp(a[g[n]]+alpha_rc1+x[n]*beta_rc); //using the log link
mu_rc2[n] = exp(a[g[n]]+alpha_rc2+x[n]*beta_rc); //using the log link
mu_rc3[n] = exp(a[g[n]]+alpha_rc3+x[n]*beta_rc); //using the log link
mu_pc1[n] = exp(a[g[n]]+alpha_pc1+x[n]*beta_pc); //using the log link
mu_pc2[n] = exp(a[g[n]]+alpha_pc2+x[n]*beta_pc); //using the log link
mu_pc3[n] = exp(a[g[n]]+alpha_pc3+x[n]*beta_pc); //using the log link
mu_at1[n] = exp(a[g[n]]+alpha_at1+x[n]*beta_at); //using the log link
mu_at2[n] = exp(a[g[n]]+alpha_at1+x[n]*beta_at); //using the log link
mu_at3[n] = exp(a[g[n]]+alpha_at1+x[n]*beta_at); //using the log link
mu_nt1[n] = exp(a[g[n]]+alpha_nt1+x[n]*beta_nt); //using the log link
mu_nt2[n] = exp(a[g[n]]+alpha_nt2+x[n]*beta_nt); //using the log link
mu_nt3[n] = exp(a[g[n]]+alpha_nt3+x[n]*beta_nt); //using the log link
}

}

model {

// TEMPORARY VARIABLES
vector[N_type] log_pi;
real mix3[3];
real mix2[2];

// PRIORS

alpha_rc1 ~ normal(0, 2);
alpha_rc2 ~ normal(0, 2);
alpha_rc3 ~ normal(0, 2);
alpha_pc1 ~ normal(0, 2);
alpha_pc2 ~ normal(0, 2);
alpha_pc3 ~ normal(0, 2);
alpha_nt1 ~ normal(0, 2);
alpha_nt2 ~ normal(0, 2);
alpha_nt3 ~ normal(0, 2);
alpha_at1 ~ normal(0, 2);

beta_rc ~ normal(0, 2);
beta_pc ~ normal(0, 2);
beta_at ~ normal(0, 2);
beta_nt ~ normal(0, 2);

sigma ~normal(0,sigma);

a ~ normal(0,sigma);

// compliance model
gamma ~ normal(0, 2.5);
for(k in 1:(N_type-1))
delta[k] ~ normal(0, 2.5);

if(run_estimation==1) {
// MODELS FOR OUTCOME
for(n in 1:N){

// Multinomial model (k: 1=nt, 2=rc, 3=pc, 4=at)
for(k in 1:(N_type-1))
log_pi[k] = gamma[k] + x[n]*delta[k];
log_pi[N_type] = 0;

// Always Takers
if(z[n] == 1 && d[n] == 1){
if(y[n] == 0)
target+=log_pi[4] - log_sum_exp(log_pi)+log_sum_exp(bernoulli_lpmf(1| phi_at1),
bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at1[n]));
else
target+=log_pi[4] - log_sum_exp(log_pi)+
bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at1[n]);
}

// Never Takers
else if(z[n] == 3 && d[n] == 0){
if(y[n] == 0)
target+=log_pi[1] - log_sum_exp(log_pi)+log_sum_exp(bernoulli_lpmf(1| phi_nt3),
bernoulli_lpmf(0| phi_nt3) + poisson_lpmf(y[n]|mu_nt3[n]));
else
target+=log_pi[1] - log_sum_exp(log_pi)+
bernoulli_lpmf(0| phi_nt3) + poisson_lpmf(y[n]|mu_nt3[n]);
}

// R-Complier, P-Complier or Never Taker
else if(z[n] == 1 && d[n] == 0){
if(y[n] == 0){
mix3[1]= log_pi[1] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_nt1),
bernoulli_lpmf(0| phi_nt1) + poisson_lpmf(y[n]|mu_nt1[n])); // Never Taker

mix3[2]=log_pi[2] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_rc1),
bernoulli_lpmf(0| phi_rc1) + poisson_lpmf(y[n]|mu_rc1[n])); // R-Complier

mix3[3]=log_pi[3] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_pc1),
bernoulli_lpmf(0| phi_pc1) + poisson_lpmf(y[n]|mu_pc1[n])); // P-Complier
}
else{
mix3[1]= log_pi[1] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_nt1) + poisson_lpmf(y[n]|mu_nt1[n]); // Never Taker

mix3[2]=log_pi[2] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_rc1) + poisson_lpmf(y[n]|mu_rc1[n]); // R-Complier

mix3[3]=log_pi[3] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_pc1) + poisson_lpmf(y[n]|mu_pc1[n]); // P-Complier
}

target+=  log_sum_exp(mix3);
}

// R-Complier or Never Taker
else if(z[n] == 2 && d[n] == 0){
if(y[n] == 0){
mix2[1]= log_pi[1] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_nt2),
bernoulli_lpmf(0| phi_nt2) + poisson_lpmf(y[n]|mu_nt2[n])); // Never Taker

mix2[2]=log_pi[2] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_rc2),
bernoulli_lpmf(0| phi_rc2) + poisson_lpmf(y[n]|mu_rc2[n])); // R-Complier
}
else{
mix2[1]= log_pi[1] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_nt2) + poisson_lpmf(y[n]|mu_nt2[n]); // Never Taker

mix2[2]=log_pi[2] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_rc2) + poisson_lpmf(y[n]|mu_rc2[n]); // R-Complier

}

target+=  log_sum_exp(mix2);
}

// P-Complier or Always Taker
else if(z[n] == 2 && d[n] == 1){
if(y[n] == 0){
mix2[1]= log_pi[4] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_at1),
bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at2[n])); // Never Taker

mix2[2]=log_pi[3] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_pc2),
bernoulli_lpmf(0| phi_pc2) + poisson_lpmf(y[n]|mu_pc2[n])); // R-Complier
}
else{
mix2[1]= log_pi[4] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at2[n]); // Always Taker

mix2[2]=log_pi[3] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_pc2) + poisson_lpmf(y[n]|mu_pc2[n]); // P-Complier

}

target+=  log_sum_exp(mix2);
}

// R-Complier, P-Complier or Always Taker
else if(z[n] == 3 && d[n] == 1){
if(y[n] == 0){
mix3[1]= log_pi[4] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_at1),
bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at3[n])); // Always Taker

mix3[2]=log_pi[2] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_rc3),
bernoulli_lpmf(0| phi_rc3) + poisson_lpmf(y[n]|mu_rc3[n])); // R-Complier

mix3[3]=log_pi[3] - log_sum_exp(log_pi)
+log_sum_exp(bernoulli_lpmf(1| phi_pc3),
bernoulli_lpmf(0| phi_pc3) + poisson_lpmf(y[n]|mu_pc3[n])); // P-Complier
}
else{
mix3[1]= log_pi[4] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at3[n]); // Never Taker

mix3[2]=log_pi[2] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_rc3) + poisson_lpmf(y[n]|mu_rc3[n]); // R-Complier

mix3[3]=log_pi[3] - log_sum_exp(log_pi)
+bernoulli_lpmf(0| phi_pc3) + poisson_lpmf(y[n]|mu_pc3[n]); // P-Complier
}

target+=  log_sum_exp(mix3);
}

}
}
}

// FOR FINITE SAMPLE INFERENCE
generated quantities{

real at_y1_bar;
real at_y2_bar;
real at_y3_bar;
real nt_y1_bar;
real nt_y2_bar;
real nt_y3_bar;
real rc_y1_bar;
real rc_y2_bar;
real rc_y3_bar;
real pc_y1_bar;
real pc_y2_bar;
real pc_y3_bar;
real y1_bar;
real y2_bar;
real y3_bar;

real at_y1_pos;
real at_y2_pos;
real at_y3_pos;
real nt_y1_pos;
real nt_y2_pos;
real nt_y3_pos;
real rc_y1_pos;
real rc_y2_pos;
real rc_y3_pos;
real pc_y1_pos;
real pc_y2_pos;
real pc_y3_pos;
real y1_pos;
real y2_pos;
real y3_pos;

// Types
vector[N] type_at;
vector[N] type_nt;
vector[N] type_pc;
vector[N] type_rc;

vector[N] ypred;  //for posterior checks

{ // to create temporary variables

// Generate science table
vector[N] y1;
vector[N] y2;
vector[N] y3;

vector[N] ybin1;
vector[N] ybin2;
vector[N] ybin3;

int zero1;
int zero2;
int zero3;

// Temporary variables
vector[N_type] log_pi_tmp;
vector[2] mix2_pi_tmp;
vector[3] mix3_pi_tmp;
int type_tmp;

// Initialize types to zero
for(n in 1:N){
type_at[n] = 0;
type_nt[n] = 0;
type_rc[n] = 0;
type_pc[n] = 0;

}

// Loop through observation
for(n in 1:N){

// Multinomial model (k: 1=nt, 2=rc, 3=pc, 4=at)
for(k in 1:(N_type-1))
log_pi_tmp[k] = gamma[k] + x[n]*delta[k];
log_pi_tmp[N_type] = 0;

// Always Takers
if(z[n] == 1 && d[n] == 1){
type_at[n] = 1;

zero1=bernoulli_rng(phi_at1);
zero2=bernoulli_rng(phi_at1);
zero3=bernoulli_rng(phi_at1);

y1[n] = y[n];
y2[n] = (1-zero2)*poisson_rng(mu_at2[n]);
y3[n] = (1-zero3)*poisson_rng(mu_at3[n]);

ypred[n]=(1-zero1)*poisson_rng(mu_at1[n]);

}

// Never Takers
else if(z[n] == 3 && d[n] == 0){
type_nt[n] = 1;

zero1=bernoulli_rng(phi_nt1);
zero2=bernoulli_rng(phi_nt2);
zero3=bernoulli_rng(phi_nt3);

y1[n] = (1-zero1)*poisson_rng(mu_nt1[n]);
y2[n] = (1-zero2)*poisson_rng(mu_nt2[n]);
y3[n] = y[n];

ypred[n]=(1-zero3)*poisson_rng(mu_nt3[n]);

}

// R-Complier, P-Complier or Never Taker
else if(z[n] == 1 && d[n] == 0){
if(y[n] == 0){
// generate type
mix3_pi_tmp[1] = exp(log_pi_tmp[1] +log_sum_exp(bernoulli_lpmf(1| phi_nt1),
bernoulli_lpmf(0| phi_nt1) + poisson_lpmf(y[n]|mu_nt1[n])));   // Never Taker
mix3_pi_tmp[2] = exp(log_pi_tmp[2] +log_sum_exp(bernoulli_lpmf(1| phi_rc1),
bernoulli_lpmf(0| phi_rc1) + poisson_lpmf(y[n]|mu_rc1[n])));   // R-Complier
mix3_pi_tmp[3] = exp(log_pi_tmp[3] +log_sum_exp(bernoulli_lpmf(1| phi_pc1),
bernoulli_lpmf(0| phi_pc1) + poisson_lpmf(y[n]|mu_pc1[n])));   // P-Complier
}
else{
// generate type
mix3_pi_tmp[1] = exp(log_pi_tmp[1] +
bernoulli_lpmf(0| phi_nt1) + poisson_lpmf(y[n]|mu_nt1[n]));   // Never Taker
mix3_pi_tmp[2] = exp(log_pi_tmp[2] +
bernoulli_lpmf(0| phi_rc1) + poisson_lpmf(y[n]|mu_rc1[n]));   // R-Complier
mix3_pi_tmp[3] = exp(log_pi_tmp[3] +
bernoulli_lpmf(0| phi_pc1) + poisson_lpmf(y[n]|mu_pc1[n]));   // P-Complier

}

mix3_pi_tmp = mix3_pi_tmp/sum(mix3_pi_tmp);
type_tmp = categorical_rng(mix3_pi_tmp);

// generate outcomes
if(type_tmp == 1){ // Never Taker
type_nt[n] = 1;

zero1=bernoulli_rng(phi_nt1);
zero2=bernoulli_rng(phi_nt2);
zero3=bernoulli_rng(phi_nt3);

y1[n] = y[n];
y2[n] = (1-zero2)*poisson_rng(mu_nt2[n]);
y3[n] = (1-zero3)*poisson_rng(mu_nt3[n]);

ypred[n]=(1-zero1)*poisson_rng(mu_nt1[n]);

}

if(type_tmp == 2){ // RComplier
type_rc[n] = 1;

zero1=bernoulli_rng(phi_rc1);
zero2=bernoulli_rng(phi_rc2);
zero3=bernoulli_rng(phi_rc3);

y1[n] = y[n];
y2[n] = (1-zero2)*poisson_rng(mu_rc2[n]);
y3[n] = (1-zero3)*poisson_rng(mu_rc3[n]);

ypred[n]=(1-zero1)*poisson_rng(mu_rc1[n]);

}
if(type_tmp == 3){ // PComplier
type_pc[n] = 1;

zero1=bernoulli_rng(phi_pc1);
zero2=bernoulli_rng(phi_pc2);
zero3=bernoulli_rng(phi_pc3);

y1[n] = y[n];
y2[n] = (1-zero2)*poisson_rng(mu_pc2[n]);
y3[n] = (1-zero3)*poisson_rng(mu_pc3[n]);

ypred[n]=(1-zero1)*poisson_rng(mu_pc1[n]);

}

}

// R-Complier or Never Taker
else if(z[n] == 2 && d[n] == 0){
if(y[n] == 0){
// generate type
mix2_pi_tmp[1] = exp(log_pi_tmp[1] +log_sum_exp(bernoulli_lpmf(1| phi_nt2),
bernoulli_lpmf(0| phi_nt2) + poisson_lpmf(y[n]|mu_nt2[n])));   // Never Taker
mix2_pi_tmp[2] = exp(log_pi_tmp[2] +log_sum_exp(bernoulli_lpmf(1| phi_rc2),
bernoulli_lpmf(0| phi_rc2) + poisson_lpmf(y[n]|mu_rc2[n])));   // R-Complier
}
else{
// generate type
mix2_pi_tmp[1] = exp(log_pi_tmp[1] +
bernoulli_lpmf(0| phi_nt2) + poisson_lpmf(y[n]|mu_nt2[n]));   // Never Taker
mix2_pi_tmp[2] = exp(log_pi_tmp[2] +
bernoulli_lpmf(0| phi_rc2) + poisson_lpmf(y[n]|mu_rc2[n]));   // R-Complier

}
mix2_pi_tmp = mix2_pi_tmp/sum(mix2_pi_tmp);
type_tmp = categorical_rng(mix2_pi_tmp);

// generate outcomes
if(type_tmp == 1){ // Never Taker
type_nt[n] = 1;

zero1=bernoulli_rng(phi_nt1);
zero2=bernoulli_rng(phi_nt2);
zero3=bernoulli_rng(phi_nt3);

y1[n] = (1-zero1)*poisson_rng(mu_nt1[n]);
y2[n] = y[n];
y3[n] = (1-zero3)*poisson_rng(mu_nt3[n]);

ypred[n]=(1-zero2)*poisson_rng(mu_nt2[n]);

}

if(type_tmp == 2){ // RComplier
type_rc[n] = 1;

zero1=bernoulli_rng(phi_rc1);
zero2=bernoulli_rng(phi_rc2);
zero3=bernoulli_rng(phi_rc3);

y1[n] = (1-zero1)*poisson_rng(mu_rc1[n]);
y2[n] = y[n];
y3[n] = (1-zero3)*poisson_rng(mu_rc3[n]);

ypred[n]=(1-zero2)*poisson_rng(mu_rc2[n]);

}

}

// P-Complier or Always Taker
else if(z[n] == 2 && d[n] == 1){
if(y[n] == 0){
// generate type
mix2_pi_tmp[1] = exp(log_pi_tmp[4] +log_sum_exp(bernoulli_lpmf(1| phi_at1),
bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at2[n])));   // Always Taker
mix2_pi_tmp[2] = exp(log_pi_tmp[3] +log_sum_exp(bernoulli_lpmf(1| phi_pc2),
bernoulli_lpmf(0| phi_pc2) + poisson_lpmf(y[n]|mu_pc2[n])));   // P-Complier
}
else{
// generate type
mix2_pi_tmp[1] = exp(log_pi_tmp[4] +
bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at2[n]));   // Always Taker
mix2_pi_tmp[2] = exp(log_pi_tmp[3] +
bernoulli_lpmf(0| phi_pc2) + poisson_lpmf(y[n]|mu_pc2[n]));   // P-Complier
}
mix2_pi_tmp = mix2_pi_tmp/sum(mix2_pi_tmp);
type_tmp = categorical_rng(mix2_pi_tmp);

// generate outcomes
if(type_tmp == 1){ // Always Taker
type_at[n] = 1;

zero1=bernoulli_rng(phi_at1);
zero2=bernoulli_rng(phi_at1);
zero3=bernoulli_rng(phi_at1);

y1[n] = (1-zero1)*poisson_rng(mu_at1[n]);
y2[n] = y[n];
y3[n] = (1-zero3)*poisson_rng(mu_at3[n]);

ypred[n]=(1-zero2)*poisson_rng(mu_at2[n]);

}

if(type_tmp == 2){ // P-Complier
type_pc[n] = 1;

zero1=bernoulli_rng(phi_pc1);
zero2=bernoulli_rng(phi_pc2);
zero3=bernoulli_rng(phi_pc3);

y1[n] = (1-zero1)*poisson_rng(mu_pc1[n]);
y2[n] = y[n];
y3[n] = (1-zero3)*poisson_rng(mu_pc3[n]);

ypred[n]=(1-zero2)*poisson_rng(mu_pc2[n]);

}

}

// R-Complier, P-Complier or Always Taker
else if(z[n] == 3 && d[n] == 1){
if(y[n] == 0){
// generate type
mix3_pi_tmp[1] = exp(log_pi_tmp[4] +log_sum_exp(bernoulli_lpmf(1| phi_at1),
bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at3[n])));    // Always Taker
mix3_pi_tmp[2] = exp(log_pi_tmp[2] +log_sum_exp(bernoulli_lpmf(1| phi_rc3),
bernoulli_lpmf(0| phi_rc3) + poisson_lpmf(y[n]|mu_rc3[n])));    // R-Complier
mix3_pi_tmp[3] = exp(log_pi_tmp[3] +log_sum_exp(bernoulli_lpmf(1| phi_pc3),
bernoulli_lpmf(0| phi_pc3) + poisson_lpmf(y[n]|mu_pc3[n])));    // P-Complier
}
else{
// generate type
mix3_pi_tmp[1] = exp(log_pi_tmp[4] +
bernoulli_lpmf(0| phi_at1) + poisson_lpmf(y[n]|mu_at3[n]));    // Always Taker
mix3_pi_tmp[2] = exp(log_pi_tmp[2] +
bernoulli_lpmf(0| phi_rc3) + poisson_lpmf(y[n]|mu_rc3[n]));    // R-Complier
mix3_pi_tmp[3] = exp(log_pi_tmp[3] +
bernoulli_lpmf(0| phi_pc3) + poisson_lpmf(y[n]|mu_pc3[n]));    // P-Complier

}
mix3_pi_tmp = mix3_pi_tmp/sum(mix3_pi_tmp);
type_tmp = categorical_rng(mix3_pi_tmp);

// generate outcomes
if(type_tmp == 1){ // Never Taker
type_at[n] = 1;

zero1=bernoulli_rng(phi_at1);
zero2=bernoulli_rng(phi_at1);
zero3=bernoulli_rng(phi_at1);

y1[n] = (1-zero1)*poisson_rng(mu_at1[n]);
y2[n] = (1-zero2)*poisson_rng(mu_at2[n]);
y3[n] = y[n];

ypred[n]=(1-zero3)*poisson_rng(mu_at3[n]);

}

if(type_tmp == 2){ // RComplier
type_rc[n] = 1;

zero1=bernoulli_rng(phi_rc1);
zero2=bernoulli_rng(phi_rc2);
zero3=bernoulli_rng(phi_rc3);

y1[n] = (1-zero1)*poisson_rng(mu_rc1[n]);
y2[n] = (1-zero2)*poisson_rng(mu_rc2[n]);
y3[n] = y[n];

ypred[n]=(1-zero3)*poisson_rng(mu_rc3[n]);

}
if(type_tmp == 3){ // PComplier
type_pc[n] = 1;

zero1=bernoulli_rng(phi_pc1);
zero2=bernoulli_rng(phi_pc2);
zero3=bernoulli_rng(phi_pc3);

y1[n] = (1-zero1)*poisson_rng(mu_pc1[n]);
y2[n] = (1-zero2)*poisson_rng(mu_pc2[n]);
y3[n] = y[n];

ypred[n]=(1-zero3)*poisson_rng(mu_pc3[n]);

}

}

} // end for loop

for(n in 1:N){
ybin1[n]=y1[n]>0;
ybin2[n]=y2[n]>0;
ybin3[n]=y3[n]>0;
}

// store mean values
y1_bar = mean(y1);
y2_bar = mean(y2);
y3_bar = mean(y3);

y1_pos = mean(ybin1);
y2_pos = mean(ybin2);
y3_pos = mean(ybin3);

at_y1_bar = sum(type_at .* y1)/sum(type_at);
at_y2_bar = sum(type_at .* y2)/sum(type_at);
at_y3_bar = sum(type_at .* y3)/sum(type_at);

at_y1_pos = sum(type_at .* ybin1)/sum(type_at);
at_y2_pos = sum(type_at .* ybin2)/sum(type_at);
at_y3_pos = sum(type_at .* ybin3)/sum(type_at);

nt_y1_bar = sum(type_nt .* y1)/sum(type_nt);
nt_y2_bar = sum(type_nt .* y2)/sum(type_nt);
nt_y3_bar = sum(type_nt .* y3)/sum(type_nt);

nt_y1_pos = sum(type_nt .* ybin1)/sum(type_nt);
nt_y2_pos = sum(type_nt .* ybin2)/sum(type_nt);
nt_y3_pos = sum(type_nt .* ybin3)/sum(type_nt);

if(sum(type_rc)!=0){
rc_y1_bar = sum(type_rc .* y1)/sum(type_rc);
rc_y2_bar = sum(type_rc .* y2)/sum(type_rc);
rc_y3_bar = sum(type_rc .* y3)/sum(type_rc);

rc_y1_pos = sum(type_rc .* ybin1)/sum(type_rc);
rc_y2_pos = sum(type_rc .* ybin2)/sum(type_rc);
rc_y3_pos = sum(type_rc .* ybin3)/sum(type_rc);

}
else {
rc_y1_bar = 99;
rc_y2_bar = 99;
rc_y3_bar = 99;

rc_y1_pos = 99;
rc_y2_pos = 99;
rc_y3_pos = 99;

}

if(sum(type_pc)!=0){
pc_y1_bar = sum(type_pc .* y1)/sum(type_pc);
pc_y2_bar = sum(type_pc .* y2)/sum(type_pc);
pc_y3_bar = sum(type_pc .* y3)/sum(type_pc);

pc_y1_pos = sum(type_pc .* ybin1)/sum(type_pc);
pc_y2_pos = sum(type_pc .* ybin2)/sum(type_pc);
pc_y3_pos = sum(type_pc .* ybin3)/sum(type_pc);

}
else {
pc_y1_bar = 99;
pc_y2_bar = 99;
pc_y3_bar = 99;

pc_y1_pos = 99;
pc_y2_pos = 99;
pc_y3_pos = 99;
}

} // end temporary variables
}
``````

Here N_type is 4, the 4 latent classes (compliance groups).
We have intercepts alpha that differ across groups (rc, pc, at, nt)
and treatment assignments. (1,2,3),
whereas the beta coefficients for the variables in matrix x are constant
across treatment assignments.

Wow this is a big model! Lemme ask some questions then we’ll probably need to simplify this. There’s gonna be a way to write this out with less manual variable declaration too. Coding in these things manually is gonna be super error prone.

1. Where are the mixtures coming in? It looks like you have a bunch of different grouping indicators. g, z, and d. Do you consider these to be unreliable and you want to try to figure out what groups people actually might be in? Or is there something else?

It feels almost like there are two models happening at once in here, like `mu_rc1[n] = exp(a[g[n]]+alpha_rc1+x[n]*beta_rc); //using the log link ...` is predicting the rates of a zero-inflated Poisson distribution as a function of known groups, and

`log_pi[k] = gamma[k] + x[n]*delta[k];` is trying to predict group membership as a function of covariates

It seems like either of those could be a model on its own, or is there some other thing linking them together?

1. For the ifs on z and d, is the situation that you know certain people can’t be in certain groups?

2. Is it one result per person?

3. Doing softmaxes like `log_pi[4] - log_sum_exp(log_pi)` is gonna lead to overflows/underflows. Use the `softmax` function and if you need things back on a log scale take the log after.

Would a mixture of Negative-Binomials be okay? I guess a mixture of mixtures (ZIPs) will be hard to fit.

You want to use loops and arrays because they’re easier to get right than long cut-and-paste lists.

For this:

I think you meant `_at2` and `_at3` in the second and third lines on the RHS. That’s an example of the kind of cut-and-paste error this kind of coding will lead to.

I’d also suggest starting with something much simpler to begin. That’s a huge Stan program, even after you get rid of all the redundancy.

Thank you Bob for checkin that out.
That wasn’t a mistake. alpha_at should be the same for all three levels by assumption.
This is more a conceptual issue than a code issue.

I have started with simpler code actually. I have started building this code using
zero inflated poisson without mixture and separately mixture model with normal outcome and
then I combined the two.

Max is right. a mixture of mixtures is hard to fit.
I will try with negative-binomials but ZIP would fit the data better.

Sorry for my really late reply.
But thank you for going through my code.

1. Not really. g is a group indicator that I use for random effects.
We then have 4 strata (at, nt, pc, rc) that are partially defined by
z and d. I don’t know the membership to these strata.
Depending on the strata membership the model for Y has different parameters.

2. For some combinations of z and d I know the strata membership. For others
unit can be a member of two strata and we don’t know which one.

3. What do you mean? Each person belongs to a group g and a stratum at,nt,pc or rc. Based on these
memberships and the covariates in x of each person, I impute the 3 values of Y corresponding
to the three values of the treatment z.

4. I am interested in this point. Could you please be more specific?

re 1. If you can do what Max said and avoid the mixtures, excellent

re 3. It’s been awhile and I’ve forgotten what I meant I guess :(

re 4. If you’re dealing with ratios of exponentials, you can often multiply the top and the bottom by a constant e^c to avoid computing the exponential of large numbers.

For instance maybe you want to compute this:

e^(50) / (e^(50) + e^(51))

e^51 is a large number, we can agree, but if you multiply through by e^(-50) on the top and the bottom, you only need to compute:

1.0 / (1.0 + e)

These terms: log_pi[4] - log_sum_exp(log_pi) are basically logs of softmaxes, I think. The usual way to scale softmaxes (of a vector v) is to multiply the numerator and denominator by something like e^(-max(v)). This Q: https://stackoverflow.com/q/34968722 and A: https://stackoverflow.com/a/34969389 are relevant.