I am building a non-linear model with the beta_proportion
distribution and am having some difficulty specifying partial pooling/hierarchical structure.
The unpooled model fits okay, but is lacking some flexibility and not capturing some of the observations.
I have specified that one as:
data {
int<lower=1> N; // total number of observations
vector[N] prop; // response variable
int<lower=1> N_Times; // Number of times in time series
matrix[N, N_Times] X1;
matrix[N, N_Times] X2;
vector[N] X3;
int<lower = 0, upper = 1> run_estimation; // evaluate the likelihood?
}
parameters {
real<lower=0> a;
real<lower=0> kappa; // precision
}
model {
vector[N] mu;
for (n in 1:N) {
mu[n] = (sum(row(X1, n) .* (row(X2, n))) * a) / X3[n];
}
if(run_estimation==1){ // Evaluate likelihood
prop ~ beta_proportion(mu, kappa);
}
// priors
a ~ normal(5,3);
}
generated quantities {
vector[N] prop_pred;
row_vector[N] mu_pred;
for (n in 1:N) {
mu_pred[n] = sum(row(X1, n) .* (row(X2, n)) * a) / X3[n];
}
for (n in 1:N) {
if(mu_pred[n] > 1) prop_pred[n] = 1; // An exception for prior predictive checks if response is out of range
else if(mu_pred[n] < 0) prop_pred[n] = 0;
else
prop_pred[n] = beta_proportion_rng(mu_pred[n], kappa);
}
}
A prior predictive check on this model looks reasonable with most of the mass below 0.10.
I tried to create a partial-pooling model as follows:
data {
int<lower=1> N; // total number of observations
vector[N] prop; // Observed outcome
int<lower=1> N_Grp; // Number ofgroups
int<lower = 0, upper = N_Grp> Grp[N]; // group index
int<lower=1> N_Times; // Number of times in time series
matrix[N, N_Times] X1;
matrix[N, N_Times] X2; //
vector[N] X3;
int<lower = 0, upper = 1> run_estimation; // evaluate the likelihood?
}
parameters {
vector<lower = 0, upper = 1>[N_Grp] alpha; //Vector of group intercepts
real<lower=1> a;
real<lower=0> kappa_Grp; // precision for each group
real<lower=0> kappa; // Population-level precision
real mu_Grp; // Group-level location parameter
}
model {
// conditional location parameter
vector[N] mu;
// linear combination
for (n in 1:N) {
mu[n] = alpha[Grp[n]] + (sum(row(X1, n) .* row(X2, n)) / X3[n]) * a;
}
// priors
kappa_Grp ~ gamma(5,5);
kappa ~ gamma(5,5);
a ~ normal(5,3);
// hyper-priors
mu_Grp ~ normal(0, 1);
// level-2 likelihood
alpha ~ beta_proportion(mu_Grp, kappa_Grp);
if(run_estimation==1){ // Evaluate likelihood
// level-1 likelihood
prop ~ beta_proportion(mu, kappa);
}
}
generated quantities {
vector[N] prop_pred;
row_vector[N] mu_pred;
for (n in 1:N) {
mu_pred[n] = alpha[Grp[n]] + (sum(row(X1, n) .* row(X2, n)) / X3[n]) * a;
}
for (n in 1:N) {
if(mu_pred[n] > 1) prop_pred[n] = 1; // An exception for prior predictive checks if response is out of range
else if(mu_pred[n] < 0) prop_pred[n] = 0;
else
prop_pred[n] = beta_proportion_rng(mu_pred[n], kappa);
}
}
But, a prior predictive check on this model creates an undesired U shape with most of the mass on values near zero and values near 1.
Have I specified the structure incorrectly? Or is there something I can do to re-gain the desired expectation of predictions?