 # Trying to use partial pooling in a Beta model

I am building a non-linear model with the `beta_proportion` distribution and am having some difficulty specifying partial pooling/hierarchical structure.

The unpooled model fits fine but lacks flexibility and fails to capture some of the observations.

I have specified that one as:

``````stan
data {
int<lower=1> N;  // total number of observations
vector[N] prop;  // response variable
int<lower=1> N_Times; // Number of times in time series
matrix[N, N_Times] X1;
matrix[N, N_Times] X2;
vector[N] X3;
int<lower = 0, upper = 1> run_estimation; // evaluate the likelihood?
}
parameters {
real<lower=0> a;
real<lower=0> kappa;  // precision
}
model {
vector[N] mu;
for (n in 1:N) {
mu[n] = (sum(row(X1, n) .* (row(X2, n))) * a)  / X3[n];
}
if(run_estimation==1){ // Evaluate likelihood
prop ~ beta_proportion(mu, kappa);
}
// priors
a ~ normal(5,3);
}
generated quantities {
vector[N] prop_pred;
row_vector[N] mu_pred;
for (n in 1:N) {
mu_pred[n] = (sum(row(X1, n) .* row(X2, n)) * a) / X3[n];
}
for (n in 1:N) {
if(mu_pred[n] > 1) prop_pred[n] = 1; // An exception for prior predictive checks if response is out of range
else if(mu_pred[n] < 0) prop_pred[n] = 0;
else
prop_pred[n] = beta_proportion_rng(mu_pred[n], kappa);
}
}

``````

A prior predictive check on this model looks reasonable, with most of the mass below 0.10.

I tried to create a partial-pooling model as follows:

``````stan
data {
int<lower=1> N;  // total number of observations
vector[N] prop;  // Observed outcome
int<lower=1> N_Grp;  // Number of groups
int<lower=1, upper=N_Grp> Grp[N]; // group index
int<lower=1> N_Times; // Number of times in time series
matrix[N, N_Times] X1;
matrix[N, N_Times] X2;
vector[N] X3;
int<lower = 0, upper = 1> run_estimation; // evaluate the likelihood?
}
parameters {
vector<lower = 0, upper = 1>[N_Grp] alpha; //Vector of group intercepts
real<lower=1> a;
real<lower=0> kappa_Grp;  // precision for each group
real<lower=0> kappa;  // Population-level precision
real<lower=0, upper=1> mu_Grp; // Group-level location parameter (a Beta mean, so constrained to (0, 1))
}
model {
// conditional location parameter
vector[N] mu;
// linear combination
for (n in 1:N) {
mu[n] = alpha[Grp[n]] + (sum(row(X1, n) .* row(X2, n))  / X3[n]) * a;
}
// priors
kappa_Grp ~ gamma(5,5);
kappa ~ gamma(5,5);
a ~ normal(5,3);
// hyper-priors
mu_Grp ~ normal(0, 1);
// level-2 likelihood
alpha ~ beta_proportion(mu_Grp, kappa_Grp);
if(run_estimation==1){ // Evaluate likelihood
// level-1 likelihood
prop ~ beta_proportion(mu, kappa);
}
}
generated quantities {
vector[N] prop_pred;
row_vector[N] mu_pred;
for (n in 1:N) {
mu_pred[n] = alpha[Grp[n]] + (sum(row(X1, n) .* row(X2, n))  / X3[n]) * a;
}
for (n in 1:N) {
if(mu_pred[n] > 1) prop_pred[n] = 1; // An exception for prior predictive checks if response is out of range
else if(mu_pred[n] < 0) prop_pred[n] = 0;
else
prop_pred[n] = beta_proportion_rng(mu_pred[n], kappa);
}
}

``````

But a prior predictive check on this model produces an undesired U shape, with most of the mass on values near zero and values near one.
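For what it's worth, `beta_proportion(mu, kappa)` is the Beta distribution reparameterized by its mean and a precision:

$$
\text{beta\_proportion}(y \mid \mu, \kappa) = \text{Beta}\bigl(y \mid \mu\kappa,\; (1-\mu)\kappa\bigr),
$$

and a Beta density is U-shaped exactly when both shape parameters fall below one, i.e. when $\mu\kappa < 1$ and $(1-\mu)\kappa < 1$. So prior draws with $\kappa$ near 1 (the mean of `gamma(5, 5)`) and $\mu$ near either boundary would tend to pile mass at 0 and 1.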

Have I specified the structure incorrectly? Or is there something I can do to regain the desired distribution of predictions?


Here is another attempt that perhaps improves things a little, although it runs much slower.

``````stan
data {
int<lower=1> N;  // total number of observations
vector[N] prop;  // Observed outcome
int<lower=1> N_Grp;  // Number of groups
int<lower=1, upper=N_Grp> Grp[N]; // Group index
int<lower=1> N_Times; // Number of times in time series
matrix[N, N_Times] X1;
matrix[N, N_Times] X2;
vector[N] X3;
int<lower = 0, upper = 1> run_estimation; // evaluate the likelihood?
}
parameters {
real alpha;
real<lower=1> a;
real<lower=0> kappa;  // Population-level precision
real<lower=0> tau_Grp;  // sd for each group
vector[N_Grp] mu_Grp; // Group-level location parameter
}
model {
// priors
alpha ~ normal(0, 0.1);
a ~ normal(5,3);
kappa ~ gamma(5,5);
tau_Grp ~ normal(0, 0.1);
mu_Grp ~ normal(0, tau_Grp);
if(run_estimation==1){ // Evaluate likelihood
for (n in 1:N) {
prop[n] ~ beta_proportion(alpha + mu_Grp[Grp[n]] + ((sum(row(X1, n) .* row(X2, n)) / X3[n]) * a), kappa);
}
}
}
generated quantities {
real prop_pred[N];
for (n in 1:N) {
prop_pred[n] = beta_proportion_rng(alpha + mu_Grp[Grp[n]] + ((sum(row(X1, n) .* row(X2, n)) / X3[n]) * a), kappa);
}
}
``````

Does this seem like a more reasonable approach to varying intercepts?


Hi,
the standard (which does not necessarily mean better) way to code this model would be to keep all your predictors, including the varying intercepts, unconstrained (i.e. without bounds), and only after you add up all those unconstrained terms, transform the sum via `inv_logit` to a value between 0 and 1 that can be used as the mean of the `beta_proportion` distribution. This obviously changes the interpretation of the model coefficients. Would such a transformation make sense for your use case?
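A minimal sketch of that idea, adapted from the second model above (the name `eta` is illustrative, and this is not a drop-in replacement):

```stan
parameters {
  vector[N_Grp] alpha;  // unconstrained varying intercepts
  real a;
  real<lower=0> kappa;
}
model {
  vector[N] eta;  // unconstrained linear predictor
  for (n in 1:N) {
    eta[n] = alpha[Grp[n]] + (sum(row(X1, n) .* row(X2, n)) / X3[n]) * a;
  }
  // map to (0, 1) only at the very end
  prop ~ beta_proportion(inv_logit(eta), kappa);
}
```

This also removes the need for the out-of-range exceptions in `generated quantities`, since `inv_logit(eta)` always lies in (0, 1).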

Best of luck with your model!

Thank you, @martinmodrak!

I made some changes and moved to a non-centered parameterization, which helped a lot.

``````stan
data {
int<lower=1> N;  // total number of observations
vector[N] y;  // response variable
int<lower=1> N_Grp;  // number of groups
int<lower=1> Grp[N];  // Group indicator
int<lower=1> N_Times; // Number of times in time series
matrix[N, N_Times] X1;
matrix[N, N_Times] X2;
vector[N] X3;
}
parameters {
real beta;  // population-level effect
real<lower=0> kappa;  // precision parameter
real<lower=0> sd_grp;  // Group standard deviation
vector[N_Grp] mu_grp;  // standardized group means
}
transformed parameters {
vector[N_Grp] grp_int;  // Group intercepts
grp_int = sd_grp * mu_grp; // non-centered: scale the standardized group means
}
model {
vector[N] mu;
for (n in 1:N) {
mu[n] = grp_int[Grp[n]] + ((sum(row(X1, n) .* row(X2, n))  / X3[n]) * beta);
}
// likelihood
y ~ beta(inv_logit(mu) * kappa, (1 - inv_logit(mu)) * kappa);
// priors
beta ~ normal(10,10);
kappa ~ gamma(0.01, 0.01);
sd_grp ~ student_t(3, 0, 2.5);
mu_grp ~ std_normal();
}
generated quantities {
vector[N] log_lik;
vector[N] y_pred;
vector[N] mu_pred;
for (n in 1:N) {
mu_pred[n] = grp_int[Grp[n]] + ((sum(row(X1, n) .* row(X2, n))  / X3[n]) * beta);
}
for (n in 1:N) {
y_pred[n] = beta_rng(inv_logit(mu_pred[n]) * kappa, (1 - inv_logit(mu_pred[n])) * kappa);
log_lik[n] = beta_lpdf(y[n] | inv_logit(mu_pred[n]) * kappa, (1 - inv_logit(mu_pred[n])) * kappa);
}
}

``````
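As an aside, since `beta_proportion(mu, kappa)` is by definition `beta(mu * kappa, (1 - mu) * kappa)`, the likelihood and RNG lines above could equivalently be written with `beta_proportion` directly:

```stan
// equivalent to the beta(...) forms used above
y ~ beta_proportion(inv_logit(mu), kappa);
y_pred[n] = beta_proportion_rng(inv_logit(mu_pred[n]), kappa);
```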