 # Beta_binomial fitting issues

Hi all,

I am having issues using beta_binomial. I started with trying to fit m12.1 in Statistical Rethinking (beta_binomial fit to classic UC Berkeley admissions data). The ulam generated stancode follows.

``````data{
int N;
int A;
int gid;
}
parameters{
vector a;
real<lower=0> phi;
}
transformed parameters{
real theta;
theta = phi + 2;
}
model{
vector pbar;
phi ~ exponential( 1 );
a ~ normal( 0 , 1.5 );
for ( i in 1:12 ) {
pbar[i] = a[gid[i]];
pbar[i] = inv_logit(pbar[i]);
}
A ~ beta_binomial( N , pbar*theta , (1-pbar)*theta );
}

``````

Warnings such as maximum treedepth, large R-hat, low ESS etc. are seen. I tried adjusting the priors with no luck.

Separating the beta_binomial into hierarchical pbar ~ beta(alpha,beta) and admits ~ binomial(apps, pbar) worked (code below). Moreover, the posterior samples (of a, phi, theta) had very similar summary statistics to the ones in Statistical Rethinking (R code 12.3 on pg. 372).

``````data{
int N;
int apps[N];
int gid[N];
}

parameters{
vector a;
real<lower=0> phi;
vector<lower=0, upper=1>[N] pbar;

}

transformed parameters{
vector[N] mu;
vector[N] alpha;
vector[N] beta;
real<lower=2> theta;

theta = phi + 2;
for (i in 1:N){
mu[i] = inv_logit(a[gid[i]]);
}

alpha = mu * theta;
beta = (1-mu) * theta;

}

model{
pbar ~ beta(alpha, beta);

a ~ normal(0, 1.5);
phi ~ exponential(1);

}

``````

Should the separation into beta then binomial not be equivalent to using beta_binomial in one step? (some naming and other slight changes to practice my stan coding…). I also tried more or less exactly the above code but recombining into beta_binomial, which makes the beforeseen errors resurface.

``````data{
int N;
int apps[N];
int gid[N];
}

parameters{
vector a;
real<lower=0> phi;
vector<lower=0, upper=1>[N] pbar;

}

transformed parameters{
vector[N] mu;
vector[N] alpha;
vector[N] beta;
real<lower=2> theta;

theta = phi + 2;
for (i in 1:N){
mu[i] = inv_logit(a[gid[i]]);
}

alpha = mu * theta;
beta = (1-mu) * theta;

}

model{

a ~ normal(0, 1.5);
phi ~ exponential(1);

}

``````

Would appreciate any illumination into how to use beta_binomial better, or if my assumption of the equivalent model specifications is wrong. Thank you!

On the last model, if you’ve integrated out the betas again you don’t need `pbar` to be defined as a parameter. See if that helps. I agree it is weird the first model didn’t work, but I am curious to know if `pbar` is the problem with the third model first.

Ah yes I think that was an artifact of starting the second stan file via copy paste… just ran the following and the same issues persist

``````data{
int N;
int apps[N];
int gid[N];
}

parameters{
vector a;
real<lower=0> phi;

}

transformed parameters{
vector[N] mu;
vector[N] alpha;
vector[N] beta;
real<lower=2> theta;

theta = phi + 2;
for (i in 1:N){
mu[i] = inv_logit(a[gid[i]]);
}

alpha = mu * theta;
beta = (1-mu) * theta;

}

model{

a ~ normal(0, 1.5);
phi ~ exponential(1);

}
``````

Can you post the data for this here or send me a link? This is strange. It seems like since this was the published example model it would work well and usually integrating out variables is good.

The .csv is:

“A”,“male”,512,313,825
“A”,“female”,89,19,108
“B”,“male”,353,207,560
“B”,“female”,17,8,25
“C”,“male”,120,205,325
“C”,“female”,202,391,593
“D”,“male”,138,279,417
“D”,“female”,131,244,375
“E”,“male”,53,138,191
“E”,“female”,94,299,393
“F”,“male”,22,351,373
“F”,“female”,24,317,341

Using the following R code to fit:

library(rstan)
options(mc.cores = parallel::detectCores())
d\$gid <- ifelse( d\$applicant.gender==“male” , 1L , 2L )
fit = stan(‘beta_bin2.stan’, data=data_list, chains=4, iter=1000)

1 Like

There must be a problem with the beta_binomial in the Math library in the current Rstan (@bgoodri, so you’re aware). I get lots of treedepth exceeded warnings in 2.21.2.

I wasn’t able to find record of this issue anywhere, but in cmdstanr (which uses a later version of the Math library) this model runs fine. Can you use `cmdstanr` for now: https://mc-stan.org/cmdstanr/reference/index.html ?

The interface is a bit different:

``````library(cmdstanr)
model = cmdstan_model("beta_bin2.stan")
fit = model\$sample(data = data_list, num_chains = 4, parallel_chains = 4)
``````

great! thanks for looking into this. I will definitely try out cmdstanr

1 Like