I am modelling the number of successes in two consecutive trials of size N_1 and N_2, respectively, using the following model:
n_i \mid \lambda_i, N_i \sim Bin(N_i, \lambda_i), for i=1,2.
This treats the number of successes as independent but I expect the success proportions to be strongly positively correlated between the trials. One way to introduce correlation is to use a hierarchical model. I.e., by letting
\lambda_i \mid a,b \sim Beta(a,b)
and then specifying some hyperpriors. I specify
a ~ gamma(0.1,0.1);
b ~ gamma(5,5);
I ran some simulations, sampling (a,b) and then sampling (\lambda_1, \lambda_2) for each (a,b) and found that
cor(\lambda_1, \lambda_2) \approx 0.8 using the above hyperpriors.
I ran the above model and found that cor(\lambda_1, \lambda_2) = 0.0380693 using the posterior estimates of (\lambda_1, \lambda_2). The Stan code is included below. Should I expect the correlation from the posterior estimates of (\lambda_1, \lambda_2) to be around 0.8?
data {
int N_1;
int N_2;
int n_1;
int n_2;
int M;
}
parameters{
real<lower=0, upper=1> lambda[M];
real<lower=0> a;
real<lower=0> b;
}
model{
target+= binomial_lpmf(n_1| N_1, lambda[1]) + binomial_lpmf(n_2| N_2, lambda[2]);
//priors
lambda ~ beta(a,b);
a ~ gamma(0.1,0.1);
b ~ gamma(5,5);
}
Since this model did not appear to work I specified the following priors:
\bigg(log(\lambda_1/(1-\lambda_1)), log(\lambda_2/(1-\lambda_2))\bigg) \sim N((\mu,\mu), \Sigma),
where \Sigma = \begin{pmatrix} \sigma^2 & \sigma^2 \rho \\ \sigma^2 \rho & \sigma^2\end{pmatrix}
with suitable hyperpriors
mu ~ normal(2,1);
sigma ~ normal(0.75,0.25);
rho ~ normal(0.85,0.05);
This model captures everything I want in the context of the problem.
This model gives divergent transitions on the real data and simulated data. I then tried inputting \mu and \sigma as known and only estimating \rho but this also gave divergent transitions.
I wonder if there is a problem with my code or if it is because I have a hierarchical model with only two groups (one proportion for each group) and the model is struggling to identify anything. I have included the code below for the second model.
data {
int N_1;
int N_2;
int n_1;
int n_2;
int M;
//real mu;
//real<lower = 0> sigma;
}
parameters{
vector[M] lambda_raw;
real mu;
real<lower=0> sigma;
real<lower=0, upper=1> rho;
}
transformed parameters{
vector[M] lambda;
lambda[1] = 1/(1+exp(-lambda_raw[1]));
lambda[2] = 1/(1+exp(-lambda_raw[2]));
}
model{
matrix[2,2] Sigma;
Sigma[1,1] = sigma^2;
Sigma[2,2] = sigma^2;
Sigma[1,2] = rho*sigma^2;
Sigma[2,1] = rho*sigma^2;
row_vector[2] Mu;
Mu[1] = mu;
Mu[2] = mu;
target+= binomial_lpmf(n_1| N_1, lambda[1]) + binomial_lpmf(n_2| N_2, lambda[2]);
//priors
lambda_raw ~ multi_normal(Mu,Sigma);
mu ~ normal(2,1);
sigma ~ normal(0.75,0.25);
rho ~ normal(0.85,0.05);
}
Does anyone have any ideas on:
(1) if I should expect the correlation from the posterior estimates of (\lambda_1, \lambda_2) of the first model to be around 0.8?
(2) why the second model gives divergent transitions and if I can hope for it to work?
(3) Another way to model correlated proportions?