I’m modeling a setting where critics assign scores to films. There are three critics and N films, and each critic assigns a score in [0, 1] to each film. Film n has true score U^{(n)}, and critic i has idiosyncrasy \nu_i. My model is R_i^{(n)} = U^{(n)} + \nu_i + \epsilon, where \epsilon is noise. I’m currently simulating data based on this model and trying to recover the ground truth distributions for my parameters.
My Stan model is:
data {
int<lower=0> N;
vector[N] R_0;
vector[N] R_1;
vector[N] R_2;
}
parameters {
vector<lower=0, upper=1>[N] U;
real nu_0;
real nu_1;
real nu_2;
}
model {
// Specify priors
for (n in 1:N) {
U[n] ~ uniform(0, 1);
}
nu_0 ~ normal(0, 1);
nu_1 ~ normal(0, 1);
nu_2 ~ normal(0, 1);
// Specify model
for (n in 1:N) {
R_0[n] ~ normal(U[n] + nu_0, 1e-6);
R_1[n] ~ normal(U[n] + nu_1, 1e-6);
R_2[n] ~ normal(U[n] + nu_2, 1e-6);
}
}
and the Python code for generating the simulated data is:
n = 1000
nu_means = [-0.2, 0, 0.2]
data = {'N': n}
u = rng.uniform(0, 1, n)
for critic_idx, nu_mean in enumerate(nu_means):
nu = rng.normal(nu_mean, 0.1, n)
epsilon = rng.normal(0, 1e-6, n)
data[f'R_{critic_idx}'] = u + nu + epsilon
The posterior means of the \nu variables appear to match the ground truth, but Rhat is greater than 3 for every parameter. I’ve tried adjusting iter, max_treedepth, and adapt_delta based on recommendations given in the output, but I’m not sure what to do next.