Modelling test scores

Hi all, firstly thank you all for your hard work on Stan and your contribution to Bayesian inference. I’m just getting started myself, and I am trying to understand the constraints of what I can and can’t do.

Suppose we have a situation where a school gives one student ten different tests, each on completely different subjects. Each test gets a percentage score between 0.0 and 1.0. The tests are known to be very hard, so we expect the score distribution across all of these tests to be right-tailed.

In taking obscene liberties following this guide https://vioshyvo.github.io/Bayesian_inference/hierarchical-models.html, I wanted to see if I can simulate the Bayesian posterior by treating each test’s variance as unknown, but with an uninformative half-Cauchy distribution that (under my understanding) basically says “the score will move, but not much”.

My goal is to generate Bayesian MCMC draws from those modeling parameters and for each draw be able to know the simulated test results for test1, test2, test3, etc (i.e. generate a vector of scores that index 1 scores correspond to test 1, idx 2 -> test2 etc). Unfortunately, when I attempt to do this I get heaps of warnings.

Using the toy data / input:

school <- list(J = 10, y = c(0.0, 0.7, 0.25, 0.3, 0.7, 0.4, 0.1, 0.15, 0.3, 0.35))
fit <- stan('schools2.stan', data = school, iter = 1e4, chains = 4, control = list(adapt_delta = 0.9999))

Where schools2.stan is:

data {
  int<lower=0> J;
  real y[J];
}

parameters {
  real mu;
  real<lower=0> tau;
  real<lower=0> dvar[J];
  real<lower=0, upper=1> theta[J];
}

model {
  tau ~ cauchy(0,0.2);
  dvar ~ cauchy(0,0.2);
  theta ~ normal(mu, dvar);
  y ~ normal(theta, tau);
}

> check_hmc_diagnostics(fit)

Divergences:
215 of 20000 iterations ended with a divergence (1.075%).
Try increasing 'adapt_delta' to remove the divergences.

Tree depth:
5219 of 20000 iterations saturated the maximum tree depth of 10 (26.095%).
Try increasing 'max_treedepth' to avoid saturation.

Energy:
E-BFMI indicated no pathological behavior.

If I ditch the horse because I can’t mount the saddle and try to just use something else like brms, I still get heaps (532) of divergent transitions:

d <- read.csv("test_data.csv", header=T)
mod <- brm(scores ~ (1 | tests), d, family=skew_normal())

Where test_data.csv corresponds to:

tests,scores
test1,0.0
test2,0.7
test3,0.25
test4,0.3
test5,0.7
test6,0.4
test7,0.1
test8,0.15
test9,0.3
test10,0.35

Am I doing anything obviously wrong? Thanks in advance.

Hi,
some pointers:

  • I would prefer using brms for this task - but its your call
  • Since your response are in the [0,1] range inclusive, you might want to use the zero_one_inflated_beta response (beta distribution is good to model response in [0,1] exclusive, but it seems you cannot rule out the boundaries).
  • Alternatively it might be preferable to work with the test items directly as the number of correct answers provides more structure than real percentages. IRT models (https://arxiv.org/abs/1905.09501) may be of interest, but the beta-binomial response might work as well.
  • mu has no prior which might be problematic
  • If working in Stan, look-up “non-centered parametrization”.

I’ve changed the name of the thread to more closely reflect the subject (I think little of the original 8 schools problem remains).

Hope that helps!

1 Like

Very grateful for your response Martin, you’ve given me a lot to think about :)

Also just realized that the brms model you’ve given has some (likely) more serious problems:

  • You don’t specify priors, but the default brms priors are often too wide
  • Wide priors might be problematic, because you only have one observation per group, so the data don’t inform the parameters much
  • Also since you use the skew_normal() distribution, you need even more data per group to identify the skew. Once again with the default wide priors, this is probably poorly identified.

Sorry I don’t have time to check any of this is actually happening, but those are IMHO the prime suspects for the divergences.

So I am fiddling around with this too in R and rstan. I am getting something similar to your errors

Divergences:
57 of 4000 iterations ended with a divergence (1.425%).
Try increasing ‘adapt_delta’ to remove the divergences.
Tree depth:
0 of 4000 iterations saturated the maximum tree depth of 14.
Energy:
E-BFMI indicated no pathological behavior…

when I run the model like this

fit ← stan(model_code= model1,
data = school,
chains=4, iter=2000, refresh=0,
seed=12345, cores = 6,
init = “random”,
control=list(stepsize=0.001, adapt_delta=0.9999, max_treedepth = 14)

Just to make sure I understand your parameter block

parameters {
real mu; // location
real<lower=0> tau; // measurement of noise
real<lower=0> dvar[J]; // unknown variance of test scores
real<lower=0, upper=1> theta[J]; // ??
}
EDIT: Figured out theta. Sorry!

So on your stan model I’d recommend loading up the bayesplot library and taking a look at the following plots

posterior_fit ← as.array(fit)
np_fit ← nuts_params(fit)
head(np_fit)

color_scheme_set(“darkgray”)
mcmc_parcoord(posterior_fit, np = np_fit, regex_pars = c(“dvar”))
mcmc_parcoord(posterior_fit, np = np_fit, regex_pars = c(“theta”))
mcmc_parcoord(posterior_fit, np = np_fit, regex_pars = c(“tau”,“theta”))

color_scheme_set(“mix-blue-pink”)
p ← mcmc_trace(posterior_fit, pars = c(“mu”, “tau”), n_warmup = 300,
facet_args = list(nrow = 2, labeller = label_parsed))
p + facet_text(size = 15)

mcmc_pairs(posterior_fit, np = np_fit, pars = c(“mu”,“tau”,“theta[1]”),
off_diag_args = list(size = 0.75))

So I think there is some issues with tau if I am reading these plots correctly?