Bulk/Tail ESS Warnings in Bradley-Terry Model

rja · November 11, 2021, 11:23pm

I am trying to estimate individual abilities from the results of head-to-head comparisons similar to Bob Carpenter and Andrew Gelman’s examples. I’m consistently getting Bulk ESS and Tail ESS errors on the ability parameters and was curious if the board had any suggestions. I’ve tried both centered and non-centered parameterizations as well as adding more iterations with no luck.

The stan code is below.

data {
  int<lower = 1> N; //number of observations
  int<lower = 1> I; //number of individuals
  int<lower = 0, upper = 1> y[N]; //
  int<lower = 1, upper = I> id_1[N];
  int<lower = 1, upper = I> id_2[N];
}
parameters {
  real<lower = 0> sigma;
  vector[I] alpha;
}
model{
  alpha ~ normal(0, sigma);
  sigma ~ normal(4, 2);
  y ~ bernoulli_logit(alpha[id_1] - alpha[id_2]);
}

And here is code to simulate data and fit the model

library(dplyr)
library(rstan)

n <- 100
sigma <- 4

id <- 1:n

ability <- rnorm(n, 0, sigma)

ability_df <- data.frame(id, ability)

matchup_grid <- expand.grid(id_1 = id, id_2 = id)
matchup_grid <- matchup_grid[matchup_grid$id_1 > matchup_grid$id_2,]

model_dat <- matchup_grid %>%
  inner_join(ability_df, by = c('id_1' = 'id')) %>%
  rename(ability_1 = ability) %>%
  inner_join(ability_df, by = c('id_2' = 'id')) %>%
  rename(ability_2 = ability)

stan_dat <- list(N = nrow(model_dat),
                 I = n,
                 y = model_dat$id_1_preferred,
                 id_1 = model_dat$id_1,
                 id_2 = model_dat$id_2)

model_dat$id_1_preferred <- rbinom(nrow(model_dat), 1, arm::invlogit(model_dat$ability_1 - model_dat$ability_2))

mod_sim_cen <- stan_model(model_code = stan_code_cen)
fit_sim_cen <- sampling(mod_sim_cen, stan_dat, iter = 2000, chains = 4, cores = 4)

andre.pfeuffer · November 12, 2021, 5:58am

Is there a reason, why do you specify?

Try sigma ~ normal(0, 2);

You may also use:

{
  vector[I] alpha_n = alpha * sigma;
  alpha ~ std_normal();
  sigma ~ normal(0, 2);
  y ~ bernoulli_logit(alpha_n[id_1] - alpha_n[id_2]);
}

maw · December 8, 2021, 4:59pm

Using Andre’s reparameterization I am still getting the bulk ess and tail ess issues.

The bizarre behavior I found is that if you change the line that gets the matchup grid to sample only 25% of the data:

matchup_grid <- matchup_grid[matchup_grid$id_1 > matchup_grid$id_2,] %>% sample_frac(.25)

It gets rid of the ess issues. Why would removing a significant portion of sample size help the model?

andre.pfeuffer · December 9, 2021, 4:09am

data {
  int<lower = 1> N; //number of observations
  int<lower = 1> I; //number of individuals
  int<lower = 0, upper = 1> y[N]; //
  int<lower = 1, upper = I> id_1[N];
  int<lower = 1, upper = I> id_2[N];
}
parameters {
  real<lower = 0> sigma;
  vector[I] alpha;
  real intercept;
}
model{
  alpha ~ normal(0, sigma);
  sigma ~ lognormal(0, 0.5);
  intercept ~ std_normal();
  y ~ bernoulli_logit(alpha[id_1] - alpha[id_2] + intercept);
}

> fit$cmdstan_diagnose()
Processing csv files: /tmp/RtmpB1EblK/bradleyterry-202112091102-1-737976.csv, /tmp/RtmpB1EblK/bradleyterry-202112091102-2-737976.csv

Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.

Checking sampler transitions for divergences.
No divergent transitions found.

Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.

Effective sample size satisfactory.

Split R-hat values satisfactory all parameters.

Processing complete, no problems detected.

Changed to sigma ~ normal(0, 2) to lognormal(0, 0.5); to keep sigma away from zero.
Added intercept parameter.

maw · December 9, 2021, 4:40pm

Interesting…I’m still getting bulk ess issues.

As for the intercept, why would that help sample better? How would that parameter be interpreted in the model.

Are you using cmdstan instead of RStan?

andre.pfeuffer · December 10, 2021, 5:21am

I’m using cmdstanr/cmdstan 2.28.2.
Using rstan: If you increase the number of iterations , the warnings disappear.
The intercept terms helps to have the overall alpha's too be centered around zero.

maw · December 13, 2021, 4:23pm

Any thoughts on why rstan vs cmdstan would be different?

In the model the alphas represent the latent talent between the two players. What would the intercept mean as an adjustement to the difference in talent level? especially if we are centering alpha around zero?

andre.pfeuffer · December 14, 2021, 3:18am

cmdstanr is a more basic interface than rstan. These are diagnostic warnings based on some measurements. I am certain you can do those checks in cmdstanr.

It can be interpreted as home advantage.
But there is more reason for adding an intercept:
when-is-it-ok-to-remove-the-intercept-in-a-linear-regression-model

Topic		Replies	Views
Question about bulk and tail ESS General	5	477	October 8, 2020
Low Bulk and Tail ESS for Intercept Only Modeling fitting-issues , specification , brms	5	910	December 31, 2020
How to address warnings about low ESS and low BFMI? Modeling	35	1173	February 14, 2024
Low ESS and High RHat for Random Intercept & Slope Simulation (rstan and rstanarm) Modeling	8	2784	August 2, 2019
Low effective sample size after running Bayesian cognitive model in Stan Modeling rstan , fitting-issues	8	801	August 18, 2021

Bulk/Tail ESS Warnings in Bradley-Terry Model

Related topics