Sum-to-zero and sbc

Recently @Bob_Carpenter proposed a sum-to-zero data type new stan data type zero sum vector (and you can see the lively discussion about this type). I wanted to see how the simple sum-to-zero constraint compares to the SVD using the wonderful SBC package by @hyunji.moon and @martinmodrak.

TL;DR

The results are the same but SVD is faster and shows no divergences.

Alpha is estimated fairly well but mu and sigma are not. This is actually expected because we’re not replicating the simulation! The constraint will pull mu toward the K - 1 estimates and the Kth needs to pull the entire sum to 0. The variance of alpha is all contained in the K - 1 estimates, though we see that the reduction in sigma is not too large.

Simple sum-to-zero

Stan: simple_ctr.stan (514 Bytes)
R: simple.R (1.4 KB)

cache_dir <- "./_simple_SBC_cache"
options(mc.cores = parallel::detectCores())
library(SBC)
# Enabling parallel processing via future
library(future)
library(cmdstanr)
plan(multisession)

if(!dir.exists(cache_dir)) {
  dir.create(cache_dir)
}
options(SBC.min_chunk_size = 5)

ilogit <- function(u) 1 / (1 + exp(-u))

simple_generator_single <- function(N){  # N is the number of data points we are generating
  K <- 5
  mu <- 1.3
  sigma <- 2.3
  alpha <- rnorm(K, mu, sigma)
  y <- matrix(NA, K, N)
  for (k in 1:K)
    y[k, ] <- rbinom(N, 1, ilogit(alpha[k]))

  list(
    variables = list(
      mean_alpha = mu,
      sd_alpha = sigma,
      mu = mu,
      sigma = sigma,
      alpha = alpha
    ),
    generated = list(K = K, N = N, y = y)
  )
}

mod_simple <- cmdstan_model("simple_ctr.stan")
set.seed(54882235)
n_sims <- 200  # Number of SBC iterations to run

simple_generator <- SBC_generator_function(simple_generator_single, N = 200)
simple_dataset <- generate_datasets(
  simple_generator, 
  n_sims)

simple_backend <- SBC_backend_cmdstan_sample(
  mod_simple, iter_warmup = 500, iter_sampling = 500, chains = 2, parallel_chains = 2)
results <- compute_SBC(simple_dataset, simple_backend , 
                       cache_mode = "results", 
                       cache_location = file.path(cache_dir, "results"))

 # - 57 (28%) fits had at least one Rhat > 1.01. Largest Rhat was 1.037.
 # - 3 (2%) fits had divergent transitions. Maximum number of divergences was 17.
#  - 6 (3%) fits had some steps rejected. Maximum number of rejections was 2.
# Not all diagnostics are OK.



Using SVD

Stan: hier_ctr_svd.stan (582 Bytes)
R: hier_ctr_svd.R (1.7 KB)

Using the same DGP as in the original post but with fewer groups to speed up computation.

hier_ctr_svd_generator_single <- function(N){  # N is the number of data points we are generating
  K <- 5
  mu <- 1.3
  sigma <- 2.3
  alpha <- rnorm(K, mu, sigma)
  y <- matrix(NA, K, N)
  for (k in 1:K)
    y[k, ] <- rbinom(N, 1, ilogit(alpha[k]))
  Sigma <- contr.sum(K) %*% diag(K - 1) %*% t(contr.sum(K))
  s <- svd((Sigma + t(Sigma))/2) # Guarantee symmetry
  s$d <- abs(zapsmall(s$d))
  
  list(
    variables = list(
      mean_alpha = mu,
      sd_alpha = sigma,
      mu = mu,
      sigma = sigma,
      alpha = alpha
    ),
    generated = list(K = K, N = N, y = y,
                     U = s$u,
                     d_raw = sqrt(s$d))
  )
}

Results



4 Likes