QR decomposition of parameters for multilevel regression?

bbbales2 · October 20, 2020, 1:59pm

Yeah if the only difference between your two models is one is non-centered and one is centered, and the centered one works better with big data and the non-centered with small data, this happens.

There’s two different problems that happen. Here’s code to plot the problem and examine it on the eight schools dataset:

eight_schools_centered.stan (334 Bytes) eight_schools_noncentered.stan (357 Bytes)

library(rstan)

data_low = list(J = 8,
            y = c(28,  8, -3,  7, -1,  1, 18, 12),
            sigma = c(15, 10, 16, 11,  9, 11, 10, 18))

# really small measurement errors (small sigma) is like having lots of data
data_high = list(J = 8,
            y = c(28,  8, -3,  7, -1,  1, 18, 12),
            sigma = 0.01 * c(15, 10, 16, 11,  9, 11, 10, 18))


cmod = stan_model("eight_schools_centered.stan")
ncmod = stan_model("eight_schools_noncentered.stan")

fit_clow = sampling(cmod, data_low) # centered with low data, divergences
fit_chigh = sampling(cmod, data_high) # centered with lots of data, good

fit_nclow = sampling(ncmod, data_low) # non-centered low data, good
fit_nchigh = sampling(ncmod, data_high) # non-centered high data, treedepth problems

# plot log_tau instead of tau because log_tau is what the sampler sees
pairs(fit_clow, pars = c("theta[1]", "theta[3]", "mu", "log_tau"))
pairs(fit_chigh, pars = c("theta[1]", "theta[3]", "mu", "log_tau"))

# In the non-centered case the sampler works in z not in theta
pairs(fit_nclow, pars = c("z[1]", "z[3]", "mu", "log_tau"))
pairs(fit_nchigh, pars = c("z[1]", "z[3]", "mu", "log_tau"))