How to deal with a high F-ratio using 'rstanarm'?

Hello,

I noticed that rstanarm::stan_lmer fails to correctly estimate the variance components of a one-way ANOVA model with a random factor when the ratio of the between variance over the within variance is high.

mu           <- 10000 # grand mean
sigmaWithin  <- 1e-3
ratio        <- 100
sigmaBetween <- sigmaWithin * ratio
sigmaBetween^2
I            <- 10L # number of groups
J            <- 5L # sample size per group

set.seed(31415926L)
groupmeans <- rnorm(I, mu, sigmaBetween)
y          <- c(
  vapply(groupmeans, function(gmean) rnorm(J, gmean, sigmaWithin), numeric(J))
)
dat        <- data.frame(
  y = y,
  group = gl(I, J)
)
stan <- stan_lmer(
  y ~ (1|group), data = dat, 
  prior_aux = cauchy(0, 5)
)
posterior_interval(stan)

How would you deal in such a situation? I found that increasing the scale parameter in prior_covariance = decov(......) yields better results.

This works with independent Cauchy priors on the standard deviations. But how to do with rstanarm?

Works:

// Learn more about model development with Stan at:
//
//    http://mc-stan.org/users/interfaces/rstan.html
//    https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started
//

data {
  int<lower=1> N;
  real y[N];
  int<lower=1> I;
  int<lower=1> groupID[N];
}

parameters {
  real mu;
  real alpha[I];
  real<lower=0> sigma_error;
  real<lower=0> sigma_group;
}

model {
  mu ~ normal(0, 100);
  sigma_error ~ cauchy(0, 5);
  sigma_group ~ cauchy(0, 5);
  for(i in 1:I){
    alpha[i] ~ normal(0, sigma_group);
  }
  for(k in 1:N){
    y[k] ~ normal(mu + alpha[groupID[k]], sigma_error);
  }
}

generated quantities {
  real<lower=0> sigma_total;
  sigma_total = sqrt(sigma_group^2 + sigma_error^2);
}

That scale parameter has a gamma prior that by default has a shape of 1 and a scale of 1 (i.e. is exponential). So, if you think values above a million are plausible, then you would need to adjust the shape and scale a bit.

Thank you. What is this scale parameter actually? Is it related to the ratio of the variances?

In the (1 | g) case, it is the standard deviation of the intercept across levels of g.

Ah yes, I get good results with shape = sigmaBetween and scale = 1. Thanks again.

Better with shape = sigmaBetween / 10 and scale = 10. But the sampling is slower.