 # How to deal with a high F-ratio using 'rstanarm'?

Hello,

I noticed that rstanarm::stan_lmer fails to correctly estimate the variance components of a one-way ANOVA model with a random factor when the ratio of the between variance over the within variance is high.

``````mu           <- 10000 # grand mean
sigmaWithin  <- 1e-3
ratio        <- 100
sigmaBetween <- sigmaWithin * ratio
sigmaBetween^2
I            <- 10L # number of groups
J            <- 5L # sample size per group

set.seed(31415926L)
groupmeans <- rnorm(I, mu, sigmaBetween)
y          <- c(
vapply(groupmeans, function(gmean) rnorm(J, gmean, sigmaWithin), numeric(J))
)
dat        <- data.frame(
y = y,
group = gl(I, J)
)
stan <- stan_lmer(
y ~ (1|group), data = dat,
prior_aux = cauchy(0, 5)
)
posterior_interval(stan)
``````

How would you deal in such a situation? I found that increasing the scale parameter in `prior_covariance = decov(......)` yields better results.

This works with independent Cauchy priors on the standard deviations. But how to do with rstanarm?

Works:

``````// Learn more about model development with Stan at:
//
//    http://mc-stan.org/users/interfaces/rstan.html
//    https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started
//

data {
int<lower=1> N;
real y[N];
int<lower=1> I;
int<lower=1> groupID[N];
}

parameters {
real mu;
real alpha[I];
real<lower=0> sigma_error;
real<lower=0> sigma_group;
}

model {
mu ~ normal(0, 100);
sigma_error ~ cauchy(0, 5);
sigma_group ~ cauchy(0, 5);
for(i in 1:I){
alpha[i] ~ normal(0, sigma_group);
}
for(k in 1:N){
y[k] ~ normal(mu + alpha[groupID[k]], sigma_error);
}
}

generated quantities {
real<lower=0> sigma_total;
sigma_total = sqrt(sigma_group^2 + sigma_error^2);
}
``````

That scale parameter has a gamma prior that by default has a shape of 1 and a scale of 1 (i.e. is exponential). So, if you think values above a million are plausible, then you would need to adjust the shape and scale a bit.

Thank you. What is this scale parameter actually? Is it related to the ratio of the variances?

In the `(1 | g)` case, it is the standard deviation of the intercept across levels of `g`.

Ah yes, I get good results with `shape = sigmaBetween` and `scale = 1`. Thanks again.

Better with `shape = sigmaBetween / 10` and `scale = 10`. But the sampling is slower.