Hi there.
I have a four different models and would like to compare them.
I am working with Hierarchical Linear Models, so I have:
- Pooled Model
- Unpooled Model
as reference models (which should be outperformed).
and then
- Hierarchical Model with 1 Layer
- Hierarchical Model with 2 Layers
as models that should “probably” be better.
I have calculated the WAIC of the first 3 models and they’re exactly the same.
Now, I am not sure whether I did smth wrong or the WAIC is just the wrong measure to compare models. Basically I want to show with a measure (RMSE, IC, etc.) that model 3 and 4 are better but I don’t know what the industry standard is for Bayesian data analysis.
The 2nd question is: I have the proper Log-Likelihood for the first 3 stan models but don’t know how to properly right it out for the 4th one. I’d be happy if someone can help me out.
3rd question: I would like to perform out of sample predictions with LOO or even leaving out entire counties to predict them but this seems very tidious.
For once because I don’t know how to do it using the Stan model but also because there are certain counties that only have 1 obs. Leaving them out would mean that an entire county is out and would kinda destroy the purpose of what I wanted to do. Is there a possibility to limit which observations are chosen for the LOO?
Here’s the minimal R code:
## Load Data
data("radon", package = "rstanarm")
data <- radon
# County Categorical Variable
county_name <- as.vector(data$county) # create vector of county indications for each obs
county_list <- unique(county_name) # create list of counties, drop duplicates
J <- length(county_list) # number counties in Minessota
county <- rep (NA, J)
for (i in 1:J){
county[county_name==county_list[i]] <- i
}
# Seperate variables
logradon <- data$log_radon
loguranium <- data$log_uranium
loguranium <- loguranium[!duplicated(loguranium)]
floorind <- data$floor
N <- length(logradon)
county_i <- county
county_j <- 1:85
# Lists for STAN
cplist <- list(Y = logradon,
X = floorind,
N = N)
unplist <- list(Y = logradon,
X = floorind,
N = N,
J = J,
county = county)
datalist <- list(Y = logradon,
X = floorind,
U = loguranium,
N = N,
J = J,
county_i = county_i,
county_j = county_j)
Here’s the working Stan code for the 1st HLM:
// HLM 1
data {
int<lower=1> N; // number of obs
int<lower=1> J; // Number of counties
vector[N] Y; // outcome data (log radon)
vector[N] X; // Inputa data (floor)
int<lower=1> county[N]; //county ID variable
}
parameters {
vector[J] alpha; // county j intercept
real beta;
real mu_alpha; // prior on alpha
real <lower=0> sigma_alpha; // hyperparameter sigma of alpha
real<lower=0> sigma_y;// prior sigma of y in county j
}
model { // Model specifications
vector[N] mu_y;
// Full Model
for(i in 1:N)
mu_y[i] = alpha[county[i]] + beta * X[i];
// Priors and Hyperparams
Y ~ normal(mu_y, sigma_y);
sigma_y ~ cauchy (0, 5);
alpha ~ normal(mu_alpha, sigma_alpha);
mu_alpha ~ normal(0, 10);
sigma_alpha ~ cauchy(0, 5);
beta ~ normal(0, 10);
}
generated quantities {
vector[N] log_lik;
for (i in 1:N) log_lik[i] = normal_lpdf(Y[i]| alpha[county[i]] + beta * X[i], sigma_y);
}
Here’s the 2nd Stan code with not yet correct Log Likelihood:
(The model sigma is a complex term which is composed of sigma_y^2 and sigma_alpha^2.
How can I write that down?)
// HLM 2
data {
int<lower=1> N; // number of obs
int<lower=1> J; // Number of counties
vector[N] Y; // outcome data (log radon)
vector[N] X; // Inputa data Layer 1(room indication)
vector[J] U; // Input data Layer 2 (uranium)
int<lower=1> county_i[N]; //county ID variable for x
int<lower=1> county_j[J]; //county ID variable for u
}
parameters {
real beta; // fixed slope
vector[J] alpha; // county j varying intercept
real gamma_zero; // 2nd level fixed intercept
real gamma_one; // 2nd level fixed slope
real<lower=0> sigma_alpha; // prior on noise of 2nd layer model
real<lower=0> sigma_y;// prior on noise of 1st layer model
}
model { // Model specifications
vector[N] mu_y;
// 2nd Layer
for(j in 1:J)
alpha[county_j[j]] ~ normal(gamma_zero + gamma_one * U[county_j[j]], sigma_alpha);
// 1st Layer
for(i in 1:N)
mu_y[i] = alpha[county_i[i]] + beta * X[i];
// Priors and Hyperparams
Y ~ normal(mu_y, sigma_y);
sigma_y ~ cauchy (0, 2.5);
sigma_alpha ~ cauchy(0,2.5);
beta ~ normal(0, 5);
gamma_zero ~ normal(0, 5);
gamma_one ~ normal(0, 5);
}
generated quantities {
vector[N] log_lik;
for (i in 1:N)
for (j in 1:J)
log_lik[i] = normal_lpdf(Y[i]| gamma_zero + gamma_one * U[county_j[j]] + beta * X[i], sigma_y);
}