For the past couple of months I have been investigating what is causing college football attendance to drop. I am trying to fit a hierarchical model in Stan, and I get the error shown below when I run it. I am new to Stan and have been trying to debug it without luck.
Here is the model that I have built:
// Index values, observations and covariates
data {
  int<lower = 1> N; // Number of games
  int<lower = 1> K; // Number of teams
  int<lower = 1> I; // Number of team covariates
  real fill_rate[N]; // Array of game observations
  int<lower = 1, upper = K> g[N]; // Array of team assignments
  int team[N]; // Array of team covariates (used as a column index into alpha)
  vector[N] wins; // Vector of current wins covariates
  real gamma_mean; // Mean for the hyperprior on gamma.
  real<lower = 0> gamma_var; // Scale for the hyperprior on gamma (normal() treats it as a standard deviation).
  real<lower = 0> tau_min; // Minimum for the hyperprior on tau.
  real<lower = 0> tau_max; // Maximum for the hyperprior on tau.
  real<lower = 0> sigma_min; // Minimum for the prior on sigma.
  real<lower = 0> sigma_max; // Maximum for the prior on sigma.
}
parameters {
  matrix[K, (I - 1)] alpha; // Football game level coefficients
  vector[K] beta; // Vector of observation-level wins coefficients
  real gamma; // Mean of the population model
  real<lower=0> tau; // Scale of the population model (passed to normal() as a standard deviation)
  real<lower=0> sigma; // Scale of the observation model (passed to normal() as a standard deviation)
}
model {
  vector[N] mu; // Linear predictor for each game
  // Hyperpriors and prior.
  gamma ~ normal(gamma_mean, gamma_var);
  tau ~ uniform(tau_min, tau_max);
  sigma ~ uniform(sigma_min, sigma_max);
  // Population model and likelihood.
  for (k in 1:K) {
    alpha[k,] ~ normal(gamma, tau);
    beta[k] ~ normal(gamma, tau);
  }
  for (n in 1:N) {
    mu[n] = alpha[g[n], team[n]] + beta[g[n]] * wins[n];
  }
  fill_rate ~ normal(mu, sigma);
}
// Generate predictions using the posterior.
generated quantities {
  vector[N] mu_pc; // Declare mu for predicted linear model.
  real fill_rate_pc[N]; // Array of predicted observations.
  // Generate posterior prediction distribution.
  for (n in 1:N) {
    mu_pc[n] = alpha[g[n], team[n]] + beta[g[n]] * wins[n];
    fill_rate_pc[n] = normal_rng(mu_pc[n], sigma);
  }
}
I fit the model above with the following R code:
# Specify data.
data <- list(
  N = nrow(CFB), # Number of observations.
  K = max(CFB$Team_index), # Number of groups.
  I = max(CFB$Team_index) + 1, # Number of observation-level covariates.
  fill_rate = CFB$`Fill Rate`, # Vector of observations.
  g = CFB$Team_index, # Vector of group assignments.
  team = CFB$Team_index, # Vector of team covariates.
  wins = CFB$`Current Wins`, # Vector of wins covariates.
  gamma_mean = -.03, # Mean for the hyperprior on gamma.
  gamma_var = .06, # Scale for the hyperprior on gamma (used as a standard deviation by normal()).
  tau_min = 0, # Minimum for the hyperprior on tau.
  tau_max = .00875, # Maximum for the hyperprior on tau.
  sigma_min = 0, # Minimum for the prior on sigma.
  sigma_max = .00875 # Maximum for the prior on sigma.
)
# Calibrate the model.
model03 <- stan(
  file = here::here("Projects", "Code", "StanModel.stan"),
  data = data,
  control = list(adapt_delta = 0.99),
  chains = 1,
  seed = 42
)
When I try to run the code I get the following error:
Chain 1: Rejecting initial value:
Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
Chain 1: Stan can't start sampling from this initial value.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"
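One thing that may be worth checking is whether any initial value can land inside the support of the uniform priors at all. The sketch below is only a rough check in base R, not a confirmed diagnosis. It assumes, as the message says, that Stan draws each unconstrained initial value uniformly from (-2, 2), and that a parameter declared with only <lower = 0> is mapped to the constrained scale through exp(). The names u, tau_init, in_support and log_dens are illustrative and do not appear in the model.

```r
# Rough check: can an initial value for tau fall inside uniform(0, tau_max)?
# Assumption: unconstrained inits are uniform on (-2, 2) and a <lower = 0>
# parameter is obtained as exp(u).
set.seed(42)

tau_max <- 0.00875                    # upper limit passed in the data list

u <- runif(100, min = -2, max = 2)    # 100 attempts, as in the error message
tau_init <- exp(u)                    # implied initial values for tau

# Smallest and largest initial values; exp(-2) is about 0.135
range(tau_init)

# Count the attempts that fall inside the prior's support
in_support <- tau_init >= 0 & tau_init <= tau_max
sum(in_support)                       # 0

# Log density of uniform(0, tau_max) at each initial value
log_dens <- dunif(tau_init, min = 0, max = tau_max, log = TRUE)
all(log_dens == -Inf)                 # TRUE
```

Under those assumptions, every attempt gives tau (and likewise sigma) a value of at least exp(-2), which is above 0.00875, so the uniform prior has zero density and the log probability is negative infinity. If that is the cause, it would match the message's hint about reducing the ranges of constrained values: declaring the bounds on the parameters themselves, for example real<lower = tau_min, upper = tau_max> tau; in the parameters block, should keep the initial values inside the prior's support. I have not verified that this is the only problem with the model.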