Hi, I am using rstan and running into an Initialization failure that I cannot understand. I am trying to fit a model to some grouped count data that depend on age.
The data in R is the following
meanAge <- c(4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 90.0,
38.5, 64.5, 74.5, 85.0, 10.0, 24.5, 34.5, 44.5, 54.5, 64.5,
74.5, 85.0 , 4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5,
84.5, 90.0 , 4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5,
85.0, 2.0, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5)
studyNum <- c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 2, 2, 2, 2, 5, 5, 5, 5, 5, 5, 5,
5, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6,
1, 1, 1, 1, 1, 1, 1, 1, 1)
Deaths <- c(0, 0, 4, 8, 22, 53, 84, 145, 170, 67, 134, 241, 572,
1280, 5, 24, 89, 208, 632, 1542, 2844, 4857, 0, 0,
0, 0, 0, 149, 504, 1656, 2698, 1162, 2, 6, 33, 73,
277, 922, 2515, 6436, 17474, 0, 0, 1, 5, 12, 24, 67, 225, 514)
Hosp <- c(26, 8, 97, 211, 352, 515 , 533, 451, 313, 128, 2896, 1621,
2158, 3346, 789, 2149, 4660, 6303, 9724, 12236, 13543, 15370, 54, 39,
180, 351, 862, 2072, 2803, 4037, 3816, 1314, 313, 339, 1911, 4722,
10353, 16553, 19712, 23243, 29517, 24, 18, 77, 198, 274, 451, 499,
587, 746)
letalityData <- list(N=length(studyNum), K=length(unique(studyNum)),
group=studyNum, ageVec=meanAge, successes=Deaths,
totalCount=Hosp, propVec=Deaths/Hosp))
I want to model the number of deaths among hospitalized people as a binomial process, using a generalized linear model where age is the predictor. I also want to allow the intercept and slope to vary among studies.
data {
int<lower=0> N; // number of observations
int<lower=0> K; // number of groups
int<lower=1, upper=K> group[N];
vector[N] ageVec; // predictor
int successes[N];
int totalCount[N];
}
parameters {
vector[K] groupSlope; // vector with the slopes of each study
vector[K] groupIntercept; // vector with intercept of each study
real ageSlope; // mean slope across studies
real intercept; // mean intercept across studies
real<lower=0> ageSlopeSigma;
real<lower=0> interceptSigma;
}
transformed parameters{
vector[N] successProb; // define the probability of death
successProb = Phi(groupIntercept[group] + groupSlope[group] .* ageVec);
}
model {
// Priors
intercept ~ normal(0, 2);
interceptSigma ~ exponential(1);
ageSlope ~ normal(0, 1);
ageSlopeSigma ~ exponential(1);
// Individual studies parameters
groupIntercept ~ normal(intercept, interceptSigma);
groupSlope ~ normal(ageSlope, ageSlopeSigma);
// binomial model
successes ~ binomial(totalCount, successProb);
}
When I try to fit the above model to the data, I get the following error
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
error occurred during calling the sampler; sampling not done
I’ve tried a lot of things to no avail. I think the most informative one is that if I try to fit directly to the proportion of deaths with a normal distribution everything works. For that I just add the propVec variable to the data, add a residualSigma to the parameters, and substitute the last Stan line of the model for:
residualSigma ~ exponential(1);
propVec ~ normal(successProb, residualSigma);
Also, the model works fine when I only allow the intercepts to vary among groups, but it does not work if only the slopes vary among groups. I would greatly appreciate some help, since I’m new to Stan.
Thanks