Hello, I’m new to rstan but i’ve used rstanarm before. I want to move to specifying the models in rstan format and I just want to make sure I’m on the right track here. I know i can get the stan code out of rstanarm but i feel like that confuses me for reason. Anyhow, what i want to do in rstanarm would look like this:
model ← stan_glmer(y ~ 1 + (1|age) + (1|regions) + sex
Sex is not a covariate, its a 2 factor fixed effect. Since its only 2 factors I’m not putting it as multilevel.
This is what I’ve done in rstan and its running fine but i want to make sure it is actually matching the rstanarm specification above.
data{
int<lower =0> N;
int<lower =1, upper=8> age[N]; //8 age groups
int<lower =1, upper= 12> regions[N]; //12 regions
int<lower =1, upper =2> sex[N];
vector[N] y;
}
parameters{
real alpha; //intercept
real<lower =0> sigma_a; // variance for ages
real<lower =0> sigma_r; // variance for regions
real<lower=0> epsilon; //variance for sex
real<lower =0> sigma_y; // model residual variance
vector<multiplier = sigma_a[8]> alpha_a; //vector for age intercepts
vector<multiplier = sigma_r[12]> alpha_r; //vector for regions intercepts
real mu_a; //mean of ages
real mu_r; //mean of regions
}
model {
vector[N] mu; //conditional mean
mu = alpha + alpha_a[age] + alpha_r[regions] + [epsilon,-epsilon][sex]';
//hyperpriors
mu_a ~ normal(0,1);
mu_r ~ normal(0,1);
sigma_a ~ normal(0,1);
sigma_r ~ normal(0,1);
sigma_y ~ normal(0,1);
// 2nd level likelihood
alpha_a ~ normal(mu_a, sigma_a);
alpha_r ~ normal(mu_r, sigma_r);
//level 1 likelihood
y ~ normal(mu, sigma_y)
}
Additional question, if i want to then do rstanarm in the following format:
model ← stan_glmer(y ~ 1 + (1|age) + (1|regions) + sex + (1|age:regions)
Would the rstan code for this model spec correspond to the following?
data{
int<lower =0> N;
int<lower =1, upper=8> age[N]; //8 age groups
int<lower =1, upper= 12> regions[N]; //12 regions
int<lower =1, upper =2> sex[N];
vector[N] y;
}
parameters{
real alpha; //intercept
real<lower =0> sigma_a; // variance for ages
real<lower =0> sigma_r; // variance for regions
real<lower=0> epsilon; //variance for sex
real<lower =0> sigma_y; // model residual variance
vector<multiplier = sigma_a[8]> alpha_a; //vector for age intercepts
vector<multiplier = sigma_r[12]> alpha_r; //vector for regions intercepts
real<lower=0> sigma_ar; //variance for age region intercepts
vector<multiplier = sigma_ar[96]> alpha_ar; //vectors for 8 age times 12 regions
real mu_a; //mean of ages
real mu_r; //mean of regions
real mu_ar; //mean for the interaction
}
model {
vector[N] mu; //conditional mean
mu = alpha + alpha_a[age] + alpha_r[regions] + [epsilon,-epsilon][sex]' + alpha_ar[age * regions];
//hyperpriors
mu_a ~ normal(0,1);
mu_r ~ normal(0,1);
mu_ar ~ normal(0,1);
sigma_a ~ normal(0,1);
sigma_r ~ normal(0,1);
sigma_y ~ normal(0,1);
sigma_ar ~ normal(0,1);
// 2nd level likelihood
alpha_a ~ normal(mu_a, sigma_a);
alpha_r ~ normal(mu_r, sigma_r);
alpha_ar ~ normal(mu_ar, sigma_ar);
//level 1 likelihood
y ~ normal(mu, sigma_y)
}
Finally, I’m not sure how to add a sex interaction to the above. So if i want to run the following in rstanarm:
model ← stan_glmer(y ~ 1 + (1|age) + (1|regions) + sex + (1|age:regions) + (1|regions:sex) + (1|age:sex)
How would i add the sex by regions and sex by age interactions following the epsilon controls? I took the epsilon from the rstan forums where it covers a multilevel regression MRP with sex being 2 factors, so in the code there it used the [epsilon, -epsilon][sex]’ but I’m not sure how do add this in the above model specs.
Hope the questions are ok! I’d like to move away from rstanarm so i can have a better control over priors specifications so I hope my understanding is on the right track. Any feedback, support and explanations would be appreciated! Thank you all!!