Hello fellow stan users!
I am doing a hierarchical analysis of behavioral data (mouse behavior) with 3 levels: individual, group and population. The model requires to estimate 2 parameters (U and S) modeled in log and logit space. The response variable is log-transformed and modeled with a normal likelihood. My question is related to how to model the standard deviation of the normal distribution. My initial approach was to model the observation-level standard deviation hierarchically (subject → group → population). However, this resulted in poor identification, the higher-level SD increased to absorb most of the variability, making the likelihood weakly sensitive to the main parameters (U and S). My second approach was to model the SD non-hierarchically, letting the sampler just pick the best SD estimate for each subject.
The second approach was successful, offering no divergent transitions, reasonable parameter estimates and good r hat values, but I was wondering if it is reasonable to do so and if there is a more principled alternative.
I include the full model below, but the question is specifically about modeling the residual SD:
Thank you for any help provided!
Eq13_code <- "
data {
int<lower=1> N; // Number of observations
int<lower=1> S; // Number of subjects
int<lower=1, upper=S> subject[N]; // Subject IDs
real B[N]; // Response variable (observed B)
real w[N]; // Known parameter w
real t[N]; // Known parameter t
real N_val[N]; // Known parameter N
real H[N]; // Known parameter H
real q[N]; // Known parameter q
int<lower=1, upper=2> G[S]; // Group scm or eth (subject-specific)
}
parameters {
real U_log_pop; // Population mean for log(U)
real<lower=0> sigma_U; // Population standard deviation for U
real U_log_group[2]; // Group-level means for log(U)
real U_log_offset[S]; // Subject-specific deviations for log(U)
real S_logit_pop; // Population mean for logit(S)
real<lower=0> sigma_S; // Population standard deviation for logit(S)
real S_logit_group[2]; // Group-level means for logit(S)
real S_logit_offset[S]; // Subject-specific deviations for logit(S)
vector<lower=0>[S] sigma_B; // Subject-specific residual standard deviations
}
transformed parameters {
real U_subject[S]; //
real S_subject[S]; //
for (s in 1:S) {
U_subject[s] = U_log_group[G[s]] + U_log_offset[s] * sigma_U; //
S_subject[s] = S_logit_group[G[s]] + S_logit_offset[s] * sigma_S; //
}
}
model {
// Priors for population-level parameters
target += normal_lpdf(U_log_pop | 0, 50);
target += normal_lpdf(sigma_U | 0, 10) - log(0.5); //
target += normal_lpdf(S_logit_pop | 0, 50);
target += normal_lpdf(sigma_S | 0, 10) - log(0.5); //
// Priors for group-level parameters
target += normal_lpdf(U_log_group | U_log_pop, sigma_U);
target += normal_lpdf(S_logit_group | S_logit_pop, sigma_S);
// Priors for subject-level parameters (Non-centered)
target += normal_lpdf(U_log_offset | 0, 1);
target += normal_lpdf(S_logit_offset | 0, 1);
target += normal_lpdf(sigma_B | 0, 10) - log(0.5); //
// Likelihood (Using Gaussian Approximation)
for (n in 1:N) {
int s = subject[n];
real sigma = sigma_B[s];
real Pn_star = 1 / (t[n] + (N_val[n] / (w[n] * q[n])));
real Pn = (exp(U_subject[subject[n]]) * (Pn_star ^ inv_logit(S_subject[subject[n]]))) ^ (1 / (1 - inv_logit(S_subject[subject[n]])));
real B_hat =log(N_val[n]) - log((t[n] * q[n]) + (N_val[n] / w[n])) - log(1 + 1 / Pn);
target += normal_lpdf(B[n] | B_hat, sigma);
}
}
"
data$Subject <- as.factor(data$Subject)
data_subject <- data %>%
group_by(Subject) %>%
summarise(G = first(G))
stan_data <- list(
N = nrow(data),
S = length(unique(data$Subject)),
subject = as.numeric(data$Subject),
B = data$B,
N_val = data$N_val,
w = data$w,
q = data$q,
t = data$t,
H = data$H,
G = data_subject$G
)
stan_model <- stan_model(model_code = Eq13_code)
Phase3_eq13<- sampling(
stan_model,
data = stan_data,
iter = 4000,
chains = 4,
cores = 4,
control = list(adapt_delta = 0.99, max_treedepth = 20)
)