Recovering parameters

With the model below I am testing whether I can recover the group intercepts (4 and 1) and slopes (0.7 and 0.1).

beta_group[1] and beta_group[2] are recovered well, but alpha_group[1] and alpha_group[2], which I think should be near 4 and 1, are not.

  • How can the model be improved so that alpha_group[1] and alpha_group[2] come out closer to 4 and 1?

  • When comparing the groups, would you use the alpha_group_tilde and beta_group_tilde parameters the same way you would use a coefficient from a non-Bayesian random-effects regression such as lme()? I.e., does group 1 have a greater effect if the random effect beta_group_tilde[1] is higher than beta_group_tilde[2]?


# Simulate two groups with intercepts 4 and 1, slopes 0.7 and 0.1, and sd-8 noise
set.seed(12)
n <- 200
group <- c(rep("A", n/2), rep("B", n/2))
x <- rep(1:(n/2), 2)
y <- ifelse(group == "A", 4, 1) + ifelse(group == "A", 0.7, 0.1)*x + rnorm(n, 0, 8)
data <- data.frame(x = x, y = y, group = group)
data$group_numeric <- as.numeric(as.factor(data$group))
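# As an illustrative sanity check (not needed for the Stan fit below), unpooled
# per-group least-squares fits should land near the simulated intercepts (4, 1)
# and slopes (0.7, 0.1), up to noise from the sd-8 error term.
coef(lm(y ~ x, data = subset(data, group == "A")))
coef(lm(y ~ x, data = subset(data, group == "B")))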

library(rstan)

stan_data <- list(N = nrow(data),
                  N_group = length(unique(data$group_numeric)),
                  X = data$x,
                  group = data$group_numeric,
                  Y = data$y)

fit <- stan(file = 'model1.stan', data = stan_data, seed = 4938483,
            control = list(adapt_delta = 0.99),
            chains = 1, iter = 2000)

# Check for divergent transitions
divergent <- get_sampler_params(fit, inc_warmup = FALSE)[[1]][, 'divergent__']
sum(divergent)

params <- extract(fit)
summary(fit, pars = c("mu_alpha", "sigma_alpha", "alpha_group_tilde[1]", "alpha_group_tilde[2]",
                      "alpha_group[1]", "alpha_group[2]",
                      "mu_beta", "sigma_beta", "beta_group_tilde[1]", "beta_group_tilde[2]",
                      "beta_group[1]", "beta_group[2]", "sigma"),
        probs = c(0.1, 0.5, 0.9))$summary

# Draws for selected parameters as a matrix
as.matrix(fit, pars = c("mu_alpha", "alpha_group_tilde[1]"))

# Scatterplots of the hyperparameter draws against each other and the tilde draws
plot(params$mu_beta, params$sigma_beta)
plot(params$mu_alpha, params$sigma_alpha)
plot(params$mu_alpha, params$alpha_group_tilde[, 1])
plot(params$mu_alpha, params$alpha_group_tilde[, 2])

                            mean      se_mean         sd         10%         50%       90%     n_eff      Rhat
mu_alpha              1.86998043 0.1716594437 2.08288446 -0.43542375  2.00954143 4.1541680  147.2295 0.9990060
sigma_alpha           2.25610525 0.1337213070 2.09238456  0.25207076  1.60309969 5.1208056  244.8397 1.0006503
alpha_group_tilde[1]  0.10870165 0.0555611119 0.87721371 -0.99521547  0.11216892 1.1731718  249.2694 1.0006616
alpha_group_tilde[2] -0.07434989 0.0564988807 0.85604809 -1.18322133 -0.04688849 0.9602683  229.5708 0.9991214
alpha_group[1]        2.30202328 0.0421064123 1.32905641  0.61024641  2.29260527 4.0002080  996.3005 1.0022701
alpha_group[2]        1.85615929 0.0470464422 1.32924564  0.17121427  1.86822643 3.5104792  798.2831 1.0006399
mu_beta               0.40400111 0.0583043856 0.82154128 -0.49205621  0.37810087 1.3174462  198.5438 1.0006910
sigma_beta            1.57899323 0.0902120449 1.42642497  0.39302852  1.08002122 3.4901870  250.0166 0.9998046
beta_group_tilde[1]   0.40701684 0.0472391069 0.66351107 -0.35931062  0.30654936 1.3341505  197.2845 0.9993304
beta_group_tilde[2]  -0.38508753 0.0397276918 0.61119641 -1.20291713 -0.33451997 0.3207698  236.6873 1.0017310
beta_group[1]         0.72709235 0.0007724913 0.02410129  0.69478405  0.72805639 0.7580075  973.4050 1.0048231
beta_group[2]         0.08530329 0.0007091771 0.02334811  0.05616872  0.08480867 0.1138369 1083.9118 0.9992051
sigma                 7.55519817 0.0176743278 0.37910197  7.10420755  7.53768298 8.0443860  460.0725 1.0031440

model1.stan

data{
  int<lower=1> N;                         // number of observations
  int<lower=1> N_group;                   // number of groups
  int<lower=1, upper=N_group> group[N];   // group index for each observation
  vector[N] X;                            // predictor
  real Y[N];                              // outcome
}
parameters{
  real mu_alpha;                          // population mean of the group intercepts
  real<lower=0> sigma_alpha;              // between-group sd of the intercepts
  vector[N_group] alpha_group_tilde;      // non-centered intercept deviations
  real mu_beta;                           // population mean of the group slopes
  real<lower=0> sigma_beta;               // between-group sd of the slopes
  vector[N_group] beta_group_tilde;       // non-centered slope deviations
  real<lower=0> sigma;                    // residual sd
}
transformed parameters{
  vector[N_group] alpha_group = mu_alpha + sigma_alpha * alpha_group_tilde;
  vector[N_group] beta_group = mu_beta + sigma_beta * beta_group_tilde;
}
model{
  mu_alpha ~ normal(0, 5);
  sigma_alpha ~ normal(0, 5);
  alpha_group_tilde ~ normal(0, 1);

  mu_beta ~ normal(0, 2);
  sigma_beta ~ normal(0, 5);
  beta_group_tilde ~ normal(0, 1);

  sigma ~ normal(0, 20);

  for(n in 1:N){
    Y[n] ~ normal(alpha_group[group[n]] + beta_group[group[n]] * X[n], sigma);
  }
}
generated quantities{
  vector[N] y_ppc;                        // posterior predictive draws
  for(n in 1:N){
    y_ppc[n] = normal_rng(alpha_group[group[n]] + beta_group[group[n]] * X[n], sigma);
  }
}





I’m a little confused. Your data generation suggests two groups, each with potentially different slopes and intercepts but a common noise term. So you should only need 5 parameters, but your model has many more, which I think leaves it unidentified.


This follows the non-centered parameterization for hierarchical models, which introduces the additional normal(0, 1) *_tilde variables.
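Concretely, for each group g the transformed parameters block sets alpha_group[g] = mu_alpha + sigma_alpha * alpha_group_tilde[g] with alpha_group_tilde[g] ~ normal(0, 1), which is equivalent to the centered form alpha_group[g] ~ normal(mu_alpha, sigma_alpha), and likewise for beta_group.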

Where is the hierarchy in your model? So far as I can tell there are two groups, each with a single observation at each of several values of X, so there’s no hierarchy. If you had multiple individual units of observation within each group, where each unit had observations at multiple values of X, then you could fit a hierarchical model with slopes/intercepts varying randomly within group. But as is, your data are not hierarchical and thus the model is unidentified.
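To make that concrete, here is a minimal sketch (my own illustration, with made-up unit counts) of data that would be hierarchical in this sense: several units within each group, each unit observed at multiple values of X.

# Illustrative only: units nested within groups, each unit measured at several
# x values, so group-level and unit-level variation can be separated.
set.seed(1)
n_unit <- 20                                  # units per group (arbitrary)
sim <- expand.grid(group = c("A", "B"), unit = 1:n_unit, x = 1:10)
sim$unit_id <- as.integer(interaction(sim$group, sim$unit))
unit_intercept <- rnorm(2 * n_unit, 0, 1)     # unit-level intercept deviations
sim$y <- ifelse(sim$group == "A", 4, 1) +
  unit_intercept[sim$unit_id] +
  ifelse(sim$group == "A", 0.7, 0.1) * sim$x +
  rnorm(nrow(sim), 0, 8)
# With data like this, unit-level intercepts (and slopes) can be partially
# pooled within each group.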

Thank you for your response. Here I believe the group coefficients come from a population, so I coded it that way. I haven’t seen the requirement that hierarchical/multilevel models need multiple observations at particular values of an X variable. Can you provide a reference for that if possible?

This is a generic model. Imagine Y is a measure of happiness and X is the number of times a person (in group A or B) sees a red car on a given day. People may see the same number of red cars on different days, or they may not; in this simulation they happen to see unique values. I think this model, with coefficients assumed to come from a population, falls into the hierarchical/multilevel space.

Ah, two groups is a little sparse to benefit from hierarchical pooling without some pretty strongly informed priors. I couldn’t see anything explicitly wrong with your model, but to double-check I ran it using rstanarm::stan_glmer():


library(rstanarm)

# Same hierarchical structure via stan_glmer: group-varying intercepts and slopes
fit = stan_glmer(
  formula = y ~ x + (x|group)
  , data = data
  , cores = 4
  , chains = 4
)

# Posterior draws of the group-specific deviations from the population effects
samples = data.frame(fit)
summary(samples$"b..Intercept..group.A.")
summary(samples$"b..Intercept..group.B.")
summary(samples$"b.x.group.A.")
summary(samples$"b.x.group.B.")

This similarly fails to recover the group-level parameters. So I’d say your problem is that, with only two groups, your data and priors don’t have enough information to fit a full hierarchical model. If this is a toy example and you will have many more groups, then try simulating that scenario. If you truly have only two groups and were trying a hierarchical model for its potential pooling benefits, you can achieve much the same thing by parameterizing the model with a mean_intercept parameter and an intercept_group_difference parameter, then placing a fairly tight prior around zero on the latter (and ditto for the slopes); a sketch of that idea follows.
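A minimal sketch of that parameterization, fit with rstanarm for brevity; the ±0.5 contrast coding, the prior scales, and the variable name g are illustrative choices rather than anything from this thread:

# Illustrative sketch: code group as a +/- 0.5 contrast so the intercept and x
# coefficient are grand means and the g and x:g terms are the A-minus-B
# differences, then shrink those differences toward zero with the coefficient prior.
data$g = ifelse(data$group == "A", 0.5, -0.5)
fit_diff = stan_glm(
  y ~ x * g
  , data = data
  , prior = normal(0, 2.5)            # scale for x, g and x:g; tighten to pool harder
  , prior_intercept = normal(0, 10)
  , cores = 4
  , chains = 4
)
summary(fit_diff, pars = c("(Intercept)", "x", "g", "x:g"))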
