Simple question: logistic regression with group level covariate and non-centered parameterization

Diego · November 14, 2018, 1:04am

Hi everyone!

New to Stan and Bayes with a little experience with JAGS. I’m working in preliminary version of a mode I will extend and make more complex as I understand more about the functionality. On the way I had several warnings that I could resolved following the recommendations in the manual. Among them, the non-centered parameterization. I have just 5 groups and a0 is the reference level for the other 2 fixed covariates. Simple question here

How do I add a group level covariate? I tried some alternatives I thought plausible but the results are plain different compare to the model I set up in Jags.

data {
  int N;
  int M; 
  int index_sites [N];
  int wean_mate [N]; //predictor
  int pregnancy [N]; //predictor
  int CoV_eid[N]; //outcome 
}                                                

parameters {
  real a0; 
  real b_wean_mate;
  real b_pregnancy;
  real<lower=0,upper=10> sigma_theta; 
  vector[M] epsilon;
  
}

transformed parameters {
  vector[M] theta_site;
  theta_site = sigma_theta * epsilon; // equivalent to normal(0,sigma_theta)
  
}

model {
  vector[N] eta;
  a0 ~ normal(0,80);         // standard deviation
  b_pregnancy ~ normal(0,80);
  b_wean_mate ~ normal(0,80);
  
  sigma_theta ~ normal(0,3); //cauchy(0,5) works also
  epsilon~normal(0,1);
  
  for(i in 1:N)
    eta[i] = a0 + b_wean_mate * wean_mate[i] + b_pregnancy * pregnancy[i] + theta_site[index_sites[i]];
  CoV_eid ~ bernoulli_logit(eta); //model
  
}

something in the transformed parameter block such as


data {
  real  data_group_covariate[M];

parameters {
 ...
 real b_group_covariate; 
}


transformed parameters {
  vector[M] theta_site;
  theta_site = b_group_covariate*data_group_covariate + sigma_theta * epsilon; // equivalent to normal(0,sigma_theta)
  
}

model {
 ...
 b_group_covariate~normal(0,80)
}

Thanks for the feedback.

bbbales2 · November 14, 2018, 4:42am

Assuming you have M groups and N data points and that index_sites is the group label for each data point, then I’d assume you’d want to add to eta[i] a term that looked like b_group_covariate * data_group_covariate[index_site[i]].

I think it’s usually easier to write out models in their centered parameterization and then convert. You’re right that adding b_group_covariate * data_group_covariate does something strange to the implied prior on the theta_site. It’s easiest to write everything out and then see what can be non-centered (or what even needs to be non-centered) rather than trying to write it from the ground up.

Diego · November 14, 2018, 9:23pm

Hi Ben:

Thank you very much for the response. I tried by setting up the mu in a following step in the model block and now the sampler does work and I get results that are decently similar to the ones I get from JAGS; but not exactly to be 100% relax about the model parameterization.

I’m posting this code. I do think is right, but definetely a check would be awesome. Thank you in advance

data {
int N;
int M;
int index_sites [N];
int wean_mate [N];
int pregnancy [N];
int data_group_covariate[M]
int CoV_eid[N];
}

parameters {
real a0;
real b_wean_mate;
real b_pregnancy;
real b_group_covariate;
real<lower=0, upper=10> sigma_theta;
vector[M] epsilon;
real mu;

}

transformed parameters {
vector[M] theta_site;
theta_site = mu + sigma_theta * epsilon;
}

model {
vector[N] eta;
a0 ~ normal(0,80);
b_wean_mate ~ normal(0,80);
b_pregnancy ~ normal(0,80);
b_group_covariate ~ normal(0,80);

sigma_theta ~ normal(0,3);
epsilon~normal(0,1);

for(j in 1:M)
mu~normal(b_group_covariate*data_group_covariate[j],3);

for(i in 1:N)
eta[i] = a0 + b_wean_mate * wean_mate[i] + b_pregnancy * pregnancy[i] + theta_site[index_sites[i]];
CoV_eid ~ bernoulli_logit(eta); //model
}

bbbales2 · November 17, 2018, 10:44pm

Oh no! Sorry for the delay. I missed this. Either way, this looks strange to me:

for(j in 1:M)
   mu ~ normal(b_group_covariate*data_group_covariate[j],3);

If you write that out it is:

mu ~ normal(b_group_covariate * data_group_covariate[1],3);
mu ~ normal(b_group_covariate * data_group_covariate[2],3);
...
mu ~ normal(b_group_covariate * data_group_covariate[M],3);

which is multiple priors on the variable which seems wrong.

Did you mean to declare mu to be a vector of length M and then do:

for(j in 1:M)
  mu[j] ~ normal(b_group_covariate*data_group_covariate[j],3);

perhaps?

Diego · November 18, 2018, 2:10am

That’s correct!

thanks Ben!

bbbales2 · November 18, 2018, 8:43pm

Does it match your previous fit better now? It’s probably worth doing a simulated data experiment to make sure this model is fitting the trends you want to detect to debug it. That make sense?

Diego · November 19, 2018, 12:37am

It does match now closely to the previous fit. I now what you mean with the simulated data. I think the way it is now makes me confident of the results.

Topic		Replies	Views
Difficulty specifying non-centered varying slopes model Modeling rstan , specification	5	623	September 29, 2021
Simple question on hierarchical non-centered parameterization Modeling	3	2026	October 16, 2017
Non-centered parameterization of a hierarchical negative binomial model in Stan Modeling rstan , specification	2	648	September 7, 2021
Adaptation and non-centered parameterisations Developers	10	1179	August 2, 2021
Specification of multivariate hierarchical priors Modeling specification	1	507	June 9, 2019

Simple question: logistic regression with group level covariate and non-centered parameterization

Related topics