Rhat of one estimate always close to 1

I’ve noticed a perplexing trend with the rhat in my simulation that I would really like to decipher. It seems that for a very small trial with a binomial outcome, the rhat for the last specified treatment effect almost always is very close to 1, whereas the other two treatment effects will show the convergence issues.

For example, I can simulate a dataset where all responses are 0, (5 observations per cluster, 5 clusters per group, and 4 groups) and when modelling the treatment effects (beta) the Rhat will still be 1 for beta_trt[3] but high for beta_trt[1] and beta_trt[2].

For context, definitely not preferable to run a model with this little data, but I am trying to figure out how small a clustered trial can be until the model ‘breaks’, and how it breaks.

I’ve tried this with and without a QR decomposition with the same result.

If anyone can think of why this would occur I’d really appreciate the insight.

An example of the output I’m getting:

Here is the code I’ve used to run the stan model, (via R) and a generated dataset. If I haven’t uploaded the data in a way that works let me know and I’ll try another method.
example_data.csv (873 Bytes)

    resp <- as.vector(example_data[,4])
    N_obs <- dim(example_data)[1]
    N_site <- length(unique(example_data$site))
    N_trt_groups <- length(unique(example_data$trt))
    data <- list(N_obs = N_obs, N_site = N_site, N_trt_groups = N_trt_groups, 
                 site = example_data$site, trt = as.numeric(example_data$trt), resp = resp)

mod$sample(
      data = data, 
      init = 0,
      iter_warmup = 250,
      iter_sampling = 250,
      chains = 4, 
      parallel_chains = 1,
      adapt_delta = 0.99,
      refresh = 0, 
      max_treedepth=12)


 data {
  int<lower=1> N_obs; //number of observations
  int<lower=1> N_site; //number of sites
  int<lower=1> N_trt_groups; //number of treatment groups

  array[N_obs] int site; //site ids
  array[N_obs] int trt; //trt group
  array[N_obs] int resp; //outcome 

}
transformed data { 
    matrix[N_obs, N_trt_groups-1] X_matrix = rep_matrix(0, N_obs, N_trt_groups-1);
    matrix[N_obs, N_site] Z_matrix = rep_matrix(0, N_obs, N_site);
    for (i in 1:N_obs){
      for (j in 2:N_trt_groups){
        if (j==trt[i]) X_matrix[i,j-1] = 1;
      }  
      for(k in 1:N_site)    
      if (k==site[i]) Z_matrix[i,k] = 1;
    }


}
parameters {
  real b0;
  vector[N_trt_groups-1] beta_trt; //beta coefficients for each treatment group
  vector[N_site] alpha_site_raw; //random effect coefficients for site
  cholesky_factor_corr[N_site] lkj_corr; //random effect correlation 
}

transformed parameters {   
 vector[N_site] b0_site = b0 + 0.2*lkj_corr * alpha_site_raw;//implies: random intercept for site is sampled from multivariate normal with mean 0 and 100 SD
 
 
}
model {
  b0~normal(0,2);
  beta_trt[N_trt_groups-1]~normal(0,2);
  lkj_corr ~ lkj_corr_cholesky(5); 
  alpha_site_raw~std_normal();
  
  resp ~ bernoulli_logit( X_matrix * beta_trt + Z_matrix*b0_site);
}

generated quantities {
    
    vector[N_trt_groups-1] pred_prob_trt;
        pred_prob_trt = inv_logit(mean(b0_site) + beta_trt);


}

beta_trt[3] is the only one having proper prior, and thus only one having proper (marginal) posterior. MCMC doesn’t converge in case of improper posteriors

Wonderful - that solved it! Many thanks.

All rhats are behaving the same now.

1 Like