Using BRMS Functionality for Data Generated in CmdStan

  • brms Version: 2.5.0

I’m relatively new to Stan, and brms has been a lifesaver for coding mixture models. However, now I’m working on a computing cluster, and brms isn’t working well there. So I have a somewhat silly workaround: I use brms’s make_stancode and make_standata to create my Stan code and data, run my models in parallel on the cluster using CmdStan, and then read the resulting CSV files back into R using read_stan_csv.
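For context, here is a minimal sketch of that round trip (the formula, data, and file names are placeholders for my actual setup):

library(brms)
library(rstan)

# hypothetical two-component mixture model; substitute your own formula and data
f <- y ~ x + (1 | g)
fam <- mixture(bernoulli, bernoulli)
writeLines(make_stancode(f, data = mydata, family = fam), "model.stan")

# write the data list in the dump format CmdStan reads
sdata <- make_standata(f, data = mydata, family = fam)
stan_rdump(names(sdata), file = "model.data.R", envir = list2env(as.list(sdata)))

# ... compile model.stan and sample with CmdStan on the cluster ...

# read the per-chain CSV output back into a stanfit object
fit <- read_stan_csv(paste0("samples_", 1:4, ".csv"))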

This works great, but now I’d like to use functions like loo.brmsfit and hypothesis (to compare models with different numbers of mixture components).

Is there any way to (in order of preference):

  1. Create a brmsfit object from the stanfit object created by read_stan_csv (plus the model code and data)?
  2. Compute LOOIC using the output of my Stan model? Currently, it looks like you need a log_lik variable in your Stan code, which brms doesn’t generate automatically.
  3. Modify the Stan model to add a log_lik so I can then compute LOOIC?

For the last option, in the example below, would it be sufficient to amend the generated quantities block as follows? (Since mu1 and mu2 are local to the model block, I recompute them here.)

generated quantities {
  vector[N] log_lik;
  // mu1 and mu2 are local to the model block, so rebuild them here
  vector[N] mu1 = temp_mu1_Intercept + Xc_mu1 * b_mu1;
  vector[N] mu2 = temp_mu2_Intercept + Xc_mu2 * b_mu2;
  for (n in 1:N) {
    mu1[n] += r_1_mu1_1[J_1[n]] * Z_1_mu1_1[n];
    mu2[n] += r_2_mu2_1[J_2[n]] * Z_2_mu2_1[n];
    log_lik[n] = log_sum_exp(
      log(theta1) + bernoulli_logit_lpmf(Y[n] | mu1[n]),
      log(theta2) + bernoulli_logit_lpmf(Y[n] | mu2[n])
    );
  }
}
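If that works, LOOIC could then be computed directly from the stanfit with the loo package, bypassing brms entirely. A sketch, assuming fit is the object returned by read_stan_csv:

library(rstan)
library(loo)

# pointwise log-likelihood as an iterations x chains x N array
ll <- extract_log_lik(fit, parameter_name = "log_lik", merge_chains = FALSE)
# relative effective sample sizes, needed for reliable PSIS diagnostics
r_eff <- relative_eff(exp(ll))
loo_fit <- loo(ll, r_eff = r_eff)
print(loo_fit)  # reports elpd_loo; looic = -2 * elpd_loo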

The full example Stan code, generated by brms, is pasted below.

// generated with brms 2.5.0
functions { 
} 
data { 
  int<lower=1> N;  // total number of observations 
  int Y[N];  // response variable 
  int<lower=1> K_mu1;  // number of population-level effects 
  matrix[N, K_mu1] X_mu1;  // population-level design matrix 
  int<lower=1> K_mu2;  // number of population-level effects 
  matrix[N, K_mu2] X_mu2;  // population-level design matrix 
  vector[2] con_theta;  // prior concentration 
  // data for group-level effects of ID 1
  int<lower=1> J_1[N];
  int<lower=1> N_1;
  int<lower=1> M_1;
  vector[N] Z_1_mu1_1;
  // data for group-level effects of ID 2
  int<lower=1> J_2[N];
  int<lower=1> N_2;
  int<lower=1> M_2;
  vector[N] Z_2_mu2_1;
  int prior_only;  // should the likelihood be ignored? 
} 
transformed data { 
  int Kc_mu1 = K_mu1 - 1; 
  matrix[N, K_mu1 - 1] Xc_mu1;  // centered version of X_mu1 
  vector[K_mu1 - 1] means_X_mu1;  // column means of X_mu1 before centering 
  int Kc_mu2 = K_mu2 - 1; 
  matrix[N, K_mu2 - 1] Xc_mu2;  // centered version of X_mu2 
  vector[K_mu2 - 1] means_X_mu2;  // column means of X_mu2 before centering 
  for (i in 2:K_mu1) { 
    means_X_mu1[i - 1] = mean(X_mu1[, i]); 
    Xc_mu1[, i - 1] = X_mu1[, i] - means_X_mu1[i - 1]; 
  } 
  for (i in 2:K_mu2) { 
    means_X_mu2[i - 1] = mean(X_mu2[, i]); 
    Xc_mu2[, i - 1] = X_mu2[, i] - means_X_mu2[i - 1]; 
  } 
} 
parameters { 
  vector[Kc_mu1] b_mu1;  // population-level effects 
  vector[Kc_mu2] b_mu2;  // population-level effects 
  simplex[2] theta;  // mixing proportions 
  ordered[2] ordered_Intercept;  // to identify mixtures 
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  vector[N_1] z_1[M_1];  // unscaled group-level effects
  vector<lower=0>[M_2] sd_2;  // group-level standard deviations
  vector[N_2] z_2[M_2];  // unscaled group-level effects
} 
transformed parameters { 
  // identify mixtures via ordering of the intercepts 
  real temp_mu1_Intercept = ordered_Intercept[1]; 
  // identify mixtures via ordering of the intercepts 
  real temp_mu2_Intercept = ordered_Intercept[2]; 
  // mixing proportions 
  real<lower=0,upper=1> theta1 = theta[1]; 
  real<lower=0,upper=1> theta2 = theta[2]; 
  // group-level effects 
  vector[N_1] r_1_mu1_1 = sd_1[1] * (z_1[1]);
  // group-level effects 
  vector[N_2] r_2_mu2_1 = sd_2[1] * (z_2[1]);
} 
model { 
  vector[N] mu1 = temp_mu1_Intercept + Xc_mu1 * b_mu1;
  vector[N] mu2 = temp_mu2_Intercept + Xc_mu2 * b_mu2;
  for (n in 1:N) { 
    mu1[n] += r_1_mu1_1[J_1[n]] * Z_1_mu1_1[n];
    mu2[n] += r_2_mu2_1[J_2[n]] * Z_2_mu2_1[n];
  } 
  // priors including all constants 
  target += student_t_lpdf(temp_mu1_Intercept | 3, 0, 10); 
  target += student_t_lpdf(temp_mu2_Intercept | 3, 0, 10); 
  target += dirichlet_lpdf(theta | con_theta); 
  target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(z_1[1] | 0, 1);
  target += student_t_lpdf(sd_2 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10); 
  target += normal_lpdf(z_2[1] | 0, 1);
  // likelihood including all constants 
  if (!prior_only) { 
    for (n in 1:N) {
      real ps[2];
      ps[1] = log(theta1) + bernoulli_logit_lpmf(Y[n] | mu1[n]);
      ps[2] = log(theta2) + bernoulli_logit_lpmf(Y[n] | mu2[n]);
      target += log_sum_exp(ps);
    }
  } 
} 
generated quantities { 
  // actual population-level intercept 
  real b_mu1_Intercept = temp_mu1_Intercept - dot_product(means_X_mu1, b_mu1); 
  // actual population-level intercept 
  real b_mu2_Intercept = temp_mu2_Intercept - dot_product(means_X_mu2, b_mu2); 
} 

Thanks

Option (1) should actually not be too complicated:

model <- brm(<your arguments>, chains = 0)  # compile only; no sampling
model$fit <- read_stan_csv(...)             # swap in the stanfit from the CmdStan CSVs
model <- brms:::rename_pars(model)          # restore brms's parameter naming
summary(model)
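Once the fit slot is replaced and the parameters renamed, the usual brmsfit methods should dispatch as expected; note that loo.brmsfit computes the pointwise log-likelihood in R, so no log_lik in the Stan code is needed. For example (the hypothesis string is hypothetical and depends on your parameter names):

loo(model)
hypothesis(model, "theta1 > theta2", class = "")  # class = "" uses parameter names verbatim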

I am in the same situation and this solution works, but it compiles the model, which takes quite a while. Is there perhaps a faster but more complicated workaround? From what I can tell, creating the brmsfit object doesn’t require the compilation step. Many thanks!
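In the meantime, one possible stopgap (an untested sketch, with a hypothetical cache file name) is to build the empty model once and cache it on disk, so only the first run pays the compilation cost:

# reuse the compiled, sample-free brmsfit across runs
if (file.exists("empty_model.rds")) {
  model <- readRDS("empty_model.rds")
} else {
  model <- brm(<your arguments>, chains = 0)
  saveRDS(model, "empty_model.rds")
}
model$fit <- read_stan_csv(...)
model <- brms:::rename_pars(model)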

Ok, I get it. Basically, we need to create a brmsfit object with all the stuff in it but without the “fit” object, which is to be added later anyway. Is my understanding correct? If so, this may be a reasonable feature request. Would you mind opening an issue about this on https://github.com/paul-buerkner/brms?

That’s right. Thanks! I’ll make the feature request.