Prediction for a new observation in a new group


#1

In page 274 of Data Analysis Using Regression and Multilevel/Hierarchical Models Gelman and Hill discuss how to do predictions for a new observation in a new group. Is there any example out there that shows how to implement this in Stan using the generated quantities block?


#2

Hi, I don’t know of any specific examples, but you just use the values of the hyperparameters to draw new group-level predictors using the appropriate XXX_rng functions and then use those to directly compute the prediction. It is conceptually quite simple - are you struggling with the concept or with the actual implementation?


#3

Thanks, @martinmodrak. I was able to implement what you describe but I just wanted to take a look at someone else’s code just to make sure I was following best practices. This is the code that I wrote:

generated quantities { 
  int<lower=0,upper=1>  y_rep[N] ;
  int<lower=0,upper=1>  y_pred[pred_N] ;
  vector[N] log_lik;
  real new_sch;
  real new_neighborhood;
  real new_sch_x_neighborhood;

  for(n in 1:N){
     y_rep[n]=bernoulli_logit_rng(lambda[n]) ;
     log_lik[n] = bernoulli_logit_lpmf(y[n] | lambda[n]);
  }
  
  for(n in 1:pred_N){
    // nothing new
    if(pred_idx_sch[n] < new_sch_idx && 
       pred_idx_neighborhood[n]  < new_neighborhood_idx) {
        y_pred[n]=bernoulli_logit_rng(sch[pred_idx_sch[n]] + 
                                      neighborhood[pred_idx_neighborhood[n]]  +
                                      sch_x_neighborhood[pred_idx_sch_x_neighborhood[n]] + 
                                      pred_X_centered[n,] * beta);
        } else if(pred_idx_sch[n] == new_sch_idx &&                            //  new school
                  pred_idx_neighborhood[n]  < new_neighborhood_idx){ 
                          
                  new_sch = normal_rng(mu_sch, sigma_sch);
                  new_sch_x_neighborhood = normal_rng(0, sigma_sch_x_neighborhood);
                  
                  y_pred[n] = bernoulli_logit_rng(new_sch + 
                              neighborhood[pred_idx_neighborhood[n]]  +
                              new_sch_x_neighborhood + 
                              pred_X_centered[n,] * beta);
       } else if(pred_idx_sch[n]== new_sch_idx &&                             // both new
                 pred_idx_neighborhood[n]  == new_neighborhood_idx){
                 
                 
                 new_sch = normal_rng(mu_sch, sigma_sch);
                 new_neighborhood = normal_rng(0, sigma_neighborhood);
                 new_sch_x_neighborhood = normal_rng(0, sigma_sch_x_neighborhood);
                 y_pred[n]=bernoulli_logit_rng(new_sch + 
                                                new_neighborhood +
                                                new_sch_x_neighborhood + 
                                                pred_X_centered[n,] * beta);
                 
       }else{                                                                       //  new neighborhood school
                  new_neighborhood = normal_rng(0, sigma_neighborhood);
                  new_sch_x_neighborhood = normal_rng(0, sigma_sch_x_neighborhood);
                  y_pred[n]=bernoulli_logit_rng(sch[pred_idx_sch[n]]  + 
                                                new_neighborhood +
                                                new_sch_x_neighborhood + 
                                                pred_X_centered[n,] * beta);
       }
    
  }
}