Vectorization and improving speed for a complex mixture type latent trait model for response times

Here’s a partly vectorized and less compute-redundant version of your model-block for loop (annoyingly, normal_lpdf doesn’t have a non-reducing variant and log_sum_exp doesn’t have a row-wise variant for matrix inputs, hence the two remaining loops):

  vector[n_obs] p_t = item[i_loc, 2] - person[p_loc, 1] ;
  vector[n_obs] p_c = item[i_loc, 2] - person[p_loc, 2] ;
      
  vector[n_obs] one_div_exp_item_loc = exp(-item[i_loc, 1]) ; // = 1/exp(item[i_loc,1]); scalar/vector isn't elementwise in Stan

  vector[n_obs] norm_p_t ;
  vector[n_obs] norm_p_c;
  for (i in 1:n_obs) {
    norm_p_t[i] = normal_lpdf(Y[i] | p_t[i] , one_div_exp_item_loc[i]);  
    norm_p_c[i] = normal_lpdf(Y[i] | p_c[i] , one_div_exp_item_loc[i]);  
  }

  vector[n_obs] log1m_pC_loc = log1m(pC[i_loc]) ;
  vector[n_obs] log_pC_loc = log(pC[i_loc]) ;
  vector[n_obs] log1m_pH_loc = log1m(pH[i_loc]) ;
  vector[n_obs] log_pH_loc = log(pH[i_loc]) ;

  matrix[n_obs,4] lprt;
  lprt[,1] = log1m_pC_loc + log1m_pH_loc + norm_p_t;  // T=0, C=0
  lprt[,2] = log1m_pC_loc + log_pH_loc   + norm_p_t;  // T=1, C=0
  lprt[,3] = log_pC_loc   + log1m_pH_loc + norm_p_t;  // T=0, C=1
  lprt[,4] = log_pC_loc   + log_pH_loc   + norm_p_c;  // T=1, C=1
  vector[n_obs] lps;
  for (i in 1:n_obs) {
    lps[i] = log_sum_exp(lprt[i,]) ;
  }
  target += lps ;
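
If you want to drop that last log_sum_exp loop too, one option is to roll a row-wise log_sum_exp yourself with the usual max-subtraction trick, replacing the lps loop above. Just a sketch on my part (the per-row max still needs a loop, but it's cheap compared to log_sum_exp, and I'd profile before assuming a win):

  // row-wise log_sum_exp via max-subtraction: subtract each row's max,
  // exponentiate, row-sum via a matrix-vector multiply, then add the
  // max back on the log scale
  vector[n_obs] row_max;
  for (i in 1:n_obs) {
    row_max[i] = max(lprt[i,]) ;
  }
  vector[n_obs] lps = row_max
    + log(exp(lprt - rep_matrix(row_max, 4)) * rep_vector(1.0, 4)) ;
  target += lps ;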

It’s possible to hack a non-reducing normal_lpdf yourself (inspiration) via:

  vector[n_obs] norm_p_t = (
    // std-normal kernel after shifting/scaling (elementwise ./ ;
    // the additive constant -0.5*log(2*pi) is dropped, which is fine
    // here since every mixture component carries one normal term)
    -0.5 * square(
      (Y - p_t) ./ one_div_exp_item_loc
    )
    // jacobian for the division:
    - log(one_div_exp_item_loc)
  ) ;

  vector[n_obs] norm_p_c = (
    // std-normal kernel after shifting/scaling
    -0.5 * square(
      (Y - p_c) ./ one_div_exp_item_loc
    )
    // jacobian for the division:
    - log(one_div_exp_item_loc)
  ) ;

But I’m not confident that would be any more performant than the loop above.
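
One further simplification if you do go that route (again, just a sketch): since one_div_exp_item_loc is exp(-item[i_loc,1]), the log term is just item[i_loc,1] and the division becomes a multiplication, so you can skip computing one_div_exp_item_loc entirely:

  vector[n_obs] exp_item_loc = exp(item[i_loc, 1]) ;
  // -log(sigma) = -log(exp(-item[i_loc,1])) = item[i_loc,1]
  vector[n_obs] norm_p_t = -0.5 * square((Y - p_t) .* exp_item_loc)
                           + item[i_loc, 1] ;
  vector[n_obs] norm_p_c = -0.5 * square((Y - p_c) .* exp_item_loc)
                           + item[i_loc, 1] ;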
