How to manually code unit simplexes in Stan?

I have a model which requires multiple unit simplexes, each of a different length.

Rather than creating them manually like this:

parameters { 
simplex[length_1] theta_1;
simplex[length_2] theta_2;
// .... etc
}

I want to make the program general so that it works for any number N of inputs (length_1,…,length_N).

However, this would require a “ragged array”, which isn’t supported in Stan.

A solution would be to iteratively construct N simplexes of lengths (length_1,…,length_N) by making an array vector[max(lengths)] simplexes[N] and then constructing each one manually: simplexes[n, 1:lengths[n]] = ..... However, this would require me to manually code the simplex transforms and their Jacobian adjustments (since I want to put Dirichlet priors on them).
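Roughly, I mean a skeleton like this (just a sketch, not working code - the commented lines are exactly the part I don't know how to write):

data {
  int<lower=1> N;                    // number of simplexes
  int<lower=1> lengths[N];           // length of each simplex
}
parameters {
  // one unconstrained vector per simplex, padded to the longest length
  vector[max(lengths) - 1] y_raw[N];
}
transformed parameters {
  vector[max(lengths)] simplexes[N]; // only the first lengths[n] entries matter
  for (n in 1:N) {
    // simplexes[n, 1:lengths[n]] = <manual simplex transform of y_raw[n]>;
    // (and the matching Jacobian adjustment has to go into the model block)
  }
}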

Could anyone provide some guidance on how to code this?

Thanks

Hi, what if you created your Stan code programmatically?

Sorry I’m not sure what you mean

So instead of creating a ragged array of simplexes, you could create your Stan code with some string manipulation.

E.g. create a function that takes n as an argument and outputs Stan code with n simplexes.

You would just need to figure out which parts need cloning and which are constant.

Could you give me an example?

Would I be able to do this through R?

Are you saying that rather than just having a fixed Stan code, I make a function in R that writes the Stan program, with all the parts that don’t rely on the N simplexes staying constant and the parts that do rely on the N simplexes changing depending on the input N?

Update:

I have created an R program which does what I want (pasted below). However, even though the output in the console is correct, it doesn’t seem to save the stan_model object correctly and just outputs “character(0)” - what’s going on?


N <- 1
n_binary_tests <- 1
n_ordinal_tests <- 3
n_tests <- n_binary_tests + n_ordinal_tests
Thr <- c(1,5,10,25)
ord_test_index <- 1:n_ordinal_tests
dirichlet_prior_vec_d_1 <- rep(1,5)
dirichlet_prior_vec_d_2 <- rep(1,10)
dirichlet_prior_vec_d_3 <- rep(1,25)
dirichlet_prior_vec_nd_1 <- rep(1,5)
dirichlet_prior_vec_nd_2 <- rep(1,10)
dirichlet_prior_vec_nd_3 <- rep(1,25)

stan_model <- paste(cat("\t","data {
                      int n_binary_tests; // number of binary tests
                      int n_ordinal_tests; // number of ordinal tests
                      int<lower=1> n_tests; // number of tests
                      int<lower=1> N; // number of individuals 
                      int<lower=1> Thr[n_tests]; // number of thresholds for each test
                      int<lower=0> y[N, n_tests]; // N individuals and n_tests tests
                    //  int r[choose(n_tests,2), 4];
                    //  int numg;
                      real se_bin_mu;
                      real se_bin_sd;
                      real sp_bin_mu;
                      real sp_bin_sd;
                }  
parameters {
                      vector[n_tests] mu[2]; 
                      real<lower=0,upper=1> u_d[N,n_tests]; // nuisance that absorbs inequality constraints
                      real<lower=0,upper=1> u_nd[N,n_tests]; // nuisance that absorbs inequality constraints
                      cholesky_factor_corr[n_tests] L_Omega_nd;
                      cholesky_factor_corr[n_tests] L_Omega_d;
                      real<lower=0,upper=1> p; // disease prevalence  
                      vector[n_tests] C_1_d;
                      vector[n_tests] C_1_nd;
                      vector<lower=0>[n_tests] c_scale_d;
                      vector<lower=0>[n_tests] c_scale_nd;
    \n",
    # create simplexes
    paste0("\t","simplex[", Thr[(n_binary_tests+ord_test_index):n_tests],"]",
           " theta_d",ord_test_index,"\n"),
    paste0("\t","simplex[", Thr[(n_binary_tests+ord_test_index):n_tests],"]",
           " theta_nd",ord_test_index,"\n"),
    "\t", "}",
    "
transformed parameters { 
                      vector[N] log_lik; 
                      vector[max(Thr)] C_d_vec[n_ordinal_tests]; 
                      vector[max(Thr)] C_nd_vec[n_ordinal_tests]; 
    ",
    # create ordered vectors by multiplying simplex by scales 
    paste0("\t","
           C_d_vec[",ord_test_index,",1:",Thr[n_binary_tests+ord_test_index],"] = comulative_sum(",
           "theta_d", ord_test_index,") * c_scale_d[",ord_test_index,"];",
           "\n"),
    paste0("\t","
           C_nd_vec[",ord_test_index,",1:",Thr[n_binary_tests+ord_test_index],"] = comulative_sum(",
           "theta_d", ord_test_index,") * c_scale_d[",ord_test_index,"];",
           "\n"),
    "{

      // Parameters for likelihood function. Based on code uploaded by Ben Goodrich which uses the 
      // GHK algorithm for generating TruncMVN. See: https://github.com/stan-dev/example-models/blob/master/misc/multivariate-probit/probit-multi-good.stan#L11
          for (n in 1:N) {
              real lp1;
              real lp0;
              vector[n_tests] z_d;
              vector[n_tests] z_nd;
              vector[n_tests] y1d;
              vector[n_tests] y1nd;
              real prev_d = 0;
              real prev_nd = 0;
            for (t in 1:n_tests) { 
             if (Thr[t] == 1) { //// Binary Likelihoods (tests w/ 1 threshold) - make sure binary tests are at the start 
                    real u_d1 =   u_d[n,t];
                    real u_nd1 = u_nd[n,t];
                    real bound_d_bin;  
                    real bound_nd_bin;
                    bound_d_bin  = phi_logit_approx(  -(  (mu[1,t])  + prev_d ) / L_Omega_d[t,t] ); 
                    bound_nd_bin = phi_logit_approx(  -(  (mu[2,t]) + prev_nd )  / L_Omega_nd[t,t] );
                      if (y[n,t] == 1) {
                        z_d[t]   = inv_phi_logit_approx(bound_d_bin + (1 - bound_d_bin)*u_d1);      
                        z_nd[t]  = inv_phi_logit_approx(bound_nd_bin + (1 - bound_nd_bin)*u_nd1);    
                        y1d[t]   = log1m(bound_d_bin);  // Jacobian adjustment
                        y1nd[t]  = log1m(bound_nd_bin); // Jacobian adjustment
                      }
                      else {  // y == 0
                        z_d[t]   = inv_phi_logit_approx(bound_d_bin*u_d1);
                        z_nd[t]  = inv_phi_logit_approx(bound_nd_bin*u_nd1);
                        y1d[t]   = log(bound_d_bin);  // Jacobian adjustment
                        y1nd[t]  = log(bound_nd_bin); // Jacobian adjustment
                      }
                                 //  if (t < n_binary_tests)   prev_d    = L_Omega_d[t+1,1:t] * head(z_d,t);  
                                 //  if (t < n_binary_tests)   prev_nd  = L_Omega_nd[t+1,1:t] * head(z_nd,t);
            }
              else {  // ordinal likelihoods (tests w/ > 1 threshold)               
                    real u_d1 =   u_d[n,t];
                    real u_nd1 = u_nd[n,t];
                    vector[Thr[t]] bound_d; 
                    vector[Thr[t]] bound_nd;
                    bound_d[1:Thr[t]]  =  phi_logit_approx_vec( ( C_d_vec[t-n_binary_tests,1:Thr[t]]   - ( mu[1,t]  + prev_d  ) ) /   L_Omega_d[t,t]  ); // Se
                    bound_nd[1:Thr[t]] =  phi_logit_approx_vec( ( C_nd_vec[t-n_binary_tests,1:Thr[t]]  - ( mu[2,t]  + prev_nd ) ) /   L_Omega_nd[t,t]   ); // FP
                    {
                     int K =   (Thr[t] + 1) ; 
                        if (y[n,t] == K) {
                          z_d[t]   = inv_phi_logit_approx(bound_d[K-1] + (1 - bound_d[K-1])*u_d1);      
                          z_nd[t]  = inv_phi_logit_approx(bound_nd[K-1] + (1 - bound_nd[K-1])*u_nd1);    
                          y1d[t]   = log1m(bound_d[K-1]);  // Jacobian adjustment
                          y1nd[t]  = log1m(bound_nd[K-1]); // Jacobian adjustment
                        }
                       for (J in 2:(K-1)) {
                        if (y[n,t] == J) {
                          z_d[t]   = inv_phi_logit_approx(bound_d[J-1] + (bound_d[J] - bound_d[J-1])*u_d1);      
                          z_nd[t]  = inv_phi_logit_approx(bound_nd[J-1] + (bound_nd[J] - bound_nd[J-1])*u_nd1);    
                          y1d[t]   = log(  (bound_d[J] - bound_d[J-1]) );  // Jacobian adjustment
                          y1nd[t]  = log( (bound_nd[J] - bound_nd[J-1]) );  // Jacobian adjustment
                        }
                       }
                        if (y[n,t] == 1) {
                        z_d[t]   = inv_phi_logit_approx(bound_d[1]* u_d1);
                        z_nd[t]  = inv_phi_logit_approx(bound_nd[1]* u_nd1);
                        y1d[t]   = log(bound_d[1]); // Jacobian adjustment
                        y1nd[t]  = log(bound_nd[1]); // Jacobian adjustment
                      }
                    }
              }
                             if (t < n_tests)   prev_d    = L_Omega_d[t+1,1:t] * head(z_d,t);  
                             if (t < n_tests)   prev_nd  = L_Omega_nd[t+1,1:t] * head(z_nd,t);
              }
                            // Jacobian adjustments imply z is truncated standard normal
                            // thus utility --- mu + L_Omega * z --- is truncated multivariate normal
                            lp1  = sum(y1d)  +   bernoulli_lpmf(1 | p); 
                            lp0  = sum(y1nd) +   bernoulli_lpmf(0 | p); 
                           log_lik[n] =  log_sum_exp(lp1 , lp0); 

            }

          } 
model {
          
          C_1_d ~ normal(0,1);
          C_1_nd ~ normal(0,1);",
    paste0("\t","
           theta_d",ord_test_index," ~ dirichlet(","dirichlet_prior_vec_d_",ord_test_index,");",
           "\n"),
    "           c_scale_d ~ normal(0,5);
           c_scale_nd ~ normal(0,5);
          
          mu[1,1] ~ normal(se_bin_mu, se_bin_sd); 
          mu[2,1] ~ normal(sp_bin_mu, sp_bin_sd);  
          
          for (t in 2:n_tests) 
             mu[, t] ~ normal(0,1);

          L_Omega_d  ~ lkj_corr_cholesky(4);
          L_Omega_nd ~ lkj_corr_cholesky(4);

            for (n in 1:N) 
               target +=  log_lik[n]; 
}
    
generated quantities {
          vector[n_binary_tests] Se_bin; 
          vector[n_binary_tests] Sp_bin; 
          vector[max(Thr)] Se_ord[n_ordinal_tests]; 
          vector[max(Thr)] Sp_ord[n_ordinal_tests]; 
          corr_matrix[n_tests] Omega_nd;
          corr_matrix[n_tests] Omega_d;
      
            Omega_d  =  multiply_lower_tri_self_transpose(L_Omega_d);
            Omega_nd =  multiply_lower_tri_self_transpose(L_Omega_nd);
      
      
            //  Estimates for binary tests
         for (t in 1:n_binary_tests) {
            Se_bin[t]  =        phi_logit_approx( mu[1,t]  );
            Sp_bin[t]  =    1 - phi_logit_approx( mu[2,t]  );
         }
         
            //  Estimates for ordinal tests
         for (t in 1:n_ordinal_tests) {
           for (k in 1:Thr[t+ n_binary_tests]) {
            Se_ord[t,k]  =     1-   phi_logit_approx( C_d_vec[t,k]  -  mu[1,t + n_binary_tests]  );
            Sp_ord[t,k]  =     phi_logit_approx( C_nd_vec[t,k] -  mu[2,t + n_binary_tests]  );
           }
         }
   

  }"))

stan_model

Very cool

Make sure that all the individual pieces create the strings you want, and after that try cat’ing them together, adding one new piece at a time.

I wonder if you have a bunch of strings and then a list of strings; maybe that list needs to be collapsed to a single string so that cat works correctly?
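Actually, looking again: cat() prints to the console but returns NULL, so paste(cat(...)) gives character(0) even though the printed output looks right. Something like this sketch (function and variable names are just illustrative) builds the code as a string with paste0() and uses cat() only for display:

# cat() returns NULL invisibly, so this prints but x is character(0)
x <- paste(cat("parameters {\n"))

# build the code as a string instead, and cat() it only to inspect it
make_simplex_decls <- function(lengths) {
  decls <- paste0("  simplex[", lengths, "] theta_", seq_along(lengths), ";")
  paste0("parameters {\n", paste0(decls, collapse = "\n"), "\n}\n")
}

stan_code <- make_simplex_decls(c(5, 10, 25))
cat(stan_code)  # pretty-printed view; stan_code itself is a single string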


Does anyone have any ideas for this? I would prefer a solution done entirely in a single Stan program, as opposed to generating Stan code with string manipulation as @ahartikainen suggested.

The Stan manual (10.7 Unit Simplex | Stan Reference Manual) shows how Stan creates unit simplexes under the hood, but I’m not sure how to code this for my purposes.
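If I’m reading it right, the transform there is stick breaking: for a size-$K$ simplex,

$$z_k = \operatorname{logit}^{-1}\!\left(y_k + \log\frac{1}{K-k}\right), \qquad x_k = \Big(1 - \sum_{i<k} x_i\Big) z_k, \qquad x_K = 1 - \sum_{i<K} x_i,$$

with absolute Jacobian determinant

$$|J| = \prod_{k=1}^{K-1} z_k\,(1-z_k)\Big(1 - \sum_{i<k} x_i\Big),$$

so I assume the log of this is what needs adding to target, but I don’t see how to write that for my padded arrays.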

Have you checked the manual section on ragged arrays?


@mjhajharia was an intern for @Bob_Carpenter this summer working on different transforms, one of which was the simplex. They implemented Stan’s stick-breaking method and Jacobian adjustment as part of a Stan model here:


Yeah, this should work. Just make sure you replace the target_density_lp with whatever you want; in our case it’s just a Dirichlet.
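Concretely, for a single simplex with a Dirichlet target, a minimal sketch of that model (paraphrased from memory, not the exact code in the repo) is:

data {
  int<lower=2> K;                // simplex size
  vector<lower=0>[K] alpha;      // Dirichlet prior parameters
}
parameters {
  vector[K - 1] y;               // unconstrained
}
transformed parameters {
  simplex[K] x;                  // the hand-built simplex
  real log_abs_jac = 0;
  {
    real stick = 1;              // how much of the simplex is left
    for (k in 1:(K - 1)) {
      real z = inv_logit(y[k] - log(K - k)); // shifted logit, as in the manual
      x[k] = stick * z;
      log_abs_jac += log(stick) + log(z) + log1m(z);
      stick -= x[k];
    }
    x[K] = stick;                // the leftover stick closes the simplex
  }
}
model {
  target += log_abs_jac;         // Jacobian adjustment for y -> x
  x ~ dirichlet(alpha);          // the target density; swap in what you need
}

For the ragged case you’d run the same loop once per simplex with K = lengths[n], filling simplexes[n, 1:lengths[n]].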

Thanks - I’m having trouble understanding why x is declared as a simplex in transformed parameters. I want to construct one manually, i.e. without using simplex anywhere in the code. Couldn’t x just be declared as a vector in transformed parameters and it would still work?

Correct. Outside of parameters, the constrained types only serve as sanity checks rather than actual transforms.


Yeah, sanity checks exactly. You wouldn’t really run into issues for canonical transforms like the ALR, but if you’re tweaking things yourself and constructing a mapping it might be useful to keep the constrained declaration; if nothing else, it helps keep track of what’s happening in a large model.