How to manually code unit simplexes in Stan?

I have a model which requires multiple unit simplexes, each of a different length.

Rather than creating them manually like this:

parameters { 
simplex[length_1] theta_1;
simplex[length_2] theta_2;
// .... etc
}

I want to make the program general so that it works for any number N of inputs (length_1,…,length_N).

However, this would require a “ragged array”, which isn’t supported in Stan.

A solution would be to iteratively construct N simplexes of lengths (length_1,…,length_N) by making an array vector[max(lengths)] simplexes[N] and then constructing each one manually: simplexes[n, 1:lengths[n]] = ..... However, this would require me to manually code the simplex transforms and their Jacobian adjustments (since I want to put Dirichlet priors on them).
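Roughly, I mean a skeleton like this (just a sketch, not working code - the commented lines are exactly the part I don't know how to write):

data {
  int<lower=1> N;                    // number of simplexes
  int<lower=1> lengths[N];           // length of each simplex
}
parameters {
  // one unconstrained vector per simplex, padded to the longest length
  vector[max(lengths) - 1] y_raw[N];
}
transformed parameters {
  vector[max(lengths)] simplexes[N]; // only the first lengths[n] entries matter
  for (n in 1:N) {
    // simplexes[n, 1:lengths[n]] = <manual simplex transform of y_raw[n]>;
    // (and the matching Jacobian adjustment has to go into the model block)
  }
}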

Could anyone provide some guidance on how to code this?

Thanks

Hi, what if you created your Stan code programmatically?

Sorry I’m not sure what you mean

So instead of creating a ragged array of simplexes, you could create your Stan code with some string manipulation.

E.g. create a function that takes n as an argument and outputs Stan code with n simplexes.

You would just need to figure out which parts need cloning and which are constant.

Could you give me an example?

Would I be able to do this through R?

Are you saying that rather than just having a fixed Stan code, I make a function in R that writes the Stan program, with all the parts that don’t rely on the N simplexes staying constant and the parts that do rely on the N simplexes changing depending on the input N?

Update:

I have created an R program which does what I want (pasted below). However, even though the output in the console is correct, it doesn’t seem to save the stan_model object correctly and just outputs “character(0)” - what’s going on?


N <- 1
n_binary_tests <- 1
n_ordinal_tests <- 3
n_tests <- n_binary_tests + n_ordinal_tests
Thr <- c(1,5,10,25)
ord_test_index <- 1:n_ordinal_tests
dirichlet_prior_vec_d_1 <- rep(1,5)
dirichlet_prior_vec_d_2 <- rep(1,10)
dirichlet_prior_vec_d_3 <- rep(1,25)
dirichlet_prior_vec_nd_1 <- rep(1,5)
dirichlet_prior_vec_nd_2 <- rep(1,10)
dirichlet_prior_vec_nd_3 <- rep(1,25)

stan_model <- paste(cat("\t","data {
                      int n_binary_tests; // number of binary tests
                      int n_ordinal_tests; // number of ordinal tests
                      int<lower=1> n_tests; // number of tests
                      int<lower=1> N; // number of individuals 
                      int<lower=1> Thr[n_tests]; // number of thresholds for each test
                      int<lower=0> y[N, n_tests]; // N individuals and n_tests tests
                    //  int r[choose(n_tests,2), 4];
                    //  int numg;
                      real se_bin_mu;
                      real se_bin_sd;
                      real sp_bin_mu;
                      real sp_bin_sd;
                }  
parameters {
                      vector[n_tests] mu[2]; 
                      real<lower=0,upper=1> u_d[N,n_tests]; // nuisance that absorbs inequality constraints
                      real<lower=0,upper=1> u_nd[N,n_tests]; // nuisance that absorbs inequality constraints
                      cholesky_factor_corr[n_tests] L_Omega_nd;
                      cholesky_factor_corr[n_tests] L_Omega_d;
                      real<lower=0,upper=1> p; // disease prevalence  
                      vector[n_tests] C_1_d;
                      vector[n_tests] C_1_nd;
                      vector<lower=0>[n_tests] c_scale_d;
                      vector<lower=0>[n_tests] c_scale_nd;
    \n",
    # create simplexes
    paste0("\t","simplex[", Thr[(n_binary_tests+ord_test_index):n_tests],"]",
           " theta_d",ord_test_index,"\n"),
    paste0("\t","simplex[", Thr[(n_binary_tests+ord_test_index):n_tests],"]",
           " theta_nd",ord_test_index,"\n"),
    "\t", "}",
    "
transformed parameters { 
                      vector[N] log_lik; 
                      vector[max(Thr)] C_d_vec[n_ordinal_tests]; 
                      vector[max(Thr)] C_nd_vec[n_ordinal_tests]; 
    ",
    # create ordered vectors by multiplying simplex by scales 
    paste0("\t","
           C_d_vec[",ord_test_index,",1:",Thr[n_binary_tests+ord_test_index],"] = comulative_sum(",
           "theta_d", ord_test_index,") * c_scale_d[",ord_test_index,"];",
           "\n"),
    paste0("\t","
           C_nd_vec[",ord_test_index,",1:",Thr[n_binary_tests+ord_test_index],"] = comulative_sum(",
           "theta_d", ord_test_index,") * c_scale_d[",ord_test_index,"];",
           "\n"),
    "{

      // Parameters for likelihood function. Based on code uploaded by Ben Goodrich which uses the 
      // GHK algorithm for generating TruncMVN. See: https://github.com/stan-dev/example-models/blob/master/misc/multivariate-probit/probit-multi-good.stan#L11
          for (n in 1:N) {
              real lp1;
              real lp0;
              vector[n_tests] z_d;
              vector[n_tests] z_nd;
              vector[n_tests] y1d;
              vector[n_tests] y1nd;
              real prev_d = 0;
              real prev_nd = 0;
            for (t in 1:n_tests) { 
             if (Thr[t] == 1) { //// Binary Likelihoods (tests w/ 1 threshold) - make sure binary tests are at the start 
                    real u_d1 =   u_d[n,t];
                    real u_nd1 = u_nd[n,t];
                    real bound_d_bin;  
                    real bound_nd_bin;
                    bound_d_bin  = phi_logit_approx(  -(  (mu[1,t])  + prev_d ) / L_Omega_d[t,t] ); 
                    bound_nd_bin = phi_logit_approx(  -(  (mu[2,t]) + prev_nd )  / L_Omega_nd[t,t] );
                      if (y[n,t] == 1) {
                        z_d[t]   = inv_phi_logit_approx(bound_d_bin + (1 - bound_d_bin)*u_d1);      
                        z_nd[t]  = inv_phi_logit_approx(bound_nd_bin + (1 - bound_nd_bin)*u_nd1);    
                        y1d[t]   = log1m(bound_d_bin);  // Jacobian adjustment
                        y1nd[t]  = log1m(bound_nd_bin); // Jacobian adjustment
                      }
                      else {  // y == 0
                        z_d[t]   = inv_phi_logit_approx(bound_d_bin*u_d1);
                        z_nd[t]  = inv_phi_logit_approx(bound_nd_bin*u_nd1);
                        y1d[t]   = log(bound_d_bin);  // Jacobian adjustment
                        y1nd[t]  = log(bound_nd_bin); // Jacobian adjustment
                      }
                                 //  if (t < n_binary_tests)   prev_d    = L_Omega_d[t+1,1:t] * head(z_d,t);  
                                 //  if (t < n_binary_tests)   prev_nd  = L_Omega_nd[t+1,1:t] * head(z_nd,t);
            }
              else {  // ordinal likelihoods (tests w/ > 1 threshold)               
                    real u_d1 =   u_d[n,t];
                    real u_nd1 = u_nd[n,t];
                    vector[Thr[t]] bound_d; 
                    vector[Thr[t]] bound_nd;
                    bound_d[1:Thr[t]]  =  phi_logit_approx_vec( ( C_d_vec[t-n_binary_tests,1:Thr[t]]   - ( mu[1,t]  + prev_d  ) ) /   L_Omega_d[t,t]  ); // Se
                    bound_nd[1:Thr[t]] =  phi_logit_approx_vec( ( C_nd_vec[t-n_binary_tests,1:Thr[t]]  - ( mu[2,t]  + prev_nd ) ) /   L_Omega_nd[t,t]   ); // FP
                    {
                     int K =   (Thr[t] + 1) ; 
                        if (y[n,t] == K) {
                          z_d[t]   = inv_phi_logit_approx(bound_d[K-1] + (1 - bound_d[K-1])*u_d1);      
                          z_nd[t]  = inv_phi_logit_approx(bound_nd[K-1] + (1 - bound_nd[K-1])*u_nd1);    
                          y1d[t]   = log1m(bound_d[K-1]);  // Jacobian adjustment
                          y1nd[t]  = log1m(bound_nd[K-1]); // Jacobian adjustment
                        }
                       for (J in 2:(K-1)) {
                        if (y[n,t] == J) {
                          z_d[t]   = inv_phi_logit_approx(bound_d[J-1] + (bound_d[J] - bound_d[J-1])*u_d1);      
                          z_nd[t]  = inv_phi_logit_approx(bound_nd[J-1] + (bound_nd[J] - bound_nd[J-1])*u_nd1);    
                          y1d[t]   = log(  (bound_d[J] - bound_d[J-1]) );  // Jacobian adjustment
                          y1nd[t]  = log( (bound_nd[J] - bound_nd[J-1]) );  // Jacobian adjustment
                        }
                       }
                        if (y[n,t] == 1) {
                        z_d[t]   = inv_phi_logit_approx(bound_d[1]* u_d1);
                        z_nd[t]  = inv_phi_logit_approx(bound_nd[1]* u_nd1);
                        y1d[t]   = log(bound_d[1]); // Jacobian adjustment
                        y1nd[t]  = log(bound_nd[1]); // Jacobian adjustment
                      }
                    }
              }
                             if (t < n_tests)   prev_d    = L_Omega_d[t+1,1:t] * head(z_d,t);  
                             if (t < n_tests)   prev_nd  = L_Omega_nd[t+1,1:t] * head(z_nd,t);
              }
                            // Jacobian adjustments imply z is truncated standard normal
                            // thus utility --- mu + L_Omega * z --- is truncated multivariate normal
                            lp1  = sum(y1d)  +   bernoulli_lpmf(1 | p); 
                            lp0  = sum(y1nd) +   bernoulli_lpmf(0 | p); 
                           log_lik[n] =  log_sum_exp(lp1 , lp0); 

            }

          } 
model {
          
          C_1_d ~ normal(0,1);
          C_1_nd ~ normal(0,1);",
    paste0("\t","
           theta_d",ord_test_index," ~ dirichlet(","dirichlet_prior_vec_d_",ord_test_index,");",
           "\n"),
    "           c_scale_d ~ normal(0,5);
           c_scale_nd ~ normal(0,5);
          
          mu[1,1] ~ normal(se_bin_mu, se_bin_sd); 
          mu[2,1] ~ normal(sp_bin_mu, sp_bin_sd);  
          
          for (t in 2:n_tests) 
             mu[, t] ~ normal(0,1);

          L_Omega_d  ~ lkj_corr_cholesky(4);
          L_Omega_nd ~ lkj_corr_cholesky(4);

            for (n in 1:N) 
               target +=  log_lik[n]; 
}
    
generated quantities {
          vector[n_binary_tests] Se_bin; 
          vector[n_binary_tests] Sp_bin; 
          vector[max(Thr)] Se_ord[n_ordinal_tests]; 
          vector[max(Thr)] Sp_ord[n_ordinal_tests]; 
          corr_matrix[n_tests] Omega_nd;
          corr_matrix[n_tests] Omega_d;
      
            Omega_d  =  multiply_lower_tri_self_transpose(L_Omega_d);
            Omega_nd =  multiply_lower_tri_self_transpose(L_Omega_nd);
      
      
            //  Estimates for binary tests
         for (t in 1:n_binary_tests) {
            Se_bin[t]  =        phi_logit_approx( mu[1,t]  );
            Sp_bin[t]  =    1 - phi_logit_approx( mu[2,t]  );
         }
         
            //  Estimates for ordinal tests
         for (t in 1:n_ordinal_tests) {
           for (k in 1:Thr[t+ n_binary_tests]) {
            Se_ord[t,k]  =     1-   phi_logit_approx( C_d_vec[t,k]  -  mu[1,t + n_binary_tests]  );
            Sp_ord[t,k]  =     phi_logit_approx( C_nd_vec[t,k] -  mu[2,t + n_binary_tests]  );
           }
         }
   

  }"))

stan_model

Very cool

Make sure that all the individual pieces create the strings you want, and after that try cat’ing them together, adding one new piece at a time.

I wonder if you have a bunch of strings and then a list of strings; maybe that list needs to be collapsed to a single string so that cat works correctly?
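Actually, looking again: cat() prints to the console but returns NULL, so paste(cat(...)) gives character(0) even though the printed output looks right. Something like this sketch (function and variable names are just illustrative) builds the code as a string with paste0() and uses cat() only for display:

# cat() returns NULL invisibly, so this prints but x is character(0)
x <- paste(cat("parameters {\n"))

# build the code as a string instead, and cat() it only to inspect it
make_simplex_decls <- function(lengths) {
  decls <- paste0("  simplex[", lengths, "] theta_", seq_along(lengths), ";")
  paste0("parameters {\n", paste0(decls, collapse = "\n"), "\n}\n")
}

stan_code <- make_simplex_decls(c(5, 10, 25))
cat(stan_code)  # pretty-printed view; stan_code itself is a single string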


Does anyone have any ideas for this? I would prefer a solution done entirely in a single Stan program, as opposed to generating Stan code with string manipulation as @ahartikainen suggested.

The Stan manual (10.7 Unit Simplex | Stan Reference Manual) shows how Stan creates unit simplexes under the hood, but I’m not sure how to code this for my purposes.
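If I’m reading it right, the transform there is stick breaking: for a size-$K$ simplex,

$$z_k = \operatorname{logit}^{-1}\!\left(y_k + \log\frac{1}{K-k}\right), \qquad x_k = \Big(1 - \sum_{i<k} x_i\Big) z_k, \qquad x_K = 1 - \sum_{i<K} x_i,$$

with absolute Jacobian determinant

$$|J| = \prod_{k=1}^{K-1} z_k\,(1-z_k)\Big(1 - \sum_{i<k} x_i\Big),$$

so I assume the log of this is what needs adding to target, but I don’t see how to write that for my padded arrays.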

Have you checked the manual section on ragged arrays?


@mjhajharia was an intern for @Bob_Carpenter this summer working on different transforms, one of which was the simplex. They implemented Stan’s stick-breaking method and Jacobian adjustment as part of a Stan model here:


Yeah, this should work. Just make sure you replace the target_density_lp with whatever you want; in our case it’s just a Dirichlet.
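Concretely, for a single simplex with a Dirichlet target, a minimal sketch of that model (paraphrased from memory, not the exact code in the repo) is:

data {
  int<lower=2> K;                // simplex size
  vector<lower=0>[K] alpha;      // Dirichlet prior parameters
}
parameters {
  vector[K - 1] y;               // unconstrained
}
transformed parameters {
  simplex[K] x;                  // the hand-built simplex
  real log_abs_jac = 0;
  {
    real stick = 1;              // how much of the simplex is left
    for (k in 1:(K - 1)) {
      real z = inv_logit(y[k] - log(K - k)); // shifted logit, as in the manual
      x[k] = stick * z;
      log_abs_jac += log(stick) + log(z) + log1m(z);
      stick -= x[k];
    }
    x[K] = stick;                // the leftover stick closes the simplex
  }
}
model {
  target += log_abs_jac;         // Jacobian adjustment for y -> x
  x ~ dirichlet(alpha);          // the target density; swap in what you need
}

For the ragged case you’d run the same loop once per simplex with K = lengths[n], filling simplexes[n, 1:lengths[n]].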

Thanks - I’m having trouble understanding why x is declared as a simplex in transformed parameters. I want to construct one manually, i.e. without using simplex anywhere in the code. Couldn’t x just be declared as a vector in transformed parameters and it would still work?

Correct. Outside of parameters, the constrained types only serve as sanity checks rather than actual transforms.


Yeah, sanity checks exactly. You wouldn’t really run into issues for canonical transforms like the ALR, but if you’re tweaking things yourself and constructing a mapping it might be useful to keep the constrained declaration; if nothing else, it helps keep track of what’s happening in a large model.