Interaction of two monotonic effects

I’m trying to figure out how to handle the coefficient for an interaction between two ordinal predictors. Specifically, I’m trying to adapt brms’s monotonic effects approach so that increasing the level of either predictor increases the proportion of the interaction coefficient that gets applied.
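
For comparison, here’s the single-predictor case I’m generalizing from (up to scaling, this is my understanding of brms’s monotonic effect for one ordinal predictor; the increments $\xi_{i+1} - \xi_i$ form a simplex):

$$\eta_i = b\,\xi_i, \qquad \xi_1 = 0, \quad \xi_I = 1, \quad \xi_i < \xi_{i+1},$$

so $b$ is the full effect of moving from the lowest level to the highest, and $\xi_i$ is the proportion of that effect realized at level $i$.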

Let the interaction term between level $i$ of predictor 1 and level $j$ of predictor 2 be

$$\eta_{i,j} = b\,\xi_{i,j},$$

where $\Xi$ is an $I \times J$ matrix in which $\xi_{1,1} = 0$, $\xi_{I,J} = 1$, $\xi_{i,j} \in (0,1)$ for all other $i$ and $j$, $\xi_{i,j} < \xi_{i+1,j}$, and $\xi_{i,j} < \xi_{i,j+1}$. In other words, it’s a matrix of proportions that increases monotonically along every row and every column. I’m not sure if there’s a formal name for this type of matrix, but if anyone knows one I’d love to hear it.
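
For concreteness, a made-up $3 \times 3$ example satisfying all the constraints:

$$\Xi = \begin{pmatrix} 0 & 0.2 & 0.35 \\ 0.1 & 0.3 & 0.6 \\ 0.25 & 0.5 & 1 \end{pmatrix},$$

where every entry increases both down its column and across its row, with the corners pinned at $\xi_{1,1} = 0$ and $\xi_{3,3} = 1$.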

Does this seem like a reasonable way to construct a monotonic interaction?

Secondly, does anyone have an idea for how to efficiently construct this sort of matrix in Stan? I’ve written a function for it, but I feel like it’s horribly inefficient.

Essentially, it

  1. Prepends a 0 and appends a 1 to the vector of raw proportions, then reshapes the result into a matrix
  2. Uses a stick-breaking construction down each column (excluding the last row) to build a partial simplex, then cumulatively sums it (see the worked example after this list)
  3. Builds a cumulatively summed simplex along the last row
  4. Starting from xi[nrow-1, ncol-1] and moving up and to the left, multiplies each cell by the (soft) minimum of the cells below and to the right, so each cell ends up strictly smaller than both of those neighbors.
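
To make step 2 concrete, take one column of a 3-row matrix with raw proportions $p_1, p_2 \in (0,1)$. The stick-breaking pieces and their cumulative sums are

$$s_1 = p_1, \quad s_2 = (1 - p_1)\,p_2, \qquad \xi_1 = s_1, \quad \xi_2 = s_1 + s_2 = 1 - (1 - p_1)(1 - p_2),$$

so each column is strictly increasing and tops out strictly below 1.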

I used a log-sum-exp-based soft minimum function in the hope that it wouldn’t upset the autodiff. Here are the functions:

functions{
  real lse_min(real a, real b, real sharpness) {
    // lse_min returns a smooth lower bound slightly below min(a, b);
    // as sharpness increases, lse_min(a, b, sharpness) -> min(a, b)
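    // identity: lse_min(a, b, s) = min(a, b) - log1p(exp(-s * |a - b|)) / s,
    // e.g. lse_min(0.3, 0.5, 1000) sits below 0.3 by roughly exp(-200) / 1000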
    return -log_sum_exp([-a * sharpness, -b * sharpness]) / sharpness;
  }
  matrix monotonic_interaction(vector probs, int nrow, int ncol) {
    // probs should have length nrow * ncol - 2: the two corners are fixed
    matrix[nrow, ncol] working =
      to_matrix(
        append_row(append_row([0]', probs), [1]'), nrow, ncol, 1);
    matrix[nrow, ncol] out;
    // Stick-breaking to build a simplex down each column, then cumulative sums
    
    { // Part 1: assemble and cumsum simplices for each column and the last row
      real last_prob = 0;
      real acc = 0;
      real current;
      for(j in 1:ncol){
        for(i in 1:(nrow - 1)) {
          // stick piece: current prob times prod(1 - previous probs), on the log scale
          current = working[i,j];
          acc += exp(last_prob + log(current));
          working[i,j] = acc;
          out[i,j] = acc;
          last_prob += log1m(current);
        }
        last_prob = 0;
        acc = 0;
      }
      // last row is its own simplex; note that we're not assigning to working here
      for(j in 1:(ncol - 1)) {
        current = working[nrow,j];
        acc += exp(last_prob + log(current));
        out[nrow,j] = acc;
        last_prob += log1m(current);
      }
      // the (1,1) and (nrow,ncol) corners are fixed at 0 and 1
      out[1,1] = 0;
      out[nrow,ncol] = 1;
      working[1,1] = 0;
      working[nrow,ncol] = 1;
    }
    { // Part 2: make sure that xi[i,j] < min(xi[i+1,j], xi[i,j+1])
      real below;
      real right;
      // traverse from the bottom-right corner up and to the left, so the cells
      // below and to the right are already final when each cell is scaled
      for(j in 1:(ncol - 1)) for (i in 1:(nrow - 1)) {
        // nrow-i runs down from nrow-1 to 1 (likewise ncol-j)
        below = out[nrow-i+1, ncol-j];
        right = out[nrow-i, ncol-j+1];
        // scale by a soft minimum so the cell stays strictly below both neighbors
        out[nrow-i, ncol-j] = working[nrow-i, ncol-j] *
          lse_min(below, right, 1000);
      }
    }
    return out;
  }
}
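
For what it’s worth, here’s a minimal sketch of how I’d wire this into a model, with the functions block above at the top of the file. All the names (N, I_lev, J_lev, p1, p2, probs, sigma) and the Gaussian likelihood are placeholders, just for illustration:

data {
  int<lower=1> N;
  int<lower=2> I_lev;  // number of levels of predictor 1
  int<lower=2> J_lev;  // number of levels of predictor 2
  array[N] int<lower=1, upper=I_lev> p1;  // observed level of predictor 1
  array[N] int<lower=1, upper=J_lev> p2;  // observed level of predictor 2
  vector[N] y;
}
parameters {
  real b;  // effect of moving from level (1,1) to level (I,J)
  vector<lower=0, upper=1>[I_lev * J_lev - 2] probs;  // raw proportions
  real<lower=0> sigma;
}
transformed parameters {
  matrix[I_lev, J_lev] xi = monotonic_interaction(probs, I_lev, J_lev);
}
model {
  vector[N] eta;
  for (n in 1:N) {
    eta[n] = b * xi[p1[n], p2[n]];
  }
  b ~ normal(0, 1);
  probs ~ beta(1, 1);
  sigma ~ exponential(1);
  y ~ normal(eta, sigma);
}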

I haven’t had a chance to test this out yet, but I’d love to hear any thoughts on the concept.