Modelling monotonic binomial response

The following is intended to model a monotonically increasing binomial response. I am purposely not assuming any parametric form for the response profile. There is a horseshoe prior on q[2:N] to shrink small increments towards zero. The x are equally spaced, and the non-negative increments b are cumulatively summed to give a non-decreasing eta. I am most likely stuck with the small number of unique x values.

data{
  int<lower=1> N;          // number of doses
  int y[N];                // responders
  int n[N];                // trials
  vector[N] x;             // dose levels (equally spaced)
}
transformed data{
  matrix[N,N] S;           // lower-triangular ones: S * q is a cumulative sum
  for(i in 1:N){
    for(j in 1:N){
      if(i>=j) S[i,j]=1;
      else S[i,j]=0;
    }
  }
}
parameters{
  real v0;                   // baseline (logit scale)
  vector<lower=0>[N-1] b;    // non-negative increments
  vector<lower=0>[N-1] lam;  // local shrinkage scales
  real<lower=0> tau;         // global shrinkage scale
  real<lower=0> c2;          // slab scale
}
transformed parameters{
  vector[N] eta;
  vector[N] q;
  vector[N-1] lamhat;        // regularised local scales
  for(i in 1:(N-1)){
    lamhat[i] = c2 * lam[i]/sqrt(c2 + pow(tau, 2) * pow(lam[i], 2));
  }
  q[1] = v0;
  q[2:N] = b * tau .* lamhat;
  eta = S * q;               // non-decreasing by construction
}
model{
  y ~ binomial_logit(n, eta);
  v0 ~ normal(0, 5);
  b ~ normal(0, 1);          // half-normal (lower bound at 0)
  lam ~ cauchy(0, 1);        // half-Cauchy
  c2 ~ inv_gamma(2, 4);
  tau ~ cauchy(0, 0.25);     // half-Cauchy
}
generated quantities{
  vector[N] theta;
  theta = inv_logit(eta);
}
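As a quick sanity check on the cumulative-sum construction, here is a minimal NumPy sketch (Python rather than Stan; the values of q are invented) showing that S * q is non-decreasing whenever the increments after the first entry are non-negative:

```python
import numpy as np

# Hypothetical q: q[0] is the unconstrained baseline, the remaining
# entries are non-negative increments (b * tau .* lamhat in the model).
q = np.array([-1.0, 0.0, 0.5, 0.0, 0.75, 0.25])

# S is lower-triangular ones, so eta = S @ q is a cumulative sum of q.
S = np.tril(np.ones((q.size, q.size)))
eta = S @ q

# eta = [-1, -1, -0.5, -0.5, 0.25, 0.5]: each step adds a
# non-negative amount, so eta is non-decreasing.
assert np.allclose(eta, np.cumsum(q))
assert np.all(np.diff(eta) >= 0)
```

Any flat segment in the response corresponds to increments shrunk to (near) zero, which is what the horseshoe prior on q[2:N] is meant to encourage.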

Using the following (totally arbitrary) data, the model samples OK and the parameter values appear mostly sensible (or at least what I would expect from this data). However, the predicted values \theta_j = \text{inv_logit}(\eta_j) do not seem to capture the flat portions of the curve well. Additionally, I do not have a lot of experience with these types of priors, so I am not certain that the implementation is correct.

y1 <- c(50,50,50,70,70,70)
x1 <- seq_along(y1)-1
n1 <- rep(100, length(x1))
dat <- list(N = length(x1), y = y1, n = n1, x = x1)
mod1 <- rstan::stan_model(file = "inst/stan/mono1.stan", auto_write = TRUE)
fit1 <- rstan::sampling(mod1,
                          data = dat,
                          chains = 1, thin = 1, iter = 6000, refresh = 1000, warmup = 1000,
                          control = list(adapt_delta = 0.97, max_treedepth = 12))

post <- rstan::extract(fit1)
pred <- sapply(1:dat$N, function(j){
  c(mu = mean(post$theta[, j]), quantile(post$theta[, j], probs = c(0.025, 0.975)))
})
par(las=1, bty="l", cex.axis=1, cex.lab=1, cex.main=1,
      xaxs="i", yaxs="i", mar = c(5, 5, 3, 5))
plot(x1, y1/n1, type = "l", ylim = c(0.3, 0.9))
lines(x1, pred["mu", ], col = 2)
lines(x1, pred["2.5%", ], lty = 2, col = 2)
lines(x1, pred["97.5%", ], lty = 2, col = 2)
points(x1, pred["mu", ], col = 2)


With increasing sample size, e.g. 300 for each arm (see below), the model does a better job of recovering the true values, but it is still not ideal.


I am wondering if anyone can suggest a better alternative or perhaps spot an error in the implementation? I have had a look into monotonic splines, but so far I haven’t found anything that works any better.


Hi, sorry for not responding earlier.

I can offer a few quick observations, but not a complete solution - sorry.

  1. The issue would be much easier to resolve if you worked with a simpler model where you have only the monotonic effect and nothing else.

  2. Maybe this is just a mismatch between the model and data: the model assumes a monotonically increasing probability, so it has to fit imperfectly to data that has non-increasing segments. It is usually sensible to create simulated data that matches the model exactly (draw all parameters from the prior, compute the transformed parameters, draw observations).

  3. If you haven’t read it already, the brms vignette on monotonic effects might be helpful. Note that brms uses the simplex type to define the “steps” of the monotonic increase.
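For point 2, the simulate-then-refit idea can be sketched as follows (Python rather than Stan, purely illustrative; it draws parameters from the priors in the model above, where half-Cauchy and half-normal draws are taken as absolute values):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6

# Draw parameters from the priors in the model.
v0 = rng.normal(0, 5)                       # baseline
b = np.abs(rng.normal(0, 1, N - 1))         # half-normal increments
lam = np.abs(rng.standard_cauchy(N - 1))    # half-Cauchy local scales
tau = np.abs(0.25 * rng.standard_cauchy())  # half-Cauchy global scale
c2 = 1 / rng.gamma(2, 1 / 4)                # inv_gamma(2, 4) slab scale

# Compute the transformed parameters exactly as in the model.
lamhat = c2 * lam / np.sqrt(c2 + tau**2 * lam**2)
q = np.concatenate([[v0], b * tau * lamhat])
eta = np.cumsum(q)                          # S * q
theta = 1 / (1 + np.exp(-eta))              # inv_logit

# Draw observations; refitting the Stan model to y_sim should
# then recover theta within posterior uncertainty.
y_sim = rng.binomial(100, theta)
```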

Best of luck with your model!


@martinmodrak thank you for the response, and apologies for my own delay. The particular context I am considering is one where it is reasonable to assume monotonicity, but it isn’t guaranteed. In fact, there is a chance that we will see no association between dose and response. However, a shape constraint makes it easier to define a minimum non-inferior dose. Specifically, if we have \phi_k = \theta_k - \theta_K + \Delta \ge 0 for non-inferiority of dose k relative to dose K, then if we assume monotonicity (and some additional assumptions at the boundaries) we can define the minimum non-inferior dose as x^\prime_k \triangleq \text{min} \{ x_k | \phi_k \ge 0 \land \phi_{k-1} <0 \}.
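To make that decision rule concrete, here is a small numeric sketch (Python; the theta values and margin Delta are invented for illustration):

```python
import numpy as np

# Hypothetical posterior means for theta at doses k = 0..K (0-indexed).
theta = np.array([0.45, 0.55, 0.66, 0.69, 0.70, 0.70])
x = np.arange(theta.size)   # dose labels
delta = 0.05                # non-inferiority margin

# phi_k = theta_k - theta_K + Delta; dose k is non-inferior if phi_k >= 0.
phi = theta - theta[-1] + delta

# Minimum non-inferior dose: smallest x_k with phi_k >= 0 and phi_{k-1} < 0.
ok = phi >= 0
idx = next(k for k in range(1, theta.size) if ok[k] and not ok[k - 1])
print(x[idx])  # → 2
```

Under monotonicity, once phi crosses zero it stays non-negative, so this crossing point is unique; without the shape constraint there could be several such crossings.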

I have tried various simpler models, including the simplex approach from @paul.buerkner (and also @Bob_Carpenter, I think), which I have used previously and personally found to be a good way to model monotonicity. Ultimately, though, I have returned to a Gaussian process, which seems to work well for a wide range of shapes and (predictably) is where I started with this problem. I also implemented the shape-constrained GP discussed by Riihimaki and @avehtari in their 2010 paper, and I will post that later as I have a follow-up question on the formulation of that model.


That is likely to have more bias for the constant parts (as demonstrated in the paper) than your horseshoe approach. To reduce bias, there needs to be a lot of prior mass on functions with constant parts.
