Precise but wrong results for a drug interaction model

Hello All,

I have been trying to do a simulation study of a two-stage model on drug interaction using Stan.

Here is the issue: while I can obtain reasonable results from the first stage, the estimate of the interaction term (which is estimated in the second stage) is biased most of the time. Sometimes (this happens to some parameter values chosen for the interaction term) when the interaction term is correctly estimated, some other parameter estimations turn out to be biased. (Kind of like a seesaw). Under both cases, though biasedness presents, the variance of the estimated parameters are sufficiently low.

The traceplots, posterior densities, and autocorrelation plots all look good. I did a log_prob() diagnosis, and found the log posterior at the true value to be -29692.97. Whereas, the log posterior value at the means of the posterior distributions to be 501.
.

I suspect the problem can be in one (or more) of the following:

  1. Data generation process;
  2. Newton’s method root-solver within Stan (most likely one I think!);
  3. Jacobian transformations somewhere.

In essence, the model description is as the following.

Stage One
Two mono-therapies are given to individuals with drug levels d_i (i = 1, 2). The drug effects are modeled by two independent Emax models,
y_i = \frac{y_{max, i} \cdot (\frac{d_i}{D_{m,i}})^{m_i}}{1 + (\frac{d_i}{D_{m,i}})^{m_i}}.
Further,
\log(\frac{y}{y_{max, i} -y}) = \beta_{0,i} + \beta_{1,i}*\log(d_i) + \epsilon,
where \beta_{0,i} = -m_i*\log(D_{m,i}), \beta_{1,i} = m_i and \epsilon \sim \mathcal{N} (0, \sigma_i).

Stage Two
When combining drug A and B, there is often an interaction effect. In Zhao et al. (2012) (link: https://www.tandfonline.com/doi/abs/10.1080/19466315.2012.707087?journalCode=usbr20), they model the interaction term using a response surface, which is my ultimate goal. For now, to make the problem simple, I’m only using a single constant as my interaction term, \tau.

The interaction term is modeled as the following,
\frac{d_1}{\exp(-\frac{\beta_{0,1}}{\beta_{1,1}})(\frac{y}{y_{max,1}-y} )^{1/\beta_{1,1}}} + \frac{d_2}{\exp(-\frac{\beta_{0,2}}{\beta_{1,2}})(\frac{y}{y_{max,2}-y} )^{1/\beta_{1,2}}} = exp(\tau).

Moreover, we assume y_{max,1} = y_{max,2} = y_{max} and z = \log( \frac{y}{y_{max}-y}), so we can rewrite the above equation into,
\frac{d_1}{\exp(-\frac{\beta_{0,1}}{\beta_{1,1}})(z )^{1/\beta_{1,1}}} + \frac{d_2}{\exp(-\frac{\beta_{0,2}}{\beta_{1,2}})(z )^{1/\beta_{1,2}}} = exp(\tau).

Since it is an implicit function about z, we solve z numerically using a Newton’s method in Stan once all the parameters are sampled in one iteration. The solved value, \hat{z}, becomes the mean of the distribution of \log( \frac{y}{y_{max}-y}), expressed in the following relationship,
\log \frac{y}{y_{max} - y} = \hat{z} + e,
where e \sim \mathcal{N}(0, \sigma^2).

I include the data generation functions, parameter values used, function calls and the stan file. I know it is a more involved problem, so I’d truly appreciate any bit of help.

LaModel_help.stan (3.0 KB)
LaModelCall_help.R (6.1 KB)
LaModelFunc_help.R (6.4 KB)

I’ll try to take a closer look at the problem, but first thing that comes to my mind: is there a reason why you are using a custom Newton solver, instead of Stan’s built-in algebraic solver?

That’ll be great, thanks. Yeah, I tried the Stan solver, and it didn’t work very well. I also tried bisection method, but Newton’s method is by far the most stable and fastest.

I see. Do you have more precise diagnostics? Was it slow, did it not converge, did it return inaccurate results?

I might add a built-in Newton solver, but before doing that, I would like to better understand how it would compare to Powell’s method (currently implemented in Stan), so any motivating problem is helpful.

I remembered that the Stan solver was sensitive about the initial values. And it was either taking forever to run (and most of time it just kept going if I didn’t stop it) and/or the sampling results had a lot divergent samples after warm-up. I didn’t take a close look at if the solver was producing accurate results, since the Newton’s one is just a lot faster. It’ll be cool to see a potential Newton’s solver get incorporated.

From a quick glance at your code, I gather you simulated data with R.

Have you tried simulating data with your Stan program, using the option algorithm = 'fixed_param'? To use this option, you need to specify the correct (or “true”) parameter values as your initial estimates.

You can then compare the data you simulate with your Stan model to what you simulated in R. You could also look at the data your model produces with Stan’s built-in algebraic solver, as a benchmark for your Newton solver. If all simulated data are in agreement, I then suggest fitting the Stan model to the data simulated with Stan.

If the bias persists, despite the simulation working fine, then this means we have an inference problem. In that case, there are various diagnostics we can look at. Note that, as you’ve pointed out, bias is a concern here because of the small variance. There are more formal diagnostics we can do to make sure the posterior as a whole correct, notably by using prior predictive checks.

Hope this helps and apologies for the delay in my reply.

It should be possible to externally interface with a solver.

http://mc-stan.org/rstan/articles/external.html

I thought about recently using a L-BFGS solver that way.

Also I wonder if the usage of a solver
not causes discontinuity in the gradient calculation and therefore causes some a problem in Stan
NUTS algorithm?

Do you have an specific example where that might be a concern?

I don’t want to hijack that posting.
Short answer: Having sensor data as a probability and a model with base parameters. The probability is decoded using that model parameters with the solver, then Sensor fusion is applied and the target posterior probability calculated. The solver gives a point estimate. Can that point estimate reunified with the model harm the NUTS algorithm?

Good call. One more reply and we’ll leave at that then.

I’m struggling to follow your example, but as long as the target (posterior) is differentiable with respect to all the parameters in the model, it’s fine. I can’t quite tell from your example how using the solver would break differentiability.

Thanks for the data generation suggestion. Indeed, I found there is something wrong with my Newton’s solver. So I’m thinking about how to fix it. Before doing that, I want to give the Stan built-in solver another try. However, I’m not sure if it can handle an implicit function. The equation where I want to solve for z is

\frac{d1}{ \exp(\frac{z-\beta_{01}}{\beta_{11}})} + \frac{d2}{ \exp(\frac{z-\beta_{02}}{\beta_{12}})} = \tau,
where d1, d2 are “x_r[, 1]” and “x_r[, 2]” respectively, and others are parameters.

How should I code this equation as a system that algebra_solver would accept?
Thanks!!

Implicit functions do not, a priori, pose a problem.

Just to be sure, we’re solving for z, right? Unless I’m missing something, this should be fairly straightforward.

The first step is to specify your equation as a root-finding problem, that is solve f(z) = 0 for z. To do this, make f = L.H.S - R.H.S of your equation. Now it’s just a matter writing of f in the functions block of Stan. To do this, we’ll store d1 and d2 in x_r, as you’ve done, and the other coefficients in theta. See chapter 17 in http://www.stat.columbia.edu/~gelman/bda.course/_book/.

Jumping in on your exchange: using the solver does not prevent Stan from accurately calculating gradients. There are pathological cases where the gradients will not be well defined, for instance when your equation is degenerate, or when f(z, \theta) is discontinuous with respect to either the unknown z or the parameters \theta. In such a case, your posterior is discontinuous and you pick up very distinct symptomes (which I believe are not observed here).

I ran into such a problem with a model which embedded a KKT problem (an algebraic equation with an equality and an inequality constraint). See Non-smooth posterior and KKT problem.