Simultaneously estimating IIV and IOV for the first cycle in a pharmacokinetic NLME model using Torsten

Hello everyone! I am currently setting up Stan with Torsten for a pharmacokinetic NLME (popPK) model. Huge thanks to the Stan and Torsten developers for all their contributions!

A bit of context: We work with an NLME model for an oncology drug, but we do not currently plan to perform a full Bayesian population parameter estimation. Instead, we aim to use an existing fit obtained via Laplacian maximum likelihood estimation in NONMEM. Our goal is to use Stan to individualize the parameters for a given patient once drug concentration measurements become available for that individual.

The application scenario is a model-informed precision dosing app, where a clinician inputs all relevant covariates and measurements and receives a prediction with a 95% credible interval. The NONMEM model provides estimates for the variance of inter-individual (IIV) and inter-occasion variability (IOV) for parameters such as clearance. One occasion represents one cycle of drug administration. Typically, people either use the mode of the posterior via MAP estimation or approximate the posterior as a normal distribution. We aim to replace these biased approaches with a full Bayesian framework using Stan and Torsten.

Here is my current model:

/////////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////// PREAMBLE ///////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////////////

// stan representation of run5025
// 1. Based on:             run5013
// 2. Description:          train / log / 3cmt / iiv:CL/Q3 / iov:CL/Q3 / cov: eGFR on CL (pow) / M3 / no CWRES > 5 / LAPLACIAN
// 3. Structural model:     3cmt
// 4. Dispositon:           fo
// 5. IIV:                  CL/Q3
// 6. IOV:                  CL/Q3
// 7. Covariates:           eGFR on CL (pow)
// 8. RUV:                  exp
// 9. Estimation:           LAPLACIAN
// 10. Author:              Marian Klose

/////////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////// FUNCTIONS BLOCK ////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////////////

// functions data block
functions {
    vector ode_rhs(
        real t,
        vector x,
        array[] real parms,
        array[] real x_r,
        array[] int x_i
    ){
        // extract parameters from the parms object in the defined order: CL/V1/Q2/V2/Q3/V3
        real CL = parms[1];
        real V1 = parms[2];
        real Q2 = parms[3];
        real V2 = parms[4];
        real Q3 = parms[5];
        real V3 = parms[6];

        // define rate constants based on params
        real k10 = CL / V1;
        real k12 = Q2 / V1;
        real k21 = Q2 / V2;
        real k13 = Q3 / V1;
        real k31 = Q3 / V3;

        // define ODE system
        vector[3] y;
        y[1] = - k10 * x[1] - k12 * x[1] - k13 * x[1] + k21 * x[2] + k31 * x[3];
        y[2] = k12 * x[1] - k21 * x[2];
        y[3] = k13 * x[1] - k31 * x[3];

        // return the amount in the compartments
        return y;
    }
}

/////////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////// DATA BLOCK /////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////////////

// input data block (passed from R)
data {
    // events
    int<lower = 1> nEvents;                 // number of events

    // observations
    int<lower = 1> nObs;                    // number of observations
    array[nObs] int<lower = 1> iObs;        // indices of events which are observations
    vector<lower = 0>[nObs] cObs;           // observed concentration (the dv) used to generate posterior

    // NONMEM standard event items
    array[nEvents] int<lower = 1> cmt;      // compartment number
    array[nEvents] int evid;                // event id (0=observation, 1=dose, 2=other)
    array[nEvents] int addl;                // number of additional identical doses given
    array[nEvents] int ss;                  // is it steady-state dosing? (0=false, 1=true)
    array[nEvents] real amt;                // dose amount administered at this time
    array[nEvents] real time;               // time of observation/administration
    array[nEvents] real rate;               // rate of drug infusion (0 for bolus administration)
    array[nEvents] real ii;                 // interdose interval: time between additional doses

    // typical parameter values
    real<lower = 0> TVCL;                   // typical value of central clearance
    real<lower = 0> TVV1;                   // typical value of central volume
    real<lower = 0> TVQ2;                   // typical value of shallow intercompartmental clearance
    real<lower = 0> TVV2;                   // typical value of shallow peripheral volume
    real<lower = 0> TVQ3;                   // typical value of deep intercompartmental clearance
    real<lower = 0> TVV3;                   // typical value of deep peripheral volume

    // fixed covariate effects
    real<lower = 0> TH_EGFR_CL;             // fixed effect of creatinine clearance on clearance

    // measured covariates 
    array[nEvents] real<lower = 0> EGFR;    // estimated glomerular filtration rate [ml/min]

    // cycle information
    array[nEvents] int<lower = 0> CYCLE;    // cycle number, needed for IOV

    // iiv standard deviations
    real<lower = 0> IIV_SD_CL;              // IIV (sd) on central clearance
    real<lower = 0> IIV_SD_Q3;              // IIV (sd) on intercompartmental clearance Q3

    // iov standard deviations
    real<lower = 0> IOV_SD_CL;              // IOV (sd) on central clearance
    real<lower = 0> IOV_SD_Q3;              // IOV (sd) on intercompartmental clearance Q3

    // ruv parameters
    real<lower = 0> EXP_RUV_SD;             // RUV (sd) for exponential error model
}

/////////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////// TRANSFORMED DATA BLOCK /////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////////////

// transformed data block
transformed data {
    vector[nObs] logCObs = log(cObs);           // log-transformed observations
    int nTheta = 6;                             // number of parameters for ODE system
    int nCmt = 3;                               // number of compartments (1 = central, 2 = shallow peripheral, 3 = deep peripheral)
}

/////////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////// PARAMETERS BLOCK ///////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////////////

// parameters block
parameters {
    // clearance
    real ETA_CL;              // ind. iiv estimate
    real KAPPA_CL_C1;         // ind. iov estimate (cycle 1)

    // intercompartmental clearance q3
    real ETA_Q3;              // individual iiv q3 estimates
    real KAPPA_Q3_C1;         // individual iov q3 estimates
}

/////////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////// TRANSFORMED PARAMETERS BLOCK ///////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////////////

// transformed parameters block
transformed parameters {
    // predicted drug concentrations
    matrix<lower = 0>[nCmt, nEvents] x;  // drug amounts in each compartment over time
    row_vector<lower = 0>[nEvents] cHat; // row vector, one element per event
    vector<lower = 0>[nObs] cHatObs;     // column vector, one element per *observation*

    // retrieve correct KAPPA_CL and KAPPA_Q3 values
    real KAPPA_CL = 0;
    real KAPPA_Q3 = 0;

    // define the time-varying parameter matrix theta [nEvents, nTheta] (n rows = number of time points, per row one set of nTheta params)
    // see docs for details on 1d and 2d theta elements (2d needed for time-varying covariates)
    array[nEvents, nTheta] real theta;
    for (i in 1:nEvents) {
        // extract correct kappas 
        // !!FIX ME: ADJUST KAPPA ACCORDING TO CYCLE; FOR NOW CYCLE 1 ONLY!!
        if (CYCLE[i] == 1) {
            KAPPA_CL = KAPPA_CL_C1;
            KAPPA_Q3 = KAPPA_Q3_C1;
        } else {
            KAPPA_CL = KAPPA_CL_C1;
            KAPPA_Q3 = KAPPA_Q3_C1;
        }

        // store thetas
        theta[i, 1] = TVCL * pow(EGFR[i] / 84, TH_EGFR_CL) * exp(ETA_CL + KAPPA_CL);
        theta[i, 2] = TVV1;
        theta[i, 3] = TVQ2;
        theta[i, 4] = TVV2;
        theta[i, 5] = TVQ3 * exp(ETA_Q3 + KAPPA_Q3);
        theta[i, 6] = TVV3;
    }

    // solve the ode system (x = drug amounts in each compartment over time)
    x = pmx_solve_rk45(
        ode_rhs,                          // ODE right hand side system
        nCmt,                             // number of compartments
        time,                             // time of events
        amt,                              // amount of drug administered
        rate,                             // rate of drug administered
        ii,                               // interdose interval
        evid,                             // event id
        cmt,                              // compartment number
        addl,                             // number of additional doses
        ss,                               // steady state flag
        theta,                            // individual parameters
        1e-5,                             // relative tolerance
        1e-8,                             // absolute tolerance    
        1e5                               // maximum number of steps
    );

    // compute drug concentrations
    for (i in 1:nEvents) {
        cHat[i] = x[1, i] / theta[i, 2];        // need to loop even if v1 remains constant
    }          
    cHatObs = to_vector(cHat[iObs]);            // transform to column vector & keep relevant cells only
    vector[nObs] log_cHatObs = log(cHatObs);    // log-transformed predictions
}

/////////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////// MODEL BLOCK ////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////////////

// model block
model {
    // informative priors for clearance
    ETA_CL ~ normal(0, IIV_SD_CL);              // IIV (sd) on central clearance
    KAPPA_CL_C1 ~ normal(0, IOV_SD_CL);         // IOV (sd) on central clearance (cycle 1)

    // informative priors for intercompartmental clearance Q3
    ETA_Q3 ~ normal(0, IIV_SD_Q3);              // IIV (sd) on intercompartmental clearance Q3
    KAPPA_Q3_C1 ~ normal(0, IOV_SD_Q3);         // IOV (sd) on intercompartmental clearance Q3 (cycle 1)

    // likelihood function for exponential error model (expects SD for normal distribution)
    logCObs ~ normal(log_cHatObs, EXP_RUV_SD);
}


/////////////////////////////////////////////////////////////////////////////////////////////////////
//////////////////////////////// GENERATED QUANTITIES BLOCK /////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////////////////////////////

// generated quantities block
generated quantities {
    // sample prior etas
    real prior_ETA_CL = normal_rng(0, IIV_SD_CL);           // IIV (sd) on central clearance
    real prior_KAPPA_CL_C1 = normal_rng(0, IOV_SD_CL);      // IOV (sd) on central clearance (cycle 1)
    real prior_ETA_Q3 = normal_rng(0, IIV_SD_Q3);           // IIV (sd) on intercompartmental clearance Q3
    real prior_KAPPA_Q3_C1 = normal_rng(0, IOV_SD_Q3);      // IOV (sd) on intercompartmental clearance Q3 (cycle 1)

    // sample posterior
    array[nObs] real logcObsPred;  // introduce observations for posterior
    for(i in 1:nObs) {
        logcObsPred[i] = normal_rng(log_cHatObs[i], EXP_RUV_SD);
    }
}

As you can see, we aim to obtain individual estimates for the IIV and IOV terms of the parameters CL and Q3, which are named ETA_CL, KAPPA_CL_C1, ETA_Q3, and KAPPA_Q3_C1. For simplicity, I am currently focusing only on Cycle 1. Later, we plan to estimate a separate KAPPA_x for each cycle.

This brings me to my question: In the first cycle of an individual, we still need/want to estimate both the IIV and IOV terms for a given parameter. However, without multiple cycles, it becomes mathematically challenging to separate the IIV and IOV components, as both enter the equation in the same way:

theta[i, 1] = TVCL * pow(EGFR[i] / 84, TH_EGFR_CL) * exp(ETA_CL + KAPPA_CL);

This is also evident in my mcmc_pairs() plot:

For instance, we see a strong correlation between ETA_CL and KAPPA_CL_C1. The mcmc_areas() plot shows similar behaviour: both posteriors are pushed in the same direction, with the estimate with the higher CV% (= the IIV term) allowed to deviate further from zero.

This raises a few concerns:

  1. Do I need to worry about that correlation?
  2. Is it even meaningful to attempt estimating both IIV and IOV in the first cycle, where they cannot be clearly distinguished?
  3. What would be a reasonable alternative approach?

Would it make sense to merge the variances of IIV and IOV in the first cycle - estimating a single joint term - and then separate them once we enter the second cycle? The rest of the diagnostics appear well-behaved, but in other cases I encountered chain terminations and errors, especially when the data were uninformative for CL and/or Q3.

# get summary
out$summary()
# A tibble: 643 × 10
   variable       mean   median      sd     mad       q5      q95  rhat ess_bulk
   <chr>         <dbl>    <dbl>   <dbl>   <dbl>    <dbl>    <dbl> <dbl>    <dbl>
 1 lp__       -9.01e+0 -8.67e+0  1.41    1.20   -1.17e+1 -7.39e+0  1.00    1554.
 2 ETA_CL     -1.39e-1 -1.40e-1  0.125   0.124  -3.47e-1  6.98e-2  1.00    1724.
 3 KAPPA_CL_… -7.09e-2 -7.10e-2  0.123   0.124  -2.75e-1  1.30e-1  1.00    1693.
 4 ETA_Q3      3.95e-1  3.92e-1  0.209   0.210   5.39e-2  7.44e-1  1.00    2036.
 5 KAPPA_Q3_…  3.14e-1  3.21e-1  0.205   0.204  -2.51e-2  6.39e-1  1.00    1928.
# ℹ 633 more rows
# ℹ 1 more variable: ess_tail <dbl>
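
To make the merging idea concrete, here is a rough sketch of the relevant pieces for a first-cycle-only model (ETA_STAR_CL and ETA_STAR_Q3 are just placeholder names for the combined deviations; since the ETA and KAPPA terms are independent zero-mean normals, their variances simply add):

// sketch only: a single combined IIV+IOV deviation per parameter for cycle 1
parameters {
    real ETA_STAR_CL;        // combined deviation on CL (replaces ETA_CL + KAPPA_CL_C1)
    real ETA_STAR_Q3;        // combined deviation on Q3 (replaces ETA_Q3 + KAPPA_Q3_C1)
}

model {
    // sum of two independent zero-mean normals: variances add, so sd = sqrt(sd1^2 + sd2^2)
    ETA_STAR_CL ~ normal(0, sqrt(square(IIV_SD_CL) + square(IOV_SD_CL)));
    ETA_STAR_Q3 ~ normal(0, sqrt(square(IIV_SD_Q3) + square(IOV_SD_Q3)));
}

// and in transformed parameters:
//   theta[i, 1] = TVCL * pow(EGFR[i] / 84, TH_EGFR_CL) * exp(ETA_STAR_CL);
//   theta[i, 5] = TVQ3 * exp(ETA_STAR_Q3);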

I would be very grateful for any advice. Thank you!

Marian

Tagging some people who know a lot about Torsten in case they have some ideas for you. @charlesm93 @yizhang @billg

Thanks!

Howdy! I don’t work with these types of models or Torsten (so hopefully those Jonah tagged will reply), but that sort of additive degeneracy is telling you that you can’t learn much about either individual parameter, only about the combination of the two. When I see that sort of additive degeneracy in my own models, I typically think of either adding more information to inform one or both parameters, or giving up and combining the two parameters into a single parameter. More information can be added through stronger priors, if the information is available, and/or through additional data that inform those particular parameters through another model. For example, if you had some auxiliary data that informed a model of ETA_CL and/or KAPPA_CL_C1, then you could add those data and model(s) to your existing Stan model above and perhaps better infer each individual parameter.
Michael Betancourt has a helpful writeup on this: Identity Crisis

@jd_c put their finger on the problem. You can see the non-identifiability here: the likelihood doesn’t identify ETA_CL or KAPPA_CL, only the prior does. We discuss exactly this case in the problematic posteriors chapter of the User’s Guide.

Is that conditional a typo? The two branches are identical. If that isn’t a typo, there’s no need to define these auxiliary parameters like KAPPA_CL, which is always identical to KAPPA_CL_C1.

P.S.

That explains the all-caps Fortran-style variable naming convention!

Thanks a lot for the answers, I really appreciate it!

That makes sense. I would definitely claim that the data from this first cycle allow us to inform the individual clearance in this first cycle. But it seems impossible to decide whether this individual clearance reflects a deviation of the individual from the typical population value (= inter-individual variability, IIV) or just a cycle-specific deviation (= inter-occasion variability, IOV).

Sounds reasonable. In our case we are limited to the clinically available data at a given time point and have to deal with that level of information. The clinician is still interested in a prediction, and unfortunately we cannot just wait until more data have been collected to run the model. Combining both parameters (ETA_CL and KAPPA_CL) into a single parameter for this first cycle seems to be the logical choice. Since we assume independence and a normal distribution centered around 0 for both ETA_CL and KAPPA_CL, we should be able to simply add the variances if I am not mistaken. Once we have data from at least two cycles, we should hopefully be able to go back to informing ETA_CL and KAPPA_CL separately.

This is a good idea! However, we actually have a very good idea about the prior for a given individual, since we have already studied a similar population during the NLME step and came up with the given variance estimates for the IIV and IOV parts. Deviating from these is typically hard to justify. And adding more data is (as written above) not feasible, since we deal with a real-world scenario where a prediction needs to be made at a given time point.

Thanks again for the valuable input!

Thanks for pointing this out! I will have another look at the User’s Guide. I guess it makes sense that only the prior allows for some distinction between ETA_CL and KAPPA_CL, while the likelihood only informs CL itself, not each of the two parts.

Well spotted. Not really a typo; the model is rather in a debugging state. I decided to first get it running with CYCLE = 1 data only and then adapt it for multiple cycles. NONMEM typically complains when you have conditionals without a general ELSE block at the end, so I guess it is just a habit. I already got some good input in the thread Dynamically specifying which parameters to estimate in Stan on how to do it without a conditional.
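
For reference, the direction I am heading for the multi-cycle case is roughly the following (just a sketch; it assumes an additional data item nCycles and that CYCLE is coded 1, 2, ... so it can be used directly as an index):

// sketch: one IOV deviation per cycle, selected by indexing instead of a conditional
// assumes the data block additionally declares: int<lower = 1> nCycles;
parameters {
    real ETA_CL;
    real ETA_Q3;
    vector[nCycles] KAPPA_CL;                 // one IOV term per cycle
    vector[nCycles] KAPPA_Q3;
}

transformed parameters {
    array[nEvents, nTheta] real theta;
    for (i in 1:nEvents) {
        theta[i, 1] = TVCL * pow(EGFR[i] / 84, TH_EGFR_CL)
                      * exp(ETA_CL + KAPPA_CL[CYCLE[i]]);
        theta[i, 2] = TVV1;
        theta[i, 3] = TVQ2;
        theta[i, 4] = TVV2;
        theta[i, 5] = TVQ3 * exp(ETA_Q3 + KAPPA_Q3[CYCLE[i]]);
        theta[i, 6] = TVV3;
    }
}

model {
    ETA_CL ~ normal(0, IIV_SD_CL);
    ETA_Q3 ~ normal(0, IIV_SD_Q3);
    KAPPA_CL ~ normal(0, IOV_SD_CL);          // vectorized over cycles
    KAPPA_Q3 ~ normal(0, IOV_SD_Q3);
}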

Yep, a good way to spot a NONMEM / Fortran user ;-)

@marianklose This is a cool model, and I don’t have much to add beyond what @jd_c said well.

One quick thing to bring up in the model you’ve written there - since the system of ODEs is linear, try solving it with a matrix exponential using Torsten’s pmx_solve_linode(). It’ll provide huge speedups once you’re in a place where runtimes become an issue. See here for example code for this model. Other than that, this is nice.
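
Very roughly, something along these lines (an untested sketch, so double-check the exact pmx_solve_linode() argument list against the Torsten manual and whether your version accepts an array of matrices for event-varying parameters; the functions block and the ODE tolerances then go away entirely):

// sketch: build a rate-constant matrix per event and use the matrix-exponential solver
transformed parameters {
    array[nEvents] matrix[nCmt, nCmt] K;
    for (i in 1:nEvents) {
        real k10 = theta[i, 1] / theta[i, 2];   // CL / V1
        real k12 = theta[i, 3] / theta[i, 2];   // Q2 / V1
        real k21 = theta[i, 3] / theta[i, 4];   // Q2 / V2
        real k13 = theta[i, 5] / theta[i, 2];   // Q3 / V1
        real k31 = theta[i, 5] / theta[i, 6];   // Q3 / V3
        K[i] = [[ -(k10 + k12 + k13), k21,  k31 ],
                [ k12,               -k21,  0   ],
                [ k13,                0,   -k31 ]];
    }
    matrix[nCmt, nEvents] x = pmx_solve_linode(time, amt, rate, ii, evid, cmt, addl, ss, K);
}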

@cbdavis33 Thanks a lot!

Good point! To be honest, I was not fully sure when exactly a system of ODEs counts as linear so that pmx_solve_linode() can be used. After reading a bit into the topic, I agree that this is a logical next step. Luckily the sampler is still quite fast, as we only have a single ID and a couple of parameters with quite informative priors from the IIV and IOV estimates obtained in the NONMEM run. Once multiple cycles and more data for the parameter individualization are added, the slowdown already becomes noticeable, so thanks for pointing it out!

Apologies for being late to the thread.

As pointed out, the additive IIV & IOV terms pose an identifiability problem. For the first dose cycle you don’t need to add IOV, as the remaining observations of this subject can be anchored to the first one in order to account for the variation between visits.

Since we are focusing on personalized prediction, I wonder if it makes sense to use the EBE predictions from NONMEM instead of the population typical values (TVs). This way the IIV becomes unnecessary and we only need to focus on the subject’s parameters and the IOV around them. Depending on the dosing schedule, I guess it is also worth thinking over whether there is indeed IOV on CL, or whether it is just caused by observation noise.
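
Schematically I mean something like this (just a sketch with made-up names; CL_EBE would be the post hoc clearance for this subject exported from the NONMEM run and passed in through the data block):

// sketch: center the individual clearance on the NONMEM EBE and keep only IOV around it
data {
    real<lower = 0> CL_EBE;        // post hoc (EBE) clearance for this subject from NONMEM
}
parameters {
    real KAPPA_CL;                 // occasion-level deviation around the EBE
}
model {
    KAPPA_CL ~ normal(0, IOV_SD_CL);
}
// in transformed parameters, the clearance entry would then read
//   theta[i, 1] = CL_EBE * exp(KAPPA_CL);
// i.e. the explicit IIV term drops out because CL_EBE is already individual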

@yizhang Thanks a lot for your answer!

No worries. I also just returned to the office after one week of absence.

I agree that we face some identifiability issues in the first cycle and that we have to deal with them somehow. However, I am not sure I agree with the suggested solution. To my understanding, NONMEM estimates the variance of the IOV part based on all available cycles and does not take the first cycle as some kind of reference (which would result in KAP_CL = 0 for the first cycle). I just had a look into the sdtab file, and there we can see that ETA1 (the EBE for IIV) remains constant within one ID (as expected), while KAP_CL (the EBE for IOV) changes per cycle and is unequal to 0 for the first cycle:

ID             CYCLE             CL             ETA1               KAP_CL 
1.0000E+00     1.0000E+00        1.2525E+00     -2.4793E-02        1.6011E-02
(...)
1.0000E+00     2.0000E+00        1.0585E+00     -2.4793E-02       -2.3082E-02
(...)
1.0000E+00     3.0000E+00        1.2750E+00     -2.4793E-02       -7.0252E-03

The underlying equation for CL is:

CL = THETA(1) * (EGFR_I/84)**THETA(7) * EXP(ETA(1) + KAP_CL)

Of course, ETA1 and KAP_CL are based on all observations and cycles, so during model building we do not have that identifiability issue. To my understanding, I would underestimate the variability on CL if I simply discarded the IOV part for the first cycle and focused on the IIV part alone. I would rather sum the variances of IIV and IOV for the first cycle and estimate only one posterior for the joint ETA(1) + KAP_CL term. Once we have informative data in the second cycle, I would estimate the posteriors for IIV and IOV separately. Any objections, or am I missing something?

Of course we could switch to the mode of the posterior (EBE) for IIV, but then we would miss out on important parts of the variability driven by IIV. I am also not sure how to interpret the resulting posterior predictions if some parts are fixed to the mode of the posterior (IIV) while other parts represent the full posterior (IOV).

Good point. But in our clinical use case, inter-occasion variability is quite plausible and goes beyond normal observation noise. Therefore I am quite confident that it is needed in the model.