Time-varying covariate implementation

Hello! I have attached the Stan model code, the R script to run the analysis, and the simulated data file. My code also uses Torsten, as the problem is pharmacokinetics-related.

In the attached code, WT is treated as a constant covariate on CL (to the best of my ability, as I had to adjust a publicly available code set to include WT as a covariate). My struggle is to implement the covariate WT on CL as a time-varying, step-wise input (without interpolation between time points, similar to the principle of “last observation carried forward”). In other words, the CL parameter should change over time as WT changes at the specific time points included in the data set. I would appreciate ideas on how to modify the code to achieve WT input as intended: a time-varying, step-wise covariate on CL.

Appreciate your time,

Andras
rstan_full.R (4.1 KB)
nonmem_IV.csv (8.6 KB)
popmodel_IV_wtconstant_og.stan (14.1 KB)

Hey, Andras, thanks for your interest in using Stan for PMx. It’s always good to see more people using it. I’ll write up something for you in the next day or two. In the meantime, check out this Github repo. The code you pulled from our website (linked above) should be nearly identical to what’s in the Github repo, possibly with some newer updates, but the repo definitely has a whole bunch more models. I haven’t written a model with time-varying covariates yet, but it’s on my to-do list, and this might give me the push to do it.

Casey

Thanks Casey, appreciate it… The resources you guys have are great; they were very helpful for me when beginning to work with simpler models, so all you do is much appreciated. I will look through the repo for additional ideas…

Andras

Hey Andras, I am also new to the Stan and Torsten world but very excited to continue down this path. I don’t have a perfect solution for you, but I think the key is to adjust the theta (or params) object and move it from a 1D array (real[]) to a 2D array (real[ , ]). Have a look at the current Torsten user’s guide (Torsten/docs/torsten_users_guide.pdf at master · metrumresearchgroup/Torsten · GitHub), page 12, Table 4.2:

Table 4.2. PMX model parameter overloadings. One can use 1d array real[] to indicate constants of all events, or 2d array real[ , ] so that the ith row of the array describes the model arguments for time interval (t_{i-1}, t_i), and the number of the rows equals to the size of time.

There it is described that the theta (or params) object can be either a 1d array real[], if your parameters stay constant for one individual, or a 2d array real[ , ], if the parameters differ across time intervals within one individual. To my understanding this is exactly the case when we have time-varying covariates and/or inter-occasion variability: the params/thetas then need to change within one individual. Currently you still use the 1d array real[] version in your code:

for(j in 1:n_subjects){
      real wt_adjustment_cl = (WT[j]/70)^0.75;
      array[4] real params = {CL[j]* wt_adjustment_cl, Q[j], VC[j], VP[j]};
      array[4] real params_tv = {TVCL* wt_adjustment_cl, TVQ, TVVC, TVVP};
      x_ipred[subj_start[j]:subj_end[j],] =
        pmx_solve_rk45(two_cmt_ir1_ode,
                       n_cmt,
                       time[subj_start[j]:subj_end[j]],
                       amt[subj_start[j]:subj_end[j]],
                       rate[subj_start[j]:subj_end[j]],
                       ii[subj_start[j]:subj_end[j]],
                       evid[subj_start[j]:subj_end[j]],
                       cmt[subj_start[j]:subj_end[j]],
                       addl[subj_start[j]:subj_end[j]],
                       ss[subj_start[j]:subj_end[j]],
                       params)';

(...)
}

This will tell Torsten that you assume constant parameters for one individual. If you want to add time-varying covariates, I think you need to add another loop within one individual, looping over each event (= 1 row in your NONMEM-style dataset) and computing the respective parameter values at that point. Here is how it looks in my code:

    // define the time-varying parameter matrix theta [nEvents, nTheta] (n rows = number of time points, per row one set of nTheta params)
    // see docs for details on 1d and 2d theta elements (2d needed for time-varying covariates)
    array[nEvents, nTheta] real theta;
    for (i in 1:nEvents) {
        // pick the occasion-specific kappas (IOV) for this event's cycle
        if (CYCLE[i] == 1) {
            KAPPA_CL = KAPPA_CL_C1;
            KAPPA_Q3 = KAPPA_Q3_C1;
        } else {
            KAPPA_CL = KAPPA_CL_C2;  // assuming _C2 counterparts for later cycles
            KAPPA_Q3 = KAPPA_Q3_C2;
        }

        // store thetas
        theta[i, 1] = TVCL * pow(EGFR[i] / 84, TH_EGFR_CL) * exp(ETA_CL + KAPPA_CL);
        theta[i, 2] = TVV1;
        theta[i, 3] = TVQ2;
        theta[i, 4] = TVV2;
        theta[i, 5] = TVQ3 * exp(ETA_Q3 + KAPPA_Q3);
        theta[i, 6] = TVV3;
    }

    // solve the ode system (x = drug amounts in each compartment over time)
    x = pmx_solve_rk45(
        ode_rhs,                          // ODE right hand side system
        nCmt,                             // number of compartments
        time,                             // time of events
        amt,                              // amount of drug administered
        rate,                             // rate of drug administered
        ii,                               // interdose interval
        evid,                             // event id
        cmt,                              // compartment number
        addl,                             // number of additional doses
        ss,                               // steady state flag
        theta,                            // individual parameters
        1e-5,                             // relative tolerance
        1e-8,                             // absolute tolerance    
        1e5                               // maximum number of steps
    );

In this case, EGFR[i] is the time-varying covariate (in my model I just have 1 individual, see below). Instead of defining

array[4] real params = {CL[j]* wt_adjustment_cl, Q[j], VC[j], VP[j]};

you would need to define something like this for each ith row of the respective individual:

array[nEvents, nTheta] real theta;
for (i in 1:nEvents) {
    theta[i, 1] = CL[j] * (WT[i, j] / 70)^0.75;   // CL
    theta[i, 2] = Q[j];                           // Q
    theta[i, 3] = VC[j];                          // VC
    theta[i, 4] = VP[j];                          // VP
}

It is surely a bit tricky to get the indexing right and to find a smart way to code it, but this should be the intuition behind it. The trick will be to pass all objects in the form xx[i, j], where i needs to match the elements of the time array.
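Untested, but translated to the hierarchical case of your code above, I would imagine something like this sketch (assuming WT now arrives as one value per event row, i.e. a vector of length n_total aligned with time, so WT[row] lines up with the event rows):

for (j in 1:n_subjects) {
  int n_rows_j = subj_end[j] - subj_start[j] + 1;
  array[n_rows_j, 4] real params;
  for (i in 1:n_rows_j) {
    int row = subj_start[j] + i - 1;              // row index in the full dataset
    params[i, 1] = CL[j] * (WT[row] / 70)^0.75;   // CL updated whenever WT changes
    params[i, 2] = Q[j];                          // Q, VC, VP stay constant
    params[i, 3] = VC[j];
    params[i, 4] = VP[j];
  }
  x_ipred[subj_start[j]:subj_end[j], ] =
    pmx_solve_rk45(two_cmt_ir1_ode,
                   n_cmt,
                   time[subj_start[j]:subj_end[j]],
                   amt[subj_start[j]:subj_end[j]],
                   rate[subj_start[j]:subj_end[j]],
                   ii[subj_start[j]:subj_end[j]],
                   evid[subj_start[j]:subj_end[j]],
                   cmt[subj_start[j]:subj_end[j]],
                   addl[subj_start[j]:subj_end[j]],
                   ss[subj_start[j]:subj_end[j]],
                   params)';
}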

Please also have a look at the full model code I posted in another question if you are interested: Simultaneously estimating IIV and IOV estimate for first cycle in pharmacokinetic NLME model using Torsten. It is a bit different, since I do not fit a full NLME model there but rather try to replace the MAP estimation (mode of the posterior) with a full Bayesian approach for a MIPD scenario. This means I only have 1 individual, but the concept for time-varying covariates and IOV remains the same. Let me know if that was helpful or rather confusing!

This is the way to go (real[] vs. real[ , ]). I’m writing up something now, but there’s the issue of last observation carried forward (locf), which I believe Monolix uses, vs. next observation carried backward (nocb), which NONMEM uses. Torsten seems to be using nocb with real[ , ], so I’m testing a hack to see if it’s possible to do whichever one you want.

True, I hadn’t thought about the difference between locf and nocb. Looking forward to your solution! If Torsten is using nocb, you could of course go for a dense grid of time/covariate elements to make the interval where nocb applies so small that it barely matters. But that would surely affect performance and is quite a ‘hacky’ solution. I am wondering if you could instead just shift the covariates one time slot into the future to take advantage of the nocb behaviour. Something like this:

# NOCB
time     1    2    3    4    5
covar    a <- b <- c <- d <- e

# LOCF
time     1    2    3    4    5
covar    a -> b -> c -> d -> e

# SHIFTED NOCB (= LOCF)
time     1    2    3    4    5  
covar      <- a <- b <- c <- d <- e      

One would just need to think about how to do that shift and how to deal with the empty slot at time = 1.

Thanks Casey and Marian. Yes, I saw your code and post, Marian, with the single-patient implementation of the time-varying covariate, but indeed it is the proper indexing that I had trouble grasping… And I had not thought of Casey’s lead on real[] vs. real[ , ], so I’m excited to see how he will handle it. I played around with the array option based on your example to make it 2d, but the robustness of the code developed by Casey’s team is honestly a bit above my head at this point in Stan when it comes to ensuring proper indexing.

With regards to locf vs. nocb: I guess if there is a solid, workable solution for nocb, then pursuing locf would not be necessary (although it would be neat to have that trick implemented, as most users would -in my opinion- expect the “input” to be handled that way rather than as nocb), and it should be easy to accommodate by adjusting the data set to reflect those expectations.

With regards to a “smoother” implementation, and to avoid a highly dense grid, could we not take advantage of Torsten’s linear interpolation function in some way?

I am a bit more familiar with deSolve in R, whose forcing functions gave me flexibility for implementing time-varying inputs for various problems in the past… I guess the idea would be to have the results of the interpolation function enter the diff-eq solver at a small step size, instead of the covariate data coming directly from the data set, but I’m not sure if the same (or similar) can be done, or makes any sense, for our problem in Stan, and I would think the small(er) step size would impact computation just like a dense grid would…

thanks again for spending time on trying to figure this out,

Andras

That’s a huge Stan program to wrestle with if you’re just starting out. The relevant declaration for WT is

data {
  vector<lower = 0>[n_subjects] WT;

I don’t see a variable named CL anywhere. One of the ODE functions has

 real cl = params[1];

and then, before the function is called, there’s

real wt_adjustment_cl = (WT[nn] / 70)^0.75;
...
 real cl = theta_nn[1] * wt_adjustment_cl;

This will use your constant WT values (that wt_adjustment_cl computation can be done once for the whole vector in the transformed data block and re-used). This is just going to have cl (if that’s the same as CL) defined as you have it above.
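For example (a minimal sketch, assuming the exponent stays fixed at 0.75):

transformed data {
  // computed once, since WT and the exponent are both fixed
  vector[n_subjects] wt_adjustment_cl;
  for (j in 1:n_subjects) {
    wt_adjustment_cl[j] = (WT[j] / 70)^0.75;
  }
}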

Was there a problem with that? Did you want to do a more elaborate regression for cl?

@charlesm93 or @billg were involved in developing Torsten, so they should know how this works.

The built-in linspaced_int_array(end - start + 1, start, end) returns the same value as the sequence(start, end) function in the code:

array[] int sequence(int start, int end) {
  array[end - start + 1] int seq;
  for (n in 1:num_elements(seq)) {
    seq[n] = n + start - 1;
  }
  return seq;
}
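For example, both of these produce {3, 4, 5, 6}:

array[4] int a = sequence(3, 6);                // user-defined function above
array[4] int b = linspaced_int_array(4, 3, 6);  // the built-in equivalent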

Hi Bob,

Yes, the code I posted works as intended; it adjusts cl for WT via wt_adjustment_cl, but in this model WT does not change over time.

The challenge we are trying to tackle is to adjust the code so that WT enters as a time-varying covariate in the Stan data set (which can be generated with the R code and the csv file I attached): each subject (n_subjects) would have a different WT value for each time point (n_total), so that as WT changes per time point, wt_adjustment_cl changes, which in turn yields a different cl value at those time points via the relationship below (properly indexed, of course, which is the task we are after):

real cl = theta_nn[1] * wt_adjustment_cl;

Marian’s note above has the core of what we are trying to do. What he implemented in his code for a single subject (using a different model example) is what we are trying to achieve, but for a hierarchical model (with my attached code, adapted from Casey’s team, as a starting point) as used in population pharmacokinetic estimation, where several subjects’ data are fit together.

An added difficulty is the locf issue, but we can probably live without that, as there may be some tricks we can do in the data set itself to account for Torsten’s presumed behavior (nocb), if everything else works as intended…

I probably was not very clear here, so I welcome Marian or Casey to elaborate further…

thanks,

Andras

I agree that the model from @cbdavis33 is already quite a complex example for just getting your feet wet with concepts such as time-varying covariates. It seems to be a very solid Stan program and is already quite optimized for performance by using within-chain parallelization and reduce_sum() (if I am not mistaken). But this performance gain comes at the cost of increased complexity and a steeper learning curve. I personally would rather accept lower performance for now and code it in a more intuitive way from scratch to properly get familiar with everything. Then it is a great example to step up from and include the within-chain parallelization and the functional programming based on @cbdavis33’s model for better performance.

If you would like further references, you could check out Danielle Navarro’s blog posts on Stan. I personally found them very helpful during my first steps.

No, I think you already did a good job explaining! In NLME models, we typically estimate parameters (thetas) as typical values for the population. Explainable deviations from this typical value are coded via covariate relationships, such as a given increase in clearance (cl) based on body weight. If this covariate is constant for one individual (i.e. we have only measured the weight once at the beginning), we can use the 1d array theta[] in Torsten. In that case, Torsten will solve the ODE system with that weight-adjusted cl for the whole timeframe of that individual.

However, if we have multiple measurements of body weight (a time-varying covariate), then we need to stop the ODE solver once a new body weight measurement becomes available, calculate the new resulting cl, and then restart the solver (from the current amounts in the compartments) with the updated cl. This might be needed multiple times during the timeframe of one individual. This procedure is (very conveniently) implemented in Torsten: we just pass a 2d array theta[ , ] to the pmx_solve_rk45() function, and Torsten takes care of stopping and restarting the ODE solver for us.
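As a toy illustration with made-up numbers: for an individual with two events, at times 0 and 24, whose weight increases at t = 24, the 2d theta would have one row per event, where row i holds the parameters used to advance the system from time[i-1] to time[i]:

// hypothetical values; columns as in the CL, Q, VC, VP example above
array[2, 4] real theta = {{5.0, 4.0, 30.0, 40.0},   // used for the advance to time[1] = 0
                          {6.2, 4.0, 30.0, 40.0}};  // updated CL, used on (0, 24]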

In @Andras_Farkas’ case, he would now have to adapt the data structures being passed, and the logic, to translate from real[] to real[ , ].

I agree. The issue of locf vs. nocb vs. a linear interpolation is relevant, but maybe shouldn’t be a priority. Switching from real[] to real[ , ] for the params, and adapting the passed data structure accordingly, is the key to implementing time-varying covariates.

Hi, @Andras_Farkas and @marianklose, I did some work on this and added it to my stanpmx-library Github repo for both a depot-1cmt and iv-2cmt. The directory names should be clear.

True, and this would be great, but then it’s going to have to solve at each of those dense gridpoints, and it would slow the solver down massively.

This is, I think, the way to do it. That’s how Monolix talks about doing it. So if that’s how they do it, how you thought about doing it, and how I thought about doing it, then it’s probably the best way to go. I did a bunch of testing of this idea against mrgsolve to make sure the nocb and locf here match their nocb and locf, but there still might be a case where they don’t match.

lag() and lead() in R (dplyr), with a proper default of first() or last(), should take care of that, as here:

if(isTRUE(locf)){
  # shift each covariate one record later within each subject so that
  # nocb on the shifted data reproduces locf; the first record keeps its own value
  nonmem_data_simulate <- nonmem_data_simulate %>%
    group_by(ID) %>%
    mutate(WT = lag(WT, default = first(WT)),
           EGFR = lag(EGFR, default = first(EGFR)),
           CMPPI = lag(CMPPI, default = first(CMPPI))) %>%
    ungroup()
}

Once it’s in Stan, it’s going to do nocb, but you can prepare the dataset as above with lag() to input the data in such a way that it mimics locf.

This is a really good idea. I didn’t look into this at all, but I’ll take a look at it. I like the approach for certain covariates that would change continuously (like weight), but the locf vs. nocb still matters when thinking about discretely changing covariates (like fed vs. fasted). Thanks for this suggestion. It could be very useful.

Yeah, this code isn’t really for someone new to Stan. The links to Danielle’s blog would definitely be more helpful for just starting out. I think our website linked above has an intro section, and another website I started working on that was supposed to cover both Stan (me) and NONMEM (someone else) might have some helpful information. Both websites have been unmaintained for at least a couple of years, though. I can start working on the stanpmx one if you think it’d be helpful.

If the parameter in the exponent (0.75 here) is fixed, then this is true; but if it’s estimated, then it can’t be done in transformed data. I normally don’t fix it, but the Github repo linked in this thread has examples of both.
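For example (just a sketch, with theta_cl_wt as a made-up name for the estimated exponent), the adjustment then has to move to transformed parameters so it is recomputed at every iteration:

parameters {
  real theta_cl_wt;  // estimated allometric exponent on CL
  // ...
}
transformed parameters {
  // recomputed every iteration, because theta_cl_wt is a parameter
  vector[n_subjects] wt_adjustment_cl;
  for (j in 1:n_subjects) {
    wt_adjustment_cl[j] = (WT[j] / 70)^theta_cl_wt;
  }
}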

Nice! I wrote the sequence() function probably 5-8 years ago based on one of your posts (I might’ve copied it exactly; I don’t remember), back when I think the message boards were still a Google group, and I never updated it to linspaced_int_array(). But I’ll use that from now on. Thanks.

To get to the main topic of the post, I’ve attached two files that fit this type of data. The ..._wtcl.stan file fits specifically to your data and does specifically what you’re looking for - time-varying weight on CL. It takes the time-varying weight and adjusts the clearance each time the weight changes, while keeping all the other covariates (and parameters) constant within each individual.

      int n_total_subj = subj_end[j] - subj_start[j] + 1;

      array[n_total_subj, n_random + 1] real params_to_input;
      params_to_input[, 1] = to_array_1d(CL[subj_start[j]:subj_end[j]]);
      params_to_input[, 2] = rep_array(Q[j], n_total_subj);
      params_to_input[, 3] = rep_array(VC[j], n_total_subj);
      params_to_input[, 4] = rep_array(VP[j], n_total_subj);
      params_to_input[, 5] = zeros_array(n_total_subj); // using Torsten's analytical solution, it needs
                                                        // an absorption rate constant, so give it 0.
        
      x_ipred[subj_start[j]:subj_end[j],] =
        pmx_solve_twocpt(time[subj_start[j]:subj_end[j]],
                                       amt[subj_start[j]:subj_end[j]],
                                       rate[subj_start[j]:subj_end[j]],
                                       ii[subj_start[j]:subj_end[j]],
                                       evid[subj_start[j]:subj_end[j]],
                                       cmt[subj_start[j]:subj_end[j]],
                                       addl[subj_start[j]:subj_end[j]],
                                       ss[subj_start[j]:subj_end[j]],
                                       params_to_input)';
                    
      dv_ipred[subj_start[j]:subj_end[j]] = 
      x_ipred[subj_start[j]:subj_end[j], 2] ./ VC[j];

Side note: There might be a better way to create params_to_input with append_array(), but I didn’t mess with it too much.

The ..._generic.stan file should work with any combination of time-varying and constant covariates. If a covariate is time-varying, then the relevant parameter adjusts whenever that covariate changes. If a covariate is constant, then as long as it is input the same way as a time-varying covariate (i.e. as a vector with one cell for each row of the dataset instead of as a scalar), it will work fine:

     ...

     int n_total_subj = subj_end[j] - subj_start[j] + 1;

      array[n_total_subj, n_random + 1] real params_to_input;
      params_to_input[, 1] = to_array_1d(CL[subj_start[j]:subj_end[j]]);
      params_to_input[, 2] = to_array_1d(Q[subj_start[j]:subj_end[j]]);
      params_to_input[, 3] = to_array_1d(VC[subj_start[j]:subj_end[j]]);
      params_to_input[, 4] = to_array_1d(VP[subj_start[j]:subj_end[j]]);
      params_to_input[, 5] = zeros_array(n_total_subj);

      x_ipred[subj_start[j]:subj_end[j],] =
        pmx_solve_twocpt(time[subj_start[j]:subj_end[j]],
                                       amt[subj_start[j]:subj_end[j]],
                                       rate[subj_start[j]:subj_end[j]],
                                       ii[subj_start[j]:subj_end[j]],
                                       evid[subj_start[j]:subj_end[j]],
                                       cmt[subj_start[j]:subj_end[j]],
                                       addl[subj_start[j]:subj_end[j]],
                                       ss[subj_start[j]:subj_end[j]],
                                       params_to_input)';

     dv_ipred[subj_start[j]:subj_end[j]] = 
        x_ipred[subj_start[j]:subj_end[j], 2] ./ VC[subj_start[j]:subj_end[j]];

     ...

The main differences to notice are that Q, VC, and VP are constant within each subject in the first one, hence VC[j] (constant) vs. VC[subj_start[j]:subj_end[j]] (time-varying). The similarity between the two is that the input parameters (Torsten’s theta, my params_to_input) are now a 2-D array with a row corresponding to each row of the dataset, rather than a 1-D array containing constant parameters.

iv_2cmt_ppa_covariates_time_varying_generic.stan (18.8 KB)
iv_2cmt_ppa_covariates_time_varying_wtcl.stan (16.3 KB)

@Andras_Farkas and @marianklose, hopefully this is helpful. This isn’t exactly starter material, so feel free to follow up if you need any clarification. Also, use the code in the Github repo, not from the website. I’ve made a few notable changes that should make the code a little easier to work with (still not easy, though). One more tip, Andras - you’ve done a nice job adapting code to your use case, but don’t use an ODE solution unless it’s necessary. The analytical solution (Torsten’s pmx_solve_one[two]cpt()) or matrix exponential (Torsten’s pmx_solve_linode()) will be much, much faster and will save you a ton of time and headaches when they’re available for your problem.

P.S. I also wrote this model with the matrix-exponential solution, and it’s in the Github repo if you want to have a look. With 2-cmt IV models with constant parameters, the linear ODE (matrix-exponential) solution is noticeably faster than Torsten’s analytical solution, but with time-varying covariates the difference was negligible in my testing, and I could see it being slower with more rows in the dataset, since a loop has to create a matrix for each row. I touch briefly on the difference in speed of the analytical vs. matrix-exponential solutions in this paper, where I talk a lot about within-chain parallelization using reduce_sum().


Hey @cbdavis33,

That’s a great repo for reference, thanks a lot for sharing! It’s always good to have some example code to get started.

I agree, it should be possible to get it to work. But probably the easiest way would be if the Torsten developer team added a locf/nocb option to their functions. If they have the capacity, they could look into it for a future release. I will suggest it on their GitHub: [FEATURE] Allow for last observation carried forward (locf) or next observation carried backwards (nocb) for time-varying covariates · Issue #71 · metrumresearchgroup/Torsten · GitHub

Looks reasonable! I might give it a shot in the future once I have some time to look into it.

Makes sense. I would just like to note that the same concept used for implementing time-varying covariates also applies to the implementation of inter-occasion variability (IOV). There, too, you have to define different values of a given parameter within one individual, so it results in a similar solution. Just in case someone is looking for how to implement IOV in their Stan models.
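As a rough sketch (with made-up names: OCC an integer occasion index per event row, KAPPA_CL_OCC a vector of occasion-level random effects), the per-row theta then looks just like the time-varying covariate case:

for (i in 1:nEvents) {
  // occasion-specific CL: same 2d-theta mechanism as a time-varying covariate
  theta[i, 1] = TVCL * exp(ETA_CL + KAPPA_CL_OCC[OCC[i]]);
  theta[i, 2] = TVVC * exp(ETA_VC);  // constant within the individual
}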

Very helpful, thanks for sharing!

I’ll give Yi, Charles, and Bill a push when I meet with them in a couple weeks.

Yeah, IOV has been on my list to add to the stanpmx-library repo, but I’ve never gotten around to it. Your work in the link above will be hugely helpful if I ever get around to writing it.

One more note about that Github repo - some of the simpler models like depot_1cmt_linear have fits without within-chain parallelization in the Fit/ directory, indicated by .._no_threading.stan. They remove the reduce_sum() and partial_sum_lpmf() parts, so they’re a bit easier to read. They’re still not necessarily easy for someone just getting started in Stan (BLOQs, non-centered parameterization, …), but they might bridge the gap between just starting out and within-chain parallelization.


This is surely helpful for the community, thanks. I am also still quite new to Stan and Torsten, so having these resources is very nice. As pointed out above, I’d rather accept some non-optimized code in the beginning while I learn the concepts, and then gradually increase the complexity. I have also read your paper about within-chain parallelization; it is very helpful and well-written! However, I realized that implementing it would be one step too early before I have fully understood some more fundamental aspects of Stan/Torsten. But it is on my list, and I am happy to tackle it at a more advanced stage in the future.

Casey and Marian,

thank you both, very exciting to have these available. Sorry, I am still doing some validations and will report back; so far, excellent results… I will be able to close this out early next week…

thanks,

Andras


Many thanks to you both. I have done some more “complex” simulations with time-varying covariates, and the system nicely recapitulated the analysis of interest.

Casey: your repo is very useful for folks like me; it provides great starting points to move forward from as one’s knowledge base expands.

thank you for the time you invested in helping to solve the problem,

Andras
