Divergent transitions in PK models


Rate = -2 is a NONMEM convention, allowing you to specify in the model, rather than the data, the rate/duration

Rate = -1 means it will expect a hard-coded rate
Rate = -2 means look to a hard coded duration

so you can specify D_<CMT> = 1 which is interpreted as, give a 1 hour infusion for all amounts specied to that particular compartment. This isn’t terribly hard to calculate in this day and age, so (potentially) may not be needed in stan, but it can be nice when you have a consistent clinical guidance, and don’t have to go through the motion of df %>% mutate(RATE = AMT/inf_length)

1 Like

nice. I can provide a more expansive “real” .dataset for you to chew on. Obviously can’t share the real data online, bu maybe we should discuss offline. I’d spoken with Charles about a nonmem vs stan comparison, with a combination of simulated scenarios (know the truth) and real data (see where/if there are differences). We should talk a little offline to figure out if we can get a data sharing agreement in place.

1 Like

Rate = -2 is a NONMEM convention, allowing you to specify in the model, rather than the data, the rate/duration

Rate = -1 means it will expect a hard-coded rate
Rate = -2 means look to a hard coded duration

Thanks for the details on NONMEM. Torsten currently works in the Rate = -1 regime. You can always specify a rate when you hand-code an ODE system in the functions block or a constant rate matrix. Less flexibility when you call a built-in analytical solution.


I assume that amt=30 and rate=-2 in csv means unknown duration. Is this
feature included in torsten? NONMEM suggests to model it as a random effect
around nominal duration. Are there any better approaches?


Yes it could be estimated, both with or without a random effect, but does not have to. This can be valuable when you want to fit, for example, a zero-order process, so you can just provide the amount you are aware will enter the system and the profile and let the duration be sorted out to optimize the fit. For complex absorption processes such as tablets with release profiles or intramuscular injections this is a great way to estimate such processes. In such cases, as NONMEM suggests, estimating both a nominal duration with a random effect can be valuable, however if there isn’t that much information about the absorption process sometimes this is not feasible and you can only use a fixed effect (though STAN it would probably be reasonable to apply a reasonable prior based on physiochemical/in vitro characteristics, given it is more flexible than nonmem)

or like in this case with Vanco, everyone just gives a 1 hour infusion, do rather than calculating the rates for each individual, it can just be hard coded.

The rule of thumb will generally come back to physiology and mechanistic plausibility. If you have a intramuscular injection that is injected in the glute for lipophilic (fat loving) compound, it stands to reason that you could have gender differences, and will almost certainly have differences between patients based on their body composition near the injection site. A morbidly obese woman (women have slightly higher fat contents) vs an athletic man could have drastically difference release profiles. On the flip side, some new microsphere formulations are really good for controlled release that don’t vary much between patients (maybe only 10%), in that case, having no random effect would be a reasonably simplification.


Here is a profile for 422. I didn’t expect it to be so good. I have assumed
IV duration of 1h, 1cmp. Brown lines denote the start of IV, red circles -
actual observations, green - median of true conc, blue - 95% CI of true
conc. I suspect it will be even better after modeling of correlation
between parameters - it looks like there is a correlation between cl and
egfr. I used additive error model - perhaps this could be improved as well.
Not sure if duration should have a random effect - still didn’t grasp how
to incorporate it.

Should we talk offline to fit more patients?


good job! A couple things:

  1. the error model is best to be multiplicative/proportional for concentration data. Generally such constant CV residual error models are appropriate for concentration-time data. For individual fits, you can generally ‘get away’ with additive, since the range for this individual, however the misspecification of the residual error model will become very visible over larger concentration ranges. This can be inspected by looking at residuals vs predicted concentration. If you see a fanning pattern (larger magnitude of residuals for higher concs) you’ll definitely know you need proportional error model.

When you can have concentrations near the lower limit of quantitation (LLOQ) then it can also be helpful to also have some additive component to allow the model to stabilize. Else the constant CV error model will have problems as even slight misses between predicted/observed can cause issues. I am not sure how to code such a model in STAN

in nonmem it is basically

Y = IPRED*(1 + PROP_ERROR) + ADD_ERROR where the error is specified as a normal distribution with mean 0 and variance PROP_ERROR/ADD_ERROR

so it will be something like 0.2-0.3 for prop error and add_error around the value of the LLOQ (so like 1 mg/L in this case).

perhaps @Bob_Carpenter or others can provide insight for how to appropriately handle such a residual error model. (and again apologize for my lack of formal statistical specifications/equations).

  1. With regards to correlation, yes there is massive correlation between egfr and CL - thats baked into the equation for CL - thats also we use hierarchical modeling, such prognostic factors can effectively guide what the likely range of CL for a patient will be, before collecting any data.

shoot me an email at devin (dot) pastoor (at) gmail.com and we can look at figuring out some more real datasets and get the results on github where it will be easier to share and discuss the models/outputs.


stan handles well add+proportional error (as in nonmem):

Maybe it would be of interest to forum how to handle uncertain durations.



Unless I’m missing something, this should really be:

y ~ normal(mu, sqrt(sigma_add^2 + mu^2 * sigma_prop^2))

Your expression would hold if written in terms of the variance and not the standard deviation.


You are absolutely right. I just was sloppy. I still have to reconcile this
with nonmem’s Y = IPRED*(1 + PROP_ERROR) + ADD_ERROR. Some folks are using
sqrt form.


@dpastoor: Did you ever succeed in dockerizing Torsten as you mentioned above? I’ve been having a hard time trying to get it installed, and I’ll eventually be running it in a cloud environment anyhow. So if you succeeded, or if you at least got some way towards succeeding, I’d love to start where you left off!


I guess this was not addressed to me but my five cents are the the installation of Torsten should not not be too complicated. However, I found that Torsten efficiency is bit lacking since it is bit too general instead of using analytical expressions for very simple compartmental models. Let me know if you need any help. In fact I was able to port the NONMEM code above to Torsten.


Torsten’s analytical models currently include one/two cpt and linear ODE based on matrix exponential, what analytical PK models do you have in mind that we should add in the future?

BTW, I’m testing Torsten’s parallel population solvers, which will be in next release, as an alternative to stan’s map_rect. Some example models and their performance result can be found at


Yes, I have used those 1/2 cpt models but found that it is faster to use analytical equations for the specific compartment I am interested in. By looking at the code (it was quite a while ago) I saw that Torsten compute concentration for all compartments instead of just the one I was interested in. I ended up writing my own code. I spoke with Bill about this. I found Torsten to be great but somewhat inefficient in simplest cases.


Very good point. Bill and I actually talked about this, as the efficiency issue becomes more noticeable in a large pop model. NONMEN has a way to specify the cpt and I’m debating whether we should follow that.


Thanks so much for the support! I’m actually only really installing Torsten because I want access to the linear interpolation routine. If there’s a way of using that without loading up the whole of Torsten along for the ride, that would actually be more convenient for me…? I’m actually not looking to use the models - I have a slightly specialised application where I need to convolve a compartmental model with another measured curve.


Just a little update in case anyone else finds themselves here in future. I managed to get Torsten working in a docker container, which I can successfully run in a cloud instance. You can find the container here: https://hub.docker.com/r/mathesong/tidyverse-torsten . It’s not particularly well optimised and could probably do with some trimming down, since there’s a lot of extra stuff in there in order to get some of the dependencies working. But it seems to work in any case!

Hopefully this will be helpful to someone in future!