Replicated data



The BDA3 appendix discusses replicated data for the EXISTING schools versus replicated data for NEW schools. I am currently working with a PBPK model. Some parameters have literature values (such as gastric transit times), but some don’t (such as kidney km or vmax). What I found is that when tight priors are set on both gastric transit times and kidney km/vmax, the posterior predictive distribution based on replicated data for EXISTING patients fits the observed data very well (plasma concentration, AUC, tmax, Cmax, fraction absorbed), while the posterior predictive distribution based on replicated data for NEW patients is quite off (very large uncertainty). However, when I use tight priors on gastric transit times and vague priors on kidney km/vmax, the posterior predictive for EXISTING patients is still perfect, and the posterior predictive for NEW patients envelops the actual data well. The only trouble is that the median kidney km/vmax values are far beyond realistic values.

I am curious whether anybody has had similar experiences (maybe in a different domain), and whether we can justify the unrealistic values for kidney vmax/km. My explanation is that all models are wrong but some are useful: the model with vague priors on the kidney parameters has good fit and good predictive capability, so those kidney parameters, while having a mechanistic interpretation, become fudge factors, as when fitting a dissolution profile to the Weibull equation.

Many thanks for sharing your experiences.



Why do you think the prediction for new patients is off? The only way to measure that is with simulated data. Predictions for new patients (or, in general, new groups in a hierarchical model) involve three kinds of variability: estimation variability, variability among the patients, and sampling variation.
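To make the three sources of variability concrete, here is a minimal sketch in Python. It assumes a simple normal hierarchical model rather than the PBPK model from the question, and every number in it (the stand-in posterior draws for `mu`, `tau`, `sigma`, and `theta_existing`) is hypothetical, not taken from any real fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 4000

# Hypothetical posterior draws (stand-ins for MCMC output, not real results).
# Their spread across draws represents estimation variability.
mu = rng.normal(1.0, 0.1, n_draws)                # population mean
tau = np.abs(rng.normal(0.5, 0.05, n_draws))      # between-patient sd
sigma = np.abs(rng.normal(0.2, 0.02, n_draws))    # measurement sd
theta_existing = rng.normal(1.3, 0.05, n_draws)   # one observed patient's parameter

# EXISTING patient: reuse that patient's own parameter draws,
# then add sampling variation.
y_rep_existing = rng.normal(theta_existing, sigma)

# NEW patient: first draw a new patient-level parameter from the population
# distribution (variability among patients), then add sampling variation.
theta_new = rng.normal(mu, tau)
y_rep_new = rng.normal(theta_new, sigma)

sd_existing = y_rep_existing.std()
sd_new = y_rep_new.std()
print(f"sd of replicated data, existing patient: {sd_existing:.3f}")
print(f"sd of replicated data, new patient:      {sd_new:.3f}")
```

In this toy setup the new-patient predictions are wider simply because they carry the between-patient term `tau` on top of everything else, so a wide interval for new patients is not by itself evidence that the prediction is off.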

P.S. I’m still having trouble tracking Discourse topics, so I occasionally go back and try to clear out ones I missed the first time around (if anyone knows how to set the “new” to include everything that’s come out that I haven’t read, it’d be an enormous help).


Well, I will restate the problem a little bit. There are two parameters: theta1 and theta2. When the priors are tight for both parameters, prediction for existing patients is good, but prediction for a new patient simulated from the parameter estimates has very high uncertainty (wide CI). Please note that theta2 was hard bounded in this case. When the prior is tight for theta1 but loose for theta2 (and there are no bounds), predictions for both existing patients and a new patient simulated from the parameter estimates are good (the CI is narrow). The theta1 prior is taken from the literature (there were some experiments done to determine the point estimate). theta2 has no documented values, I guess because it is very hard to set up an experiment, but clinicians expect it to be positive and not too large, like the km parameter in the Michaelis-Menten equation. Thus I wonder whether this observation has some generality, i.e. whether some parameters that have a physical meaning may take strange values to compensate for processes that are too complicated to model, such as MM elimination from the body.
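One possible mechanism for such strange values, offered only as an illustration and not as a diagnosis of my model: in the Michaelis-Menten rate vmax * C / (km + C), when concentrations are far below km the rate is approximately (vmax / km) * C, so the data mostly inform the ratio and not km itself. The sketch below uses hypothetical numbers and a hypothetical helper `mm_rate` to show two very different (vmax, km) pairs giving nearly the same elimination rate.

```python
import numpy as np

def mm_rate(conc, vmax, km):
    """Michaelis-Menten elimination rate: vmax * C / (km + C)."""
    return vmax * conc / (km + conc)

# Hypothetical concentrations, all well below both km values used here.
conc = np.linspace(0.01, 1.0, 50)

# Two parameter pairs with the same vmax/km ratio (0.1) but km values
# that differ by a factor of 100.
rate_a = mm_rate(conc, vmax=10.0, km=100.0)
rate_b = mm_rate(conc, vmax=1000.0, km=10000.0)

# Largest relative difference between the two rate curves.
max_rel_diff = np.max(np.abs(rate_a - rate_b) / rate_a)
print(f"max relative difference in rate: {max_rel_diff:.4f}")
```

If the observed concentrations sit in this near-linear regime, a vague prior lets km and vmax drift together along the ridge of constant ratio without changing the fit much.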


Kennedy and O’Hagan, and others, have discussed the problem of inference for parameters that have a physical interpretation when the model is misspecified. See and references therein.