Modeling analytical error with inter/intra lot variation

linas · April 6, 2018, 3:14pm

Hi,

I am trying to model analytical error (sigmaA), inter (sigmaW) and intra (sigmaW) lot variation in pharma domain. Since intra lot variatiation is correlated with analytical error quite a bit I always get BFMI low warning for warmup=3000 and warmup=10000. Is it possible to reparametrize the model or it is OK to ignore the warning? There is no correlation with energy__ so the advice in http://mc-stan.org/misc/warnings.html#bfmi-low doesn’t help much.

Trace doesn’t look awful either.

Thanks for any advice,
Linas

weightIG1a.stan (1.5 KB)
[Please include Stan program and accompanying data if possible]

martinmodrak · April 7, 2018, 4:04pm

Hi,
sorry, can’t really help with your specific problem, but the best way to check if things are OK is to test whether you can retrieve correct parameter values from simulated data (more precisely, the true parameter values should lie about 95% of the time in the 95% posterior interval, 50% of the time in the 50% posterior interval, etc.)

Also you can try to follow some rough guidance I wrote for handling divergences (but most of it applies for low-BFMI as well): http://www.martinmodrak.cz/2018/02/19/taming-divergences-in-stan-models/

linas · April 7, 2018, 6:39pm

Hi,

Many thanks. I believe the problem is in vagueness of prior. Essentially we have identifiability problem: we want to separate intra lot variability from analytical variability. The prior for intra lot variability is weakly informative so we need to supply a quite informative prior for analytical variability. And this is confirmed by real data - when degrees of freedom in analytical variability is high I don’t get low BFMI while low df leads to low BFMI.

Thanks for the link - it is very useful. But my homework now is to collect more samples to increase df.

I just hoped that there is a way to reparametrize the model but apparently I have too little data to have informative enough prior.

Linas

martinmodrak · April 7, 2018, 6:58pm

Sorry for not being much help, I don’t really understand the domain. Just one more quick thought: if sigmaW is correlated with sigmaA, maybe you should put that in your model. This might improve inferences (or not).

linas · April 7, 2018, 7:09pm

Hi,

I don’t have an idea how to do this. I would assume that correlation is similar to sigmaA^2+sigmaW^2=unknown const

Linas

linas · April 7, 2018, 7:45pm

I also observed that I get two modes when the prior for analytical variability is not informative enough (low degrees of freedom). Perhaps this is another proof that there are identifiability problems at low df.

martinmodrak · April 7, 2018, 7:56pm

Good point. Two modes are bad news and probably mean that you need the tighter priors or respecify the model to avoid the modes (do the two modes result in identical inferences?)

As for modelling the correlation - given the non-identifiability, this is unlikely to help much, but I wrote it before seeing your previous post, so I am posting it anyways :-)

You would need to figure out how do the sigmas bind together - which requires understanding the data generating process (which I do not). Two guesses below, might not actually work…

A) if you can assume (as you suggested) that sigmaA^2+sigmaW^2=unknown const you can model this directly with something like:

parameters {
  simplex[2] sigmas_raw;
  real<lower=0> sigmas_sum;
}

transformed data {
  real sigmaA = sigmas_raw[1] * sigmas_sum;
  real sigmaW = sigmas_raw[2] * sigmas_sum;
}

+ you would need a prior on sigmas_sum

B) Model the correlation explicitly (more of a pseudocode - I’ve never done this and don’t understand it deeply, it’s just something that could work):

real sigmaA_mean ~ [original prior for sigmaA];
real sigmaW_mean ~ [original prior for sigmaW];
cov_matrix sigma_cov ~ inv_wishart( [expected correlation] );
vector[2] sigmas ~ multi_normal({sigmaA, sigmaW}, sigma_cov);

linas · April 7, 2018, 10:40pm

It is only my guess about nonidentifiability - I am not an expert.

I think I will collect more samples to make prior for sigmaA more informative and will compare with two approaches you have suggested. It is very possible that all of those will provide nearly identical results but from physical perspective more samples should be more bullet proof.

Bob_Carpenter · April 13, 2018, 8:57pm

Additive non-identifiability shows up as thin lines in the pairs plots and multiplicative non-identifiability as tight banana shapes.

linas · April 13, 2018, 10:13pm

It looks we should have multiplicative non-identifiability between sigmaA and sigmaW. It is not evident from the model. Anyway, informative prior helps to resolve it.

Topic		Replies	Views
Identification in latent variable model Modeling	5	1494	April 17, 2018
Low E-BFMI: Reparameterization causing problems (low ESS, high R-hat) Modeling techniques , fitting-issues , specification	0	439	October 2, 2022
Problematic Sampling with Model Modeling fitting-issues , specification	7	609	October 4, 2019
Fixing "BFMI was low" problems Modeling fitting-issues , performance	11	3464	December 21, 2022
Lower BFMI but seems to be a better fit? Modeling fitting-issues , specification	4	428	October 30, 2023

Modeling analytical error with inter/intra lot variation

Related Topics