Mixture growth model with non-constant sigma for qPCR data

t-pollington · November 15, 2022, 1:38pm

My growth model estimates random intercepts & random gradients for each individual, longitudinally in time. Its current specification is working. I would now like to make the intercept term belong to a 2-component Normal mixture distribution.

1) Current growth model (working)

The response is y and the growth term is time. Censoring is also included. The ID1 term allows correlation between parameters. The |SUBJ term is the hierarchical bit of the model and accounts for repeated measures by individual SUBJ. The traditional constant sigma term (i.e. sigma ~ 1) is replaced with one depending on a (non-linear) exponential of the response using extra parameters sigmaA, sigmaB & sigmaC.

get_prior(data = dataset,
          family = gaussian(),
          formula = bf(y | cens(cen1) ~ popnintercept + time) +
            lf(popnintercept + time ~ 1 + (1|ID1|SUBJ)) +
            nlf(sigma ~ sigmaA * exp(sigmaB * y) + sigmaC) +
            lf(sigmaA + sigmaB + sigmaC ~ 1) + 
            set_nl(nl = TRUE) 
)

This produces the following terms which I’ve rewritten here to help my understanding. Note they don’t match the exact brms syntax: cor, popnintercept(b), popnintercept(sd), sigmaA(b), sigmaB(b), sigmaC(b), time(b), time(sd).

2) First attempt (successful)

I have managed to split the intercept into mu1 & mu2:

mix = mixture(gaussian, gaussian)
get_prior(data = dataset,
          family = mix,
          formula = bf(y | cens(cen1) ~ 1 + time +
                                    (1 + time|ID1|SUBJ))
)

It now contains cor, sigma1, sigma2, theta, mu1(Intercept), mu1(sd), mu2(Intercept), mu2(sd), time_mu1(b), time_mu1(sd), time_mu2(b), time_mu2(sd).

3) Non-constant `sigma1` & `sigma2` (unsuccessful)

Building on 2) above:

get_prior(data = dataset,
          family = mix,
          formula = bf(y | cens(cen1) ~ 1 + time +
                                  (1 + time|ID1|SUBJ)) +
            nlf(sigma1 ~ sigmaA1 * exp(sigmaB1 *y) + sigmaC1) +
            lf(sigmaA1 + sigmaB1 + sigmaC1 ~ 1) +
            nlf(sigma2 ~ sigmaA2 * exp(sigmaB2 * y) + sigmaC2) +
            lf(sigmaA2 + sigmaB2 + sigmaC2 ~ 1) +
            set_nl(nl = TRUE)
)

But it gave Error: The following variables can neither be found in 'data' nor in 'data2': 'ID1'.

Is it possible that brms can’t cater for this specification yet?

Operating System: 5.10.16.3-microsoft-standard-WSL2
brms Version: 2.18.0

ldeschamps · November 15, 2022, 3:35pm

Hello :)

I have no specific experience with this time of model, but something bugs me :

Is it normal that your sigmas (sd of the mixture components) depends upon your response variable (y)? I do not really understand what kind of result it could produce!

All the best,
Lucas

t-pollington · November 15, 2022, 3:54pm

Hi @ldeschamps, a good question and here is some context. The response y is actually the C_t (or C_q) value obtained during a qPCR assay. It is widely accepted that the precision of the measurement gets worse exponentially as the C_t value increases: this explains why the sigma has been made to be response-dependent. This has been validated in previous models that used a variable sigma to give better fitting models (lower LOOIC).

My reasoning for then having two different sigma is because I believe there are two separate processes—one generating the qPCR readings from truly-infected people at lower C_t, and another generating (different) background noise at higher C_t, therefore I wanted to account for this by using sigma1 & sigma2.

ldeschamps · November 15, 2022, 4:16pm

Oh, thanks for the explainations! This is really interesting.

This remind me the mean-variance functions of glms and other non normal models. I guess, in your case the mean-variance function varies among datasets, and it would not be satisfying to work with a mixture of families with defined mean-variance function?

Sorry, I bring a lot of question but I don’t know a technical solution to your problem :(

All the best,
Lucas

t-pollington · November 15, 2022, 4:40pm

I have tried with a function for sigma(y; sigma1, sigma2, sigma3) informed with exact values (not estimating the sigma1, sigma2, sigma3 parameters) from earlier qPCR protocol papers, but have found that it gives models that don’t fit as well as when I use learn sigma1, sigma2, sigma3 from the data.

I guess this isn’t surprising considering that those applying the qPCR protocol to their study will do it with different lab conditions and probably suboptimally to the reference lab.

Topic		Replies	Views
Specifying my mixture model brms specification	3	498	October 8, 2020
Example growth mixture model won't converge brms fitting-issues	20	1998	October 29, 2020
Specifying conditional random intercept brms specification , brms	4	706	February 17, 2021
Advice on speeding up non-linear multi-level in brms? brms mixed-model , hierarchical-model	2	544	March 21, 2023
Translating brms model into equation brms specification	5	1565	February 11, 2021

Mixture growth model with non-constant sigma for qPCR data

1) Current growth model (working)

2) First attempt (successful)

3) Non-constant sigma1 & sigma2 (unsuccessful)

Related topics

3) Non-constant `sigma1` & `sigma2` (unsuccessful)