Mixtures with censoring?

mike-lawrence · August 3, 2017, 4:27pm

In experiments on human cognition using reaction time (RT) as an outcome measure, we often have a procedure-enforced censoring of observations such that the experiment interrupts data collection if a given response is taking longer than, say, 1s, yielding an NA as data for that instance. (There’s also an implied lower bound at zero, but where RTs are usually skewed anyway, models often use log-normal error or a more elaborate latent process like Wiener diffusion that yields a similarly skewed distribution of error.)

So we might use a right-censored model along the lines of the following (adapted from the manual v2.16, pg 188):

data {
    int<lower=0> N_obs;
    int<lower=0> N_cens;
    real y_obs[N_obs]; 
    real<lower=max(y_obs)> U;
}
parameters {
    real mu;
    real<lower=0> sigma;
}
model {
    y_obs ~ lognormal(mu, sigma);
    target += N_cens * lognormal_lccdf(U | mu, sigma);
}

In these experiments we also observe varying degrees of “contamination” responses, reflecting usually either fast guessing or slow responses after a brief inattentive period. There are more complicated models of these contamination sources that become identified with lots of data, but a useful first approximation to merely “deal with” them in cases of more moderate amounts of data is to model responses as a mixture with some proportion, p, of log-normally distributed “non-contamination” responses and 1-p proportion uniformly distributed “contamination” responses.

In this case, would the following be the appropriate expression of the mixture with censoring?

data {
    int<lower=0> N_obs;
    int<lower=0> N_cens;
    real y_obs[N_obs]; 
    real<lower=max(y_obs)> U; 
}
transformed_data{
    real uniflpdf;
    uniflpdf = uniform_lpdf(0|0,U); // lpdf of the contaminations (data-invariant)
}
parameters {
    real mu;
    real<lower=0> sigma;
    real<lower=0,upper=1> mix_prop ;
}
model {
    target += log_sum_exp(
          log(mix_prop) + lognormal_lpdf( y_obs | mu , sigma)
        , log1m(mix_prop) + uniflpdf
    ) ;
    target += N_cens * lognormal_lccdf(U | mu, sigma);
}

Bob_Carpenter · August 6, 2017, 9:29pm

You can just use

target += log_mix(mix_prop, lognormal_lpdf(y_obs | mu , sigma), uniflpdf);

and the declare-define lets you do

real uniflpdf = uniform_lpdf(0 | 0, U);

Otherwise, the model looks OK.

mike-lawrence · August 8, 2017, 10:08am

Thanks!

frank_hezemans · March 11, 2022, 7:53pm

Apologies for reviving a very old topic @mike-lawrence , but I’m currently working on closely related modelling approaches, and I’d like to discuss one aspect of your code to make sure I’m not missing something.

You modelled the censored observations as follows:

target += N_cens * lognormal_lccdf(U | mu, sigma);

However, should this not be adjusted for the probability mix_prop that the log-normal process is actually triggered? Perhaps something like the following?

target += N_cens * (log(mix_prop) + lognormal_lccdf(U | mu, sigma));

Topic		Replies	Views
A mixture Gaussian model with censoring data Modeling	5	480	August 5, 2019
Unable to recover mixture between Wiener-diffusion and uniform distribution Modeling fitting-issues , cognitive-science	14	3048	February 17, 2021
Aggregated model for mixture of Exponential distributions Modeling mixture , poisson	5	77	March 18, 2025
Divergence and Bad Start Values in Censored Mixture Model Modeling fitting-issues	2	713	April 14, 2019
Prior predictive checks in survival modeling with a mixture prior Modeling rstan , mixture , prior-predictive , survival	1	30	December 31, 2024

Mixtures with censoring?

Related topics