I have a bit of a conceptual question that I'm thinking through before diving into the loo package. Briefly: how should I specify the pointwise log-likelihoods in the generated quantities block (to be analysed post hoc with the loo package in R) when my model uses data augmentation for censored observations? More details below.
I'm working with a dataset in which a subset of observations is left-censored. More specifically, I (assume that I) know an observation lies below some threshold concentration (the detection limit of a lab assay), but I don't know where it lies between zero and that threshold. My model does, however, predict a specific concentration. For computational efficiency, I model left-censored observations with data augmentation: for each censored observation, I declare a parameter y_log_true in the parameters block that is bounded above by log(threshold). Then, in the model block, I include the statement y_log_true ~ normal(y_log_hat, sigma);, where y_log_hat is the model-predicted log-concentration for that data point and sigma is the measurement error of the lab assay. This works like a charm, and I find it more computationally efficient than integrating out the observation with a CDF.
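For concreteness, here is a minimal sketch of that data-augmentation setup. All names not mentioned above (x, alpha, beta, L_log) are illustrative placeholders, and the linear predictor stands in for whatever the actual model computes:

```stan
// Minimal sketch of the data-augmentation approach; placeholder names,
// not the actual model's identifiers.
data {
  int<lower=0> N_cens;       // number of left-censored observations
  vector[N_cens] x;          // covariate for the censored observations
  real L_log;                // log of the detection limit (threshold)
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;       // measurement error of the lab assay
  // one latent log-concentration per censored observation,
  // constrained to lie below the detection limit
  vector<upper=L_log>[N_cens] y_log_true;
}
transformed parameters {
  // model-predicted log-concentrations (simple linear predictor here)
  vector[N_cens] y_log_hat = alpha + beta * x;
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ normal(0, 1);
  // the augmented data are sampled like (constrained) parameters
  y_log_true ~ normal(y_log_hat, sigma);
}
```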
Now, for calculating the pointwise log-likelihoods for the censored data points (to be used with loo later on), should I calculate the log-likelihood as either:

1. the log-probability of the observation being censored, given expectation y_log_hat and measurement error sigma (using the CDF, as one would in the model block when integrating out censored observations), or
2. the log-probability density corresponding to the sampling statement in the model block, y_log_true ~ normal(y_log_hat, sigma); (including normalising constants, of course)?
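To make the two candidates concrete, here is how they would look in generated quantities. L_log stands for log(threshold), and log_lik_cdf / log_lik_aug are placeholder names for the first and second option respectively:

```stan
generated quantities {
  vector[N_cens] log_lik_cdf;
  vector[N_cens] log_lik_aug;
  for (i in 1:N_cens) {
    // first option: Pr(observation below the detection limit),
    // via the normal CDF, as when integrating out censoring
    log_lik_cdf[i] = normal_lcdf(L_log | y_log_hat[i], sigma);
    // second option: density of the augmented latent value,
    // matching the sampling statement in the model block
    log_lik_aug[i] = normal_lpdf(y_log_true[i] | y_log_hat[i], sigma);
  }
}
```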
The model particulars are described in another post here, although I don’t think they are particularly relevant to the question at hand.
The only thing I can think of is that, for some censored data points, the log-likelihood will be zero or very close to zero (i.e. a probability of essentially 1.0) for most if not all posterior draws. I'm not sure whether that will be a problem for the loo procedure, but I guess I'll find out.
Also, thanks for inviting me to look into the efficiency matter more thoroughly. I ran my model with and without data augmentation on the same data set (4 chains, 500 warm-up iterations, 500 sampling iterations). As it turns out, the computational efficiency in terms of n_eff per second was very similar between the two approaches. To calculate the efficiency, I averaged the n_eff over all 120 parameters in my model and divided that either by the sum of the chain times or by the net time Stan needed to complete the job:
| | Data augmentation of censored observations | Integrating out censored observations with normal CDF |
|---|---|---|
| Average n_eff | 1588 | 1761 |
| Chain 1 time (s) | 123.00 | 142.00 |
| Chain 2 time (s) | 130.00 | 152.00 |
| Chain 3 time (s) | 136.00 | 154.00 |
| Chain 4 time (s) | 177.00 | 168.00 |
| Net time (parallel chains + processing) (s) | 188.00 | 178.00 |
| Avg. efficiency (n_eff/s), based on sum of chain times | 2.81 | 2.86 |
| Avg. efficiency (n_eff/s), based on net Stan time (parallel chains) | 8.45 | 9.90 |
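For reference, the efficiency figures are just the average n_eff divided by the respective timings; a quick sketch of the computation (values copied from the table above):

```python
# Reproduce the efficiency figures from the table above.
def efficiency(avg_neff, chain_times, net_time):
    """Return n_eff/s based on (sum of chain times, net wall-clock time)."""
    return avg_neff / sum(chain_times), avg_neff / net_time

# values copied from the table
aug = efficiency(1588, [123, 130, 136, 177], 188)   # data augmentation
cdf = efficiency(1761, [142, 152, 154, 168], 178)   # integrating out

print(f"augmentation: {aug[0]:.2f} n_eff/s (summed), {aug[1]:.2f} n_eff/s (net)")
print(f"integrating:  {cdf[0]:.2f} n_eff/s (summed), {cdf[1]:.2f} n_eff/s (net)")
```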
Interesting note: the approach based on integrating out censored observations was more prone to negative infinity log-prob at initialisation. I think this may have been part of my original motivation to use data augmentation (can’t remember, honestly).
Thanks for the report. The efficiency of Stan's dynamic HMC is sometimes surprising (this is not the only such case). Can you also check the number of log density evaluations in each case? That is, is it possible that normal_cdf has so much higher a computational cost that the reduced number of log density evaluations doesn't help?
I re-ran both analyses (data augmentation and integrating out with the CDF), but now with 8 chains each, as I had the impression that there was quite a lot of variation in computation time between chains. It turns out that my previous run of four chains each was a bit of a lucky hit: integrating out seems to be more efficient after all.
| | Data augmentation of censored observations | Integrating out censored observations with normal CDF |
|---|---|---|
| Average n_eff | 3138 | 3414 |
| Chain 1 time (s) | 182.00 | 153.00 |
| Chain 2 time (s) | 187.00 | 164.00 |
| Chain 3 time (s) | 191.00 | 167.00 |
| Chain 4 time (s) | 200.00 | 173.00 |
| Chain 5 time (s) | 209.00 | 175.00 |
| Chain 6 time (s) | 218.00 | 178.00 |
| Chain 7 time (s) | 239.00 | 188.00 |
| Chain 8 time (s) | 267.00 | 205.00 |
| Net time (parallel chains + processing) (s) | 278.00 | 217.00 |
| Avg. efficiency (n_eff/s), based on sum of chain times | 1.85 | 2.43 |
| Avg. efficiency (n_eff/s), based on net Stan time (parallel chains) | 11.29 | 15.73 |

[Histogram: frequency distribution of the number of leapfrog steps per iteration for both approaches]
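In case anyone wants to reproduce this check: assuming the model was fit with rstan and the stanfit object is called `fit` (a hypothetical name), the per-iteration leapfrog counts can be pulled from the sampler parameters:

```r
# `fit` is a hypothetical stanfit object name.
sp <- get_sampler_params(fit, inc_warmup = FALSE)
# total number of leapfrog steps per chain
sapply(sp, function(x) sum(x[, "n_leapfrog__"]))
# frequency distribution of leapfrog steps per iteration, pooled over chains
table(unlist(lapply(sp, function(x) x[, "n_leapfrog__"])))
```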
Ah, so the difference you saw at first might have been a difference in warmup time? Anyway, the conclusion now is that integrating out is better, as we would expect a priori. Integrating out makes the posterior easier to sample, which leads to slightly fewer leapfrog steps per iteration. It is also possible that reduced memory usage improves the running time.