Hi Stanimals,

I have a bit of a conceptual question that I'm thinking about before endeavoring to use the loo package. Briefly, I'm trying to figure out how to specify the pointwise log-likelihoods in the generated quantities block (to be analysed post hoc with the `loo` package in R) when my model includes data augmentation for censored observations. More details below.

I'm working with a dataset where a subset of observations are left-censored. More specifically, I (assume that I) know the observation is below some threshold concentration (the detection limit of a lab assay), but I don't know where between zero and that threshold it lies. My model does, however, predict a specific concentration. For computational efficiency, I use data augmentation to model the left-censored observations: for each censored observation, I define a parameter `y_log_true` in the parameters block that is bounded to be `<= log(threshold)`. Then, in the model block, I include a statement `y_log_true ~ normal(y_log_hat, sigma);`, where `y_log_hat` is the model-predicted concentration for that data point and `sigma` is the measurement error of the lab assay. This works like a charm, and I find it more computationally efficient than integrating out the observation with a CDF.
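For concreteness, a stripped-down sketch of the setup I mean (variable names are illustrative, and `y_log_hat` is passed as data here for brevity, whereas in my real model it is computed from covariates):

```stan
data {
  int<lower=0> N_obs;                         // uncensored observations
  int<lower=0> N_cens;                        // left-censored observations
  vector[N_obs] y_log_obs;                    // observed log-concentrations
  real log_threshold;                         // log of the assay detection limit
  vector[N_obs] y_log_hat_obs;                // model predictions, uncensored points
  vector[N_cens] y_log_hat_cens;              // model predictions, censored points
}
parameters {
  real<lower=0> sigma;                        // measurement error of the lab assay
  vector<upper=log_threshold>[N_cens] y_log_true;  // augmented censored values
}
model {
  y_log_obs ~ normal(y_log_hat_obs, sigma);
  y_log_true ~ normal(y_log_hat_cens, sigma); // data augmentation for censored points
}
```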

Now, for calculating the pointwise log-likelihoods (to be used with `loo` later on), for the censored data points, should I calculate the log-likelihood as either:

- the log-probability of the observation being censored, given the expectation `y_log_hat` and the measurement error `sigma` (using the CDF, as one would do in the model block when integrating out censored observations), or
- the log-probability density corresponding to the sampling statement in the model block, `y_log_true ~ normal(y_log_hat, sigma);` (including normalising constants, of course)?
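In generated quantities terms, the two candidates would look something like this (names match the sketch above and are illustrative; `normal_lcdf` gives the log-probability of falling below the detection limit, `normal_lpdf` the log-density of the augmented value):

```stan
generated quantities {
  vector[N_cens] log_lik_cens;
  for (n in 1:N_cens) {
    // Option 1: log-probability that observation n is below the detection limit
    log_lik_cens[n] = normal_lcdf(log_threshold | y_log_hat_cens[n], sigma);
    // Option 2: log-density of the augmented value under the sampling statement
    // log_lik_cens[n] = normal_lpdf(y_log_true[n] | y_log_hat_cens[n], sigma);
  }
}
```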

Cheers!