I have truncated data (lower bound = 2) and therefore want to use a truncated normal likelihood in my model. I only found out how to do this with the sampling notation (y ~ normal(mu, sigma)), in "Truncated or Censored Data". However, I am using the log probability increment statement (target += normal_lpdf(y | mu, sigma)) in my model. Does anyone know how to use a truncated normal in this case?
You can just define y as a lower-bounded data type and continue to use normal_lpdf(y | mu, sigma). The log posterior would differ only up to an additive constant (based on the truncation) and hence would not affect sampling.
To write the same model with the target notation, you divide the density by the probability mass of the permissible region. For your example, normal_lcdf(2 | mu, sigma) is the log of the mass below 2, which you don’t want, so divide by the complement instead (subtract on the log scale), something like so for a single observation y: target += normal_lpdf(y | mu, sigma) - normal_lccdf(2 | mu, sigma);
Edit: fixed the cutoff to 2 as Bob corrected below. Need to read closer. And as Garren explains, you won’t find a difference if mu and sigma are not parameters, but will need this if either are, as Bob points out below.
Here we define $C=\int_{2}^{\infty}{\mathcal{N}(y|\mu,\sigma^2) dy}$, which is finite. C can then be grouped together with the unknown normalization constant. Hence the truncated normal is equivalent to the normal distribution with support restricted to y>2. The only time we would include the truncation is if the upper or lower bound of the truncation depends on other variables, not when the bounds are constant.
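For reference, the truncated density behind this argument can be written out as a short sketch. Writing $C(\mu,\sigma)$ with explicit arguments and using $\Phi$ for the standard normal CDF are notational choices made here, not taken from the posts above:

$$
p(y \mid \mu, \sigma, y \ge 2) = \frac{\mathcal{N}(y \mid \mu, \sigma^2)}{C(\mu, \sigma)},
\qquad
C(\mu, \sigma) = \int_{2}^{\infty} \mathcal{N}(y \mid \mu, \sigma^2)\, dy = 1 - \Phi\!\left(\frac{2 - \mu}{\sigma}\right)
$$

$$
\log p(y \mid \mu, \sigma, y \ge 2) = \log \mathcal{N}(y \mid \mu, \sigma^2) - \log C(\mu, \sigma)
$$

Written this way, $\log C(\mu,\sigma)$ is the quantity that normal_lccdf(2 | mu, sigma) computes. It can be dropped as a constant only when both $\mu$ and $\sigma$ are fixed. If either one is sampled, the term changes with every draw.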
Thanks @Garren_Hermanus. This is true if mu and sigma are constants, because then normal_lccdf(2 | mu, sigma) is also a constant. If either mu or sigma is a parameter, you also need to include the truncation adjustment:
target += -normal_lccdf(2 | mu, sigma);
This is close to what @ssp3nc3r wrote, but uses 2 as the lower bound as requested by @Loni92.
Just be sure to also declare y with <lower=2> for error checking.
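Putting the pieces together, a minimal sketch of a full program might look like the following. It assumes y is a vector of N observations and that mu and sigma are scalar parameters; the name N, the block layout, and the omitted priors are illustrative assumptions rather than anything taken from the original model.

```stan
data {
  int<lower=1> N;          // number of observations (assumed name)
  vector<lower=2>[N] y;    // lower bound declared for error checking
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // priors on mu and sigma would go here (omitted in this sketch)

  // untruncated normal log density, summed over all N observations
  target += normal_lpdf(y | mu, sigma);

  // truncation adjustment: subtract the log mass above 2 once per observation
  target += -N * normal_lccdf(2 | mu, sigma);
}
```

Because the vectorized normal_lpdf sums over all elements of y, the adjustment is multiplied by N here. The single-line form shown above corresponds to one observation.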
I’m unmarking the solution until @Loni92 clarifies whether mu and sigma are data.
By “data” I mean something whose value is known and which is declared in the data (or transformed data) block of a Stan program. If either mu or sigma is a parameter, then you need the explicit truncation adjustment.