I’m working with pipeline crack depths measured by an inspection tool. The tool error is normally distributed with a nonzero bias, and it cannot report a value <= 0 since that doesn’t make sense. As such, the tool measurements follow a truncated normal distribution.
The minimal example below (written in R) attempts to recover the bias and SD of the error distribution. It works reasonably well for 500 cracks, but for 15 cracks it complains about divergent transitions after warmup. I've reparameterized other models following Neal's funnel example before, but am not sure how to go about it here.
Interestingly, the same model without the T[-trueVal[i],] truncation does not complain about divergent transitions, but it overestimates the bias and underestimates the SD, as expected.
Attached is the accompanying pairs plot, which shows there is a problem, but again I’m not sure how to solve it.
Any help is appreciated.
Thanks!
rm(list = ls())
library(rstan)
library(truncnorm)
measBias <- 0.2  # true bias of the tool error, to be recovered
measSD <- 1.2    # true SD of the tool error, to be recovered
N <- 15          # number of cracks; the model behaves well with 500
modelStr <- "
data {
  int<lower=0> N;
  real measVal[N];
  real trueVal[N];
}
parameters {
  real measBias;
  real<lower=0> measSD;
}
model {
  real error[N];
  for (i in 1:N) {
    error[i] = measVal[i] - trueVal[i];
    error[i] ~ normal(measBias, measSD) T[-trueVal[i],];
  }
}
"
stanModel <- stan_model(model_code = modelStr)
trueVal <- qlnorm(p = (1:N)/(N+1), meanlog = -0.112, sdlog = 0.472) # equivalent to mean = 1, SD = 0.5
measVal <- rtruncnorm(n = N, a = 0, mean = trueVal + measBias, sd = measSD)  # one measurement per crack, truncated below at 0
stanData <- list(N = N, trueVal = trueVal, measVal = measVal)
result <- sampling(stanModel, data = stanData, iter = 500,
chains = 1, cores = 1)
I think the problem might actually be (at least partly) that you don't have priors on measBias and measSD, which is likely to manifest only when you have very little data. Try figuring out a weakly informative prior (as in https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations).
Also note that you can move the computation of error to the transformed data block to save computing time.
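Something like the sketch below; the normal(0, 2) and normal(0, 5) prior scales are placeholders I made up, not recommendations for your application:

data {
  int<lower=0> N;
  real measVal[N];
  real trueVal[N];
}
transformed data {
  real error[N];  // depends only on data, so compute it once here
  for (i in 1:N)
    error[i] = measVal[i] - trueVal[i];
}
parameters {
  real measBias;
  real<lower=0> measSD;
}
model {
  measBias ~ normal(0, 2);  // weakly informative; placeholder scale
  measSD ~ normal(0, 5);    // half-normal given the <lower=0> bound; placeholder scale
  for (i in 1:N)
    error[i] ~ normal(measBias, measSD) T[-trueVal[i],];
}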
Unfortunately this wouldn’t work for my situation because (1) it doesn’t account for the truncation and (2) each measurement has its own lower bound truncation value. The only way I know how to model this is with the for loop. I’m definitely open to other ways of doing this.
Not sure what you mean - care to elaborate? AFAIK your code sample uses the untruncated normal distribution - unless I am missing something about the way Stan works…
Ah, I see where the misunderstanding is: your approach is correct when the truncated distribution relates to parameters; then indeed you need only the parameter bounds, and the truncation is implicit. But in the case here it is data that follows a truncated distribution, and that requires explicit truncation.
That’s true in 2.17 and isn’t going to change in 2.18. I know how to do it, but just haven’t had time to run it through the language.
The explicit truncation is required when the distribution being truncated involves parameters. Otherwise, the normalization for the truncation is constant and can be dropped. So usually, if it's data distributed according to some parameters, you need the truncation, as @martinmodrak mentioned. Similarly, if it's a prior with fixed (data) parameters, you don't need the truncation. If everything's variable, you need the truncation, too.
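A toy program illustrating both cases (the positive-only data y and known location mu here are hypothetical, just for illustration):

data {
  int<lower=0> N;
  vector<lower=0>[N] y;  // positive-only observations (hypothetical)
  real mu;               // known location passed in as data (hypothetical)
}
parameters {
  real<lower=0> sigma;
}
model {
  // Prior with fixed (data) parameters: the truncation normalization is a
  // constant, so the <lower=0> bound on sigma suffices and no T[,] is needed.
  sigma ~ normal(0, 1);

  // Data whose truncated distribution involves the parameter sigma: the
  // normalization depends on sigma, so the explicit truncation is required.
  for (n in 1:N)
    y[n] ~ normal(mu, sigma) T[0,];
}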
You mean the tool error looks like a half-normal or a normal with mean other than 0 that’s truncated?
Have you tried increasing the target acceptance rate (adapt_delta)? If the divergences are minor numerical errors, that can get rid of them.
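For example (0.99 is an arbitrary value closer to 1 than the 0.8 default):

result <- sampling(stanModel, data = stanData, iter = 500, chains = 1, cores = 1,
                   control = list(adapt_delta = 0.99))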
It can be more efficient to vectorize the density and separately vectorize the truncation points. The material in the manual on truncated and censored distributions goes over the math, but doesn't show how to do this ad hoc vectorization.
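Roughly like this for the model above; a sketch that declares measVal and trueVal as vectors so the subtraction vectorizes, and relies on normal_lccdf returning the sum of the log CCDFs when given vector arguments:

data {
  int<lower=0> N;
  vector[N] measVal;
  vector[N] trueVal;
}
parameters {
  real measBias;
  real<lower=0> measSD;
}
model {
  vector[N] error = measVal - trueVal;
  error ~ normal(measBias, measSD);  // vectorized density
  // vectorized truncation adjustment: subtract sum_i log Pr[error_i > -trueVal_i]
  target += -normal_lccdf(-trueVal | measBias, measSD);
}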
The definitions of the CDF and CCDF are:
CDF: Pr[X <= x]
CCDF: Pr[X > x]
I would have assumed that for a lower truncation point L we would need to subtract the log CDF, Pr[X <= L]. Can you confirm that the CCDF is the correct one to use for lower truncation?
[EDIT] I verified on a dummy model that subtracting the log CCDF is indeed right for lower-bound truncation (although I am still puzzled as to why).
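For the record, the reasoning: truncating below at L keeps only the probability mass above L, so the density has to be renormalized by that remaining mass,

f_trunc(x) = f(x) / Pr[X > L] = f(x) / CCDF(L), for x > L,

which on the log scale is log f(x) - log CCDF(L). That is why the lccdf, not the lcdf, is subtracted for a lower truncation.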