# Specifying likelihood for bounded measurement errors model

I am developing a measurement errors model to calibrate a device that measures the depth of cracks and other defects in a thin surface. Two sets of measurements are performed on the same surface, each by a different device. One device (x) is cheap and fast, but has some inherent bias in addition to random noise. Measurements from the second device (y) are assumed to have random error but no bias. At the moment I have a relatively simple linear regression measurement errors model implemented in Stan as shown below.

With a Student likelihood, the model works fairly well for defects whose depths are not near 0 or the thickness of the surface, but unsurprisingly gives unrealistic predictive distributions at the extremes, as shown in the plot below. This example has 94 data points and was fit with

fit = stan(file=modelpath, data=stan_dict, iter=20000, chains=4, control=list(adapt_delta=0.99))

However, even with the high warmup and acceptance rate, in some cases I still get a dozen or so divergence errors. My questions:

1. What would be a suitable way of accurately modeling defects with depths near 0 or the surface thickness? I am considering a JohnsonSB distribution with support over the thickness, but I don’t know how to code this support in Stan, i.e., the interval [xi, xi+lambda] where xi = b0 + b1*x ≥ 0 and xi+lambda ≤ thickness. The JohnsonSB code I’ve implemented so far will run, but doesn’t produce anything useful.

2. There are many x-measurements with the same value due to rounding, in this case to the nearest 0.035; these are not repeated measurements of the same defect. I know that technically I should probably use a rounding model as described in the Stan manual, but before going through it effort I’m wondering if this is likely to improve inference over my current implementation?

3. Have I coded the generated quantities section correctly to predict the true defect depth given an observed x?

``````functions {
// negative log PDF of JohnsonSB distribution
real johnsonsb_lpdf(real x, real xi, real lam, real gamma, real delta) {
// xi: location parameter
// lambda: scale parameter >0
// gamma: shape parameter
// delta: shape parameter >0

real z;
z = ((x - xi) / lam);
return log(delta) - log(lam * sqrt(2*pi())*(z-z^2)) - 0.5*(gamma + delta * log(z/(1-z)))^2;
}

real johnsonsb_rng(real xi, real lam, real gamma, real delta) {
real u;
real y;
u = uniform_rng(0, 1);
y = exp((inv_Phi(u)-gamma)/delta);
return lam * y / (1+y) + xi;
}
}

data {
int<lower=0> N;	//	number of observations
vector[N] x_obs; // x-measurement, observed with error
vector[N] y;	// y-measurement
real<lower=0> b0_sd; // SD of intercept
real<lower=0> ub;	// upper bound of measurements

int method;	// 0 for Normal likelihodd, 1 for Student, 2 for JohnsonSU

// Data for prediction
int<lower=0> N_hat;
vector[N_hat] x_hat; // new observations
}

parameters {
// Latent model
real<lower=0> mu_x;
real<lower=0> sigma_x;
vector<lower=0>[N] x;	// unknown true value
real<lower=0> tau;	// measurement error

// Regression model
real b0;
real<lower=0> b1;
real<lower=0> sigma_y;

// Student likelihood
real<lower=0> nu;

// JohnsonSB likelihood (using sigma_y for lambda)
real js_gamma;	// shape parameter
real<lower=0> js_delta;	// shape parameter
}

model {
// Latent model
tau ~ gamma(1, 30);		// measurement error SD (expected value equal to approx vendor claim of 0.033)
x ~ normal(mu_x, sigma_x);	// prior for true x dist
x_obs ~ normal(x, tau);		// measurement model

// Regression model
b0 ~ normal(0, b0_sd);
b1 ~ normal(1, 3);

// Student dof prior
nu ~ gamma(2, 0.1);

// y-scale prior
sigma_y ~ gamma(2, 40);

// JohnsonSU priors
js_gamma ~ normal(1, 2);
js_delta ~ gamma(2, 1);

for (n in 1:N) {
if (method==0)
y[n] ~ normal(b0 + b1*x[n], sigma_y);
else if (method==1)
y[n] ~ student_t(nu, b0 + b1*x[n], sigma_y);
else
y[n] ~ johnsonsb(b0 + b1*x[n], sigma_y, js_gamma, js_delta);
}
}

generated quantities {
vector[N] y_rep;	// PPC
vector[N_hat] y_hat;	// new predictions
for (n in 1:N) {
if (method==0)
y_rep[n] = normal_rng(b0 + b1*x[n], sigma_y);
else if (method==1)
y_rep[n] = student_t_rng(nu, b0 + b1*x[n], sigma_y);
else
y_rep[n] = johnsonsb_rng(b0 + b1*x[n], sigma_y, js_gamma, js_delta);
}

for (n in 1:N_hat) {
if (method==0)
y_hat[n] = normal_rng(b0 + b1*x_hat[n], sigma_y);
else if (method==1)
y_hat[n] = student_t_rng(nu, b0 + b1*x_hat[n], sigma_y);
else
y_hat[n] = johnsonsb_rng(b0 + b1*x_hat[n], sigma_y, js_gamma, js_delta);
}
}
``````

Many thanks!

This one’s complicated. Long questions like this are much harder to answer than sequences of smaller ones.

You just need to reduce this to lower and upper-bound constraints on the parameters in a coherent sequence. And if you have a bunch of variables that need to satisfy the constraint, take the extreme ones.

Never heard of this one.

Probably not unless the amount of rounding is large compared to the value.

That looks like it matches your model. I didn’t spend a huge amount of time verifying, but you’re using the `_rng` forms, which is the critical part that people often miss.

There are a lot of pieces of your code that can be sped up. For instance, `log(z / (1 - z)) == logit(z)`.

Unless you’re using them in a mixture, yo8u can probably drop constants like `sqrt(2 * pi())`.

Rather than generating `u = uniform_rng(0, 1)` and then applying `inv_Phi(u)`, which is very inefficient, I’d suggest just using `normal_rng(0, 1)`, since it will have the smae distribution as `inv_Phi(uniform_rng(0, 1))`.

Computing `b0 + b1 * x` all in one go and then indexing it will be much more efficient. As would collapsing the `y` based on method so you can vectorize.