Out-of-sample cross validation for response variable with measurement error

Challen_Hyman · August 28, 2024, 2:58pm

I would ideally like to use OOS-CV to test the predictive power of a model by comparing its posterior predictive distributions to withheld values. However, the response values have measurement error associated with them, so directly comparing each withheld point estimate to the posterior predictive intervals doesn’t seem to completely address the problem. I checked a couple of FAQ pages but still haven’t found anything satisfying. I’m sure there is a method for addressing this, and if someone could point me in that direction, I would be grateful.

avehtari · August 29, 2024, 7:31am

Can you post the model description and your Stan model code, so that I can give a more concrete answer?

js592 · August 29, 2024, 5:37pm

Do the held-out values have a different measurement error process (i.e., observational model) than the training data?

Challen_Hyman · September 3, 2024, 1:50pm

Certainly, here is my code. The model is a gamma model with mean mu and shape sigma, where I model the shape parameter directly as a function of seasonal covariates. It’s a fairly vanilla GLM without many bells or whistles, but each observation comes with an estimate of its observation error, which I included as measurement error in the model. I’ve worked with withheld data in the past, but usually as withheld point estimates. In addition to information criteria, I like to assess predictive power by checking whether the posterior predictive intervals contain the withheld values at approximately the nominal proportion (I know this is a frequentist perspective). I just don’t know how to perform that procedure when each withheld observation is not known with certainty.

data {
  int<lower=0> T;                                                               // Total samples : num Months x num Years
  int<lower=0> P;                                                               // Total number of Effort mean (mu) predictors
  int<lower=0> Q;                                                               // Total number of Effort uncertainty (sigma) predictors
  vector[T] Effort;                                                             // Observed effort (angler-trips) 
  vector[T] sigma_Effort;                                                       // Observation error of effort; used as a standard deviation in normal() below
  matrix[T, P] X;                                                               // Effort mean design matrix
  matrix[T, Q] Z;                                                               // Effort uncertainty design matrix
}

parameters {
  vector[P] beta;                                                               // Coefficients for mu
  vector[Q] rho;                                                                // Coefficients for sigma (log scale)
  vector<lower=0>[T] Effort_hat;                                                // Vector of estimated 'true' effort
}

transformed parameters {
  vector[T] alpha;                                                              // Effort rate (inverse scale): shape / mean
  vector[T] mu;                                                                 // Effort mean 
  vector[T] sigma;                                                              // Effort shape
  for (t in 1:T){
    mu[t] = exp(X[t,]*beta);
    sigma[t] = exp(Z[t,]*rho);
    alpha[t] = sigma[t]/mu[t];
  }
}

model {
  for (t in 1:T){
    Effort_hat[t] ~ gamma(sigma[t], alpha[t]);
    Effort[t] ~ normal(Effort_hat[t], sigma_Effort[t]);                         // Assume observed value is an imprecise but unbiased estimate of true value
  }
  // Priors
  beta ~ normal(0,2);
  rho ~ normal(0,2);
}

generated quantities {
  vector[T] pred_Effort;
  for (t in 1:T){
    pred_Effort[t] = gamma_rng(sigma[t], alpha[t]);
  }
}

Challen_Hyman · September 3, 2024, 1:52pm

The withheld observations would conceivably have the same measurement error process. Each observation comes with its own observation variance as estimated using the same stratified random sampling design.
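If the withheld observations share the training data's measurement process, one option is to compare each withheld observed value against the predictive distribution of a noisy observation rather than of the latent true effort. The following is a minimal sketch of that idea, not a tested solution. The names `T_new`, `X_new`, `Z_new`, `sigma_Effort_new`, `pred_true_new`, and `pred_obs_new` are assumptions introduced for illustration, and `sigma_Effort` is assumed to hold standard deviations.

```stan
data {
  int<lower=0> T;                            // training samples
  int<lower=0> T_new;                        // withheld samples (assumed name)
  int<lower=0> P;
  int<lower=0> Q;
  vector[T] Effort;
  vector<lower=0>[T] sigma_Effort;           // assumed to be standard deviations
  matrix[T, P] X;
  matrix[T, Q] Z;
  matrix[T_new, P] X_new;                    // withheld design matrices (assumed names)
  matrix[T_new, Q] Z_new;
  vector<lower=0>[T_new] sigma_Effort_new;   // withheld observation SDs (assumed name)
}
parameters {
  vector[P] beta;
  vector[Q] rho;
  vector<lower=0>[T] Effort_hat;
}
transformed parameters {
  vector[T] mu = exp(X * beta);              // gamma mean
  vector[T] sigma = exp(Z * rho);            // gamma shape
  vector[T] alpha = sigma ./ mu;             // gamma rate
}
model {
  Effort_hat ~ gamma(sigma, alpha);
  Effort ~ normal(Effort_hat, sigma_Effort);
  beta ~ normal(0, 2);
  rho ~ normal(0, 2);
}
generated quantities {
  vector[T_new] pred_true_new;               // predicted latent effort
  vector[T_new] pred_obs_new;                // predicted noisy observation
  {
    vector[T_new] shape_new = exp(Z_new * rho);
    vector[T_new] rate_new = shape_new ./ exp(X_new * beta);
    for (t in 1:T_new) {
      pred_true_new[t] = gamma_rng(shape_new[t], rate_new[t]);
      // Add the measurement noise of the withheld observation;
      // normal_rng requires a strictly positive scale.
      pred_obs_new[t] = normal_rng(pred_true_new[t], sigma_Effort_new[t]);
    }
  }
}
```

Interval coverage would then be checked by comparing the withheld observed values to quantiles of `pred_obs_new`, which are wider than those of `pred_true_new` because they include the observation error.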

avehtari · September 11, 2024, 11:00am

The log_lik computation in the generated quantities block would be

  vector[T] log_lik;
  for (t in 1:T){
    log_lik[t] = normal_lpdf(Effort[t] | Effort_hat[t], sigma_Effort[t]);
  }

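However, because each observation has its own parameter Effort_hat[t], PSIS-LOO is likely to fail: leaving out observation t changes the posterior of Effort_hat[t] substantially, so the importance weights become unreliable. See the Roaches case study for how to do integrated LOO to get reliable results.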
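As a rough illustration of what integrating out Effort_hat[t] means for this model, the sketch below replaces the conditional log_lik with a Monte Carlo estimate of the marginal density of each observation given only beta and rho. This is a simple stand-in and not necessarily the integration method used in the Roaches case study. The function name `integrated_log_lik_rng` and the number of draws `S` are assumptions introduced here, and `sigma_Effort` is assumed to hold standard deviations.

```stan
functions {
  // Monte Carlo estimate of log p(y | shape, rate, sd_y), averaging the
  // normal observation density over draws of the latent true value.
  real integrated_log_lik_rng(real y, real sd_y, real shape, real rate, int S) {
    vector[S] lp;
    for (s in 1:S) {
      real latent = gamma_rng(shape, rate);
      lp[s] = normal_lpdf(y | latent, sd_y);
    }
    return log_sum_exp(lp) - log(S);
  }
}
data {
  int<lower=0> T;
  int<lower=0> P;
  int<lower=0> Q;
  vector[T] Effort;
  vector<lower=0>[T] sigma_Effort;   // assumed to be standard deviations
  matrix[T, P] X;
  matrix[T, Q] Z;
}
transformed data {
  int S = 500;                       // Monte Carlo draws per observation (assumed value)
}
parameters {
  vector[P] beta;
  vector[Q] rho;
  vector<lower=0>[T] Effort_hat;
}
transformed parameters {
  vector[T] mu = exp(X * beta);      // gamma mean
  vector[T] sigma = exp(Z * rho);    // gamma shape
  vector[T] alpha = sigma ./ mu;     // gamma rate
}
model {
  Effort_hat ~ gamma(sigma, alpha);
  Effort ~ normal(Effort_hat, sigma_Effort);
  beta ~ normal(0, 2);
  rho ~ normal(0, 2);
}
generated quantities {
  vector[T] log_lik;                 // does not condition on Effort_hat[t]
  for (t in 1:T) {
    log_lik[t] = integrated_log_lik_rng(Effort[t], sigma_Effort[t],
                                        sigma[t], alpha[t], S);
  }
}
```

The estimate can be noisy when the observation error is small relative to the spread of the gamma distribution, because few draws then land near the observed value. In that case more draws or a deterministic quadrature would likely be needed.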

Challen_Hyman · September 12, 2024, 5:36pm

Thanks!
