Ordinal model with category inflation not recovering parameters

I’ve been working with ordinal regression models that have category inflation, i.e. increased responses for certain categories beyond the normal response process. For instance, in some surveys, participants might choose the first value of the ordinal response scale to rush through the survey quickly, leading to inflated category 1 values.

I’ve recovered the parameters for more involved models, but the simple case below has been tripping me up.

The likelihood function is:

P( y | \kappa, \boldsymbol{\theta}) = \begin{cases} \kappa + (1-\kappa) \text{Categorical}(1 | \boldsymbol{\theta}) & \text{if } y = 1 \\ (1-\kappa)\text{Categorical}(y | \boldsymbol{\theta}) & \text{if } 1 < y \leq 3 \end{cases}

where y = {1,2,3} is the ordinal data, \kappa is the probability of drawing a category 1 from the inflation component (false 1), and \boldsymbol{\theta} is the vector of ordinal category probabilities for the true process.

I simulate data as:

set.seed(2020)
N <- 1000
# mean of latent scale generating the ordinal data
alpha <- 1
# sd of latent scale generating the ordinal data
sigma <- 1
# inflation probability
kappa <- 0.1

# simulate some inflated responses
is_inflated <- rbinom(N, 1, prob = kappa)

# probabilities of ordinal categories from the true process
theta <- rep(NA, 3)
theta[1] <- pnorm( (1.5 - alpha)/sigma )
theta[2] <- pnorm( (2.5 - alpha)/sigma ) - pnorm( (1.5 - alpha)/sigma )
theta[3] <- 1 - pnorm( (2.5 - alpha)/sigma )

# simulate the data
y <- rep(NA, N)
for(i in 1:N){
 # if inflated, y == 1
  if(is_inflated[i]){
    y[i] <- 1
  }
  if(!is_inflated[i]){ # not inflated
    # sample ordinal category
    y[i] <- sample(1:3, size = 1, replace=T, prob = theta)
  }
}

I Stan model code is:

data{
  int<lower=1> N;
  int J;
  int K;
  int<lower=1,upper=J> y[N];
  vector[K] thresh;
}

parameters{
  real alpha;
  real<lower=0> sigma;
  real<lower=0,upper=1> kappa;
}

model{
  vector[J] theta;
  alpha ~ normal(0, 5);
  sigma ~ normal(0, 1);
  kappa ~ beta(2, 2);
  
  theta[1] = Phi( (thresh[1] - alpha)/sigma);
  
  theta[2] = Phi( (thresh[2] - alpha)/sigma) - Phi( (thresh[1] - alpha)/sigma);

  theta[3] = 1 - Phi( (thresh[2] - alpha)/sigma);
  
  // likelihood
  for(n in 1:N){
    
    if(y[n] == 1){
      vector[2] lp;
      lp[1] = log(kappa);
      lp[2] = log1m(kappa) + categorical_lpmf(1 | theta);
      target += log_sum_exp(lp);
    }
    if(y[n] > 1){
      target += log1m(kappa) + categorical_lpmf(y[n] | theta);
    }
  }
}

However, running this model does not return the correct parameter values. In particular, it looks like the inflation component, kappa just draws from the prior.

Does anyone see an obvious mistake?

Thanks!

Code: ordinal-inflation.R (953 Bytes) ordinal-one-inflation.stan (752 Bytes)

Just to update this. It seems that this model is particularly difficult to estimate when there is not much structure in the data, compared to something like a zero-inflated Poisson likelihood which easily returns the parameter values for simple models.

By simply adding in a non-zero predictor variable, i.e.

x <- rnorm(N)
beta <- 1
mu <- alpha + beta * x
for(i in 1:N){
  theta <- rep(NA, 3)
  theta[1] <- pnorm( (1.5 - mu[i])/sigma )
  theta[2] <- pnorm( (2.5 - mu[i])/sigma ) - pnorm( (1.5 - mu[i])/sigma )
  theta[3] <- 1 - pnorm( (2.5 - mu[i])/sigma )
  
  if(is_inflated[i])  y[i] <- 1
  if(!is_inflated[i]) y[i] <- sample(1:3, size = 1, prob = theta)
}

[rest of the code the same]

I was able to correctly estimate the inflation component. Because most of my applications contain considerable structure (i.e. predictors, random effects), I didn’t catch this issue in the simpler applications.

Just to explain what is going on: when you have no predictors, low \kappa, and high \theta_1 can lead to exactly the same results as high \kappa, low \theta_1, so you have no way to get information about \kappa individually. But once you have at least a binary predictor, you are able to disentangle \kappa (which doesn’t change with predictors) from \theta_1 (which changes with predictors).

Now your model has some additional structure to \theta, so you probably can’t get exactly the same results, but I think it is likely changes in alpha and sigma can easily make theta[1] match almost any kappa. I would expect this to manifest as a correlation between kappa and alpha that should be visible in a pairs plot.

Does that make sense?

1 Like

@martinmodrak Yes, thanks, this is what is going on. I wrote my last post in a rush and didn’t show the pairs plots. In the models I have been applying to large data sets, there is considerable structure, and the pairs plots look good. When making a simple example, however, I took this for granted and forgot to look at the pairs plots.

I found your blog posts on non-identifiability really useful. Any ideas, beyond more informative priors, for helping this model?

1 Like

Happy to hear that :-)

Not sure what are the remaining problems: for the intercept-only model the problem is IMHO insurmountable - the data simply do not contain information that would let you determine \kappa with any precision. If I understand you correctly, the models with predictors work well - could you be more specific what you need to help with?

1 Like

Thanks @martinmodrak. I was interested in your thoughts on identifying the intercept-only model, but obviously there isn’t much that can be done.