To simulate an ordinal outcome, you need to follow the correct data generating assumptions.
An ordinal outcome y is assumed categorically distributed with K category probabilities \mathbf{\pi} = \{\pi_{1}, \pi_{2}, ..., \pi_{K}\}:
The category probabilities are generated from an underlying continuous scale split up into K bins using K - 1 thresholds, \mathbf{\theta} = \{\theta_{1}, ..., \theta_{K - 1}\}. With a suitable CDF, often the logistic but you could also choose the normal CDF, we can work out the probability in each bin. If we refer to the standard logistic CDF (with mean=0 and scale=1) as F, then:
where I’m assuming that \theta_{0} = - \infty and \theta_{K} = \infty so that F(\theta_{0} - \eta) = 0 and F(\theta_{K} - \eta) = 1. The \eta term holds the linear predictor, i.e. your ordinal predictor variable.
In R, you’d have something like (just quickly wrote this, it might contain errors):
# Generate a single ordinal data point
# @K number of ordinal categories
# @theta vector of latent cutpoints
# @eta linear predictor
gen_ordinal_data <- function(K, theta, eta){
if(length(theta) != (K - 1))
stop(paste0("theta must have length ", K - 1, " not ", length(K)))
probs <- c()
for(k in 1:K){
if(k == 1) prev_thresh = -Inf
else prev_thresh = theta[k - 1]
if(k == K) next_thresh = Inf
else next_thresh = theta[k]
probs[k] = plogis(next_thresh - eta) - plogis(prev_thresh - eta)
}
return(sample(K, 1, prob=probs))
}
Then, you can plug whatever you want in for eta
including your ordinal predictor variables.