I’m trying to implement an ordinal probit model with fixed threshold extremes (Liddell & Kruschke, 2018; Kruschke, 2014). The lower and upper limits of the thresholds are fixed to match the scale of the response. Rather than using the standard normal distribution, the \mu and \sigma of the underlying distribution are to be estimated.
I ran into two issues during sampling:
- There will be a lot of rejected initial values before sampling starts. Many times, sampling just won’t start because initial values all failed.
- I have to raise
adapt_delta
to some unusually large values such as 0.9999 to eliminate divergent transitions.
How can I improve my model and/or the sampling options to solve these problems?
JAGS implementation
This kind of models is employed in Liddell & Kruschke (2018) as well as in Chap 23 of Doing Bayesian Data Analysis 2nd ed (Kruschke, 2014). Below is Kruschke’s (2014) original JAGS implementation of a basic model for ordinal data from a single group.
Original JAGS model
model {
for ( i in 1:Ntotal ) {
y[i] ~ dcat( pr[i,1:nYlevels] )
pr[i,1] <- pnorm( thresh[1] , mu , 1/sigma^2 )
for ( k in 2:(nYlevels-1) ) {
pr[i,k] <- max( 0 , pnorm( thresh[ k ] , mu , 1/sigma^2 )
- pnorm( thresh[k-1] , mu , 1/sigma^2 ) )
}
pr[i,nYlevels] <- 1 - pnorm( thresh[nYlevels-1] , mu , 1/sigma^2 )
}
mu ~ dnorm( (1+nYlevels)/2 , 1/(nYlevels)^2 )
sigma ~ dunif( nYlevels/1000 , nYlevels*10 )
for ( k in 2:(nYlevels-2) ) { # 1 and nYlevels-1 are fixed, not stochastic
thresh[k] ~ dnorm( k+0.5 , 1/2^2 )
}
}
The two extremes of the thresholds are fixed to constant values:
thresh[1]=1+0.5
thresh[nYlevels-1] = nYlevels - 0.5
Stan implementation
Stan implementation of the above model has already been discussed in the linked thread:
I followed Gregory’s advice on the parameterization of the thresholds, and this is my Stan model:
data {
int<lower=0> N; // Ntotal
int<lower=2> K; // nYlevels
int<lower=1,upper=K> y[N];
}
transformed data {
real c_lower;
real c_upper;
c_lower = 1 + 0.5;
c_upper = K - 0.5;
}
parameters {
real mu;
real<lower=0> sigma;
vector<lower=c_lower,upper=c_upper>[K-3] c_;
}
transformed parameters {
ordered[K-1] c; // thresholds
simplex[K] theta; // probabilities
c[1] = c_lower;
c[K-1] = c_upper;
for (i in 2:(K-2)) {
c[i] = c_[i-1];
}
theta[1] = Phi((c[1] - mu) / sigma);
theta[K] = 1 - Phi((c[K-1] - mu) / sigma);
for (i in 2:(K-1)) {
theta[i] = Phi((c[i] - mu) / sigma) - Phi((c[i-1] - mu) / sigma);
}
}
model {
mu ~ normal((1 + K) / 2.0, K);
sigma ~ uniform(K / 1000.0, K * 10);
for (i in 1:(K-3)) {
c_[i] ~ normal(i + 1 + 0.5, 2);
}
y ~ categorical(theta);
}
Sampling issues of the Stan model
I fitted the model with rstan. The example data is from Kruschke (2014): OrdinalProbitData-1grp-1.csv (305 Bytes).
library(rstan)
df <- read.csv("OrdinalProbitData-1grp-1.csv")
S <- list(
N = length(df$Y),
K = 7,
y = df$Y
)
model <- "model.stan"
fit <- stan(model, data = S,
cores = 4, chains = 4, iter = 3000, warmup = 1000,
control = list(adapt_delta = 0.9999))
I ran into two issues:
1. Rejected initial values
There will be a lot of rejected initial values before sampling starts.
Many times, sampling just won’t start because initial values all failed.
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: validate transformed params: c is not a valid ordered
vector. The element at 3 is 4.43762, but should be greater than the previous
element, 5.87994 (in 'model416d74ee2c8c_ordinal_1group' at line 27)
2. Divergent transitions
Even adapt_delta = 0.99
is not enough to remove the divergency.
I have to raise adapt_delta
to some unusually large values such as 0.9999 to eliminate divergent transitions.
Warning messages:
1: There were 2 divergent transitions after warmup. Increasing adapt_delta
above 0.99 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: Examine the pairs() plot to diagnose sampling problems
How can I improve my model and/or the sampling options to solve these problems?