Initialization error in Ordered Probit Model

# Preparing the data
library(rstan)
Sys.setenv(LOCAL_CPPFLAGS = '-march=native')
rm(list = ls())
set.seed(1234)
x <- as.matrix(runif(n = 10000, 1, 10))
beta <- rnorm(n = 10000, 5, 2)
g <- 5
alpha <- 4
A <- rnorm(n = 10000, 8, 4)   # random effect
b <- rnorm(n = 10000, 10, 3)  # random effect
u <- x * g + beta * alpha + A + b
theta <- c(53, 65, 77)
z <- numeric(10000)

# Assign each observation to one of four ordered categories using the cutpoints
for (i in 1:10000) {
  if (u[i] < 53) {
    z[i] <- 1
  } else if (53 <= u[i] & u[i] < 65) {
    z[i] <- 2
  } else if (65 <= u[i] & u[i] < 77) {
    z[i] <- 3
  } else {
    z[i] <- 4
  }
}

K = as.integer(4)
N = as.integer(10000)
D = as.integer(1)
sim_data <- list(K = as.integer(4),
                 N = as.integer(10000),
                 D = as.integer(1),
                 z = z,
                 x = x,
                 alpha = alpha,
                 beta = beta,
                 A = A,
                 b = b)

stanmodel_code <- '
data {
  int<lower=2> K;
  int<lower=0> N;
  int<lower=1> D;
  int<lower=1,upper=K> z[N];
  row_vector[D] x[N];
  real alpha;
  row_vector[N] beta;
  row_vector[N] A;
  row_vector[N] b;
}
parameters {
  vector[D] g;
  ordered[K-1] theta;
}
model {
  vector[K] w;
  for (n in 1:N) {
    row_vector[N] u;
    u[n] = x[n] * g + beta[n] * alpha + A[n] + b[n];
    w[1] = 1 - Phi(u[1] - theta[1]);
    for (k in 2:(K-1))
      w[k] = Phi(u[n] - theta[k-1]) - Phi(u[n] - theta[k]);
    w[K] = Phi(u[n] - theta[K-1]);
    z[n] ~ categorical(w / sum(w));
  }
}'

mod <- stan_model(model_code = stanmodel_code, verbose = TRUE)
fit <- sampling(mod, data = sim_data, iter= 1000)
fit_vb <- vb(mod)

list_of_draws_vb <- extract(fit_vb) #extract the output
mean(list_of_draws_vb$g)
mean(list_of_draws_vb$theta[,1])
mean(list_of_draws_vb$theta[,2])
mean(list_of_draws_vb$theta[,3])

This is the error:
Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: Phi: x is nan, but must not be nan! (in 'model36a85db92aa4_99256f95fd8db93be7642f2f8c126869' at line 22)

Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
Error in sampler$call_sampler(c(args, dotlist)) : Initialization failed.


I’ve read through your code, and I think the problem is in the line that uses g. There, g is declared earlier but never assigned a value, which may be the cause of the NaN error reported.
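
Another thing that may be worth checking is how u is indexed inside the loop. The R sketch below only mimics the pattern; it is not a confirmed diagnosis. It relies on the fact that Stan fills a freshly declared local vector with NaN, and it imitates declaring u inside the loop, assigning only u[n], and then reading u[1]. The helper name first_cutpoint_prob is made up for illustration, and pnorm stands in for Stan's Phi. R's pnorm quietly returns NaN, whereas Stan's Phi raises the exception shown in the error message.

```r
# Hypothetical helper: mimics one loop iteration of the model block.
# n      : observation index
# u_n    : linear predictor for observation n
# theta1 : first cutpoint
first_cutpoint_prob <- function(n, u_n, theta1, N = 5) {
  u <- rep(NaN, N)            # fresh local vector, NaN-filled like a Stan local
  u[n] <- u_n                 # only element n is assigned in this iteration
  1 - pnorm(u[1] - theta1)    # reads element 1, not element n
}

p1 <- first_cutpoint_prob(n = 1, u_n = 0.3, theta1 = 0)  # u[1] was set
p2 <- first_cutpoint_prob(n = 2, u_n = 0.3, theta1 = 0)  # u[1] was never set
print(c(p1, p2))
```

If this is what happens in the model, the first iteration would evaluate fine and every later one would pass NaN to Phi.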

Note also that you define u as a row_vector, but you only ever reference one element at a time, so you could just declare a real.
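
A sketch of what that suggestion could look like, kept as an R string in the style of the original post. The data and parameters blocks are copied from the post; the only change is that u is a local real, assigned before every use. The name stanmodel_code_real_u is made up here, and the program is untested against the poster's data. It addresses the NaN only: since the simulated cutpoints sit around 53 to 77, initial values for theta may still need attention, as the error message itself hints.

```r
# Sketch only: same data and parameters blocks as in the post,
# with u declared as a scalar inside the loop.
stanmodel_code_real_u <- '
data {
  int<lower=2> K;
  int<lower=0> N;
  int<lower=1> D;
  int<lower=1,upper=K> z[N];
  row_vector[D] x[N];
  real alpha;
  row_vector[N] beta;
  row_vector[N] A;
  row_vector[N] b;
}
parameters {
  vector[D] g;
  ordered[K-1] theta;
}
model {
  vector[K] w;
  for (n in 1:N) {
    // scalar linear predictor for observation n
    real u = x[n] * g + beta[n] * alpha + A[n] + b[n];
    w[1] = 1 - Phi(u - theta[1]);
    for (k in 2:(K-1))
      w[k] = Phi(u - theta[k-1]) - Phi(u - theta[k]);
    w[K] = Phi(u - theta[K-1]);
    z[n] ~ categorical(w / sum(w));
  }
}
'
```

The string could then be passed to stan_model(model_code = stanmodel_code_real_u) in place of the original one.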

On the R side, you can use 4L instead of as.integer(4).
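
A small sketch of that R-side remark, showing that the L suffix gives the same integer as as.integer(). The list name sim_dims is made up for illustration and covers only the dimension entries of the data list.

```r
# 4L is an integer literal; as.integer(4) converts a double to the same value.
stopifnot(identical(4L, as.integer(4)))

# Hypothetical compact version of the dimension entries in the data list.
sim_dims <- list(K = 4L, N = 10000L, D = 1L)
str(sim_dims)
```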
