Rejection of initial values

Hello,
I am new to this forum. I have an issue with my implementation of the multivariate stochastic volatility model as in Meyer and Yu (2006). I implemented the model in Stan (see the attached MSV.stan file).
I simulate data and try to fit the model with the R code below. However, I receive the following error message.

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"

and I cannot figure out what the problem is. For debugging I printed out the parameter values, but they look fine. The content of the diagnostic_file

Sample generated by Stan
stan_version_major=2
stan_version_minor=16
stan_version_patch=0
init=random
enable_random_init=1
seed=1692406610
chain_id=1
iter=1000
warmup=500
save_warmup=1
thin=1
refresh=100
stepsize=1
stepsize_jitter=0
adapt_engaged=1
adapt_gamma=0.05
adapt_delta=0.8
adapt_kappa=0.75
adapt_t0=10
max_treedepth=10
sampler_t=NUTS(diag_e)
diagnostic_file=diag.txt
append_samples=0

does not help me either, so I hope that one of you might know how to help me.

Best regards and thanks in advance

Chris

library(rstan)

simMSV <- function(N, k, B, Seps, mu, phi, Seta, TT) {
  library(mvtnorm)
  # Innovations: eta drives the log-volatilities, u the factors, eps the observation noise
  eta <- matrix(rmvnorm(TT, rep(0, k), Seta), nrow = TT, ncol = k)
  u <- matrix(rmvnorm(TT, rep(0, k), diag(k)), nrow = TT, ncol = k)
  eps <- matrix(rmvnorm(TT, rep(0, N), Seps), nrow = TT, ncol = N)
  h <- matrix(0, nrow = TT, ncol = k)
  h[1, ] <- mu
  for (t in 2:TT) {
    # AR(1) log-volatility per factor; note the row index h[t - 1, ], not h[t - 1]
    h[t, ] <- mu + phi * (h[t - 1, ] - mu) + eta[t - 1, ]
  }
  f <- exp(h / 2) * u
  y <- B %*% t(f) + t(eps)
  y <- t(y)
  result <- matrix(cbind(y, exp(h / 2)), nrow = TT, ncol = k + N)
  colnames(result) <- c(sprintf("Y%d", seq(1, N, 1)), sprintf("vol%d", seq(1, k, 1)))
  return(result)
}
R = 2
B = matrix(c(1,1,1,0,1,1.2),nrow = 3,ncol = 2)
Seps = matrix(c(0.49,0,0,0,0.09,0,0,0,0.2),nrow = 3, ncol = 3)
mu = c(-1,0.5)
phi = c(0.99,0.9)
Seta = matrix(c(0.1,0,0,0.4),nrow = 2,ncol = 2)
TT = 100
N = 3
k = 2

res = vector("list", R)
data = vector("list", R)
vol = matrix(data = NA, nrow = TT,ncol =R)
X = vector()
for (i in 1:R){
x = simMSV(3,2,B,Seps,mu,phi,Seta,TT)
X = cbind(X,x)
data[[i]] = list(y = t(x[,1:N]),TT=TT, N = 3, k = 2)
vol[,i] = t(x[,3])
}
write(X, file = "True_data")

BurnIn = 500
Sampling = 1000
thin = 1
nchains = 1
#MonteCarlo <- foreach(i = 1:R, .combine = cbind, .packages = 'rstan') %dopar% {
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
fit <- stan(file = 'MSV2.stan', data = data[[i]], pars = c('B','phi','mu','seta2','seps2','h'),
            iter = Sampling, warmup = BurnIn, thin = thin, chains = nchains, cores = nchains, diagnostic_file = "diag")
res = summary(fit,probs = c(0.025,0.5,0.975))$summary

MSV2.stan (1.9 KB)

Wasn’t there more error output? There should be some more specific messages.

This is with initial values equal to zero:

fit <- stan(file = 'MSV2.stan', data = data[[i]], pars = c('B','phi','mu','seta2','seps2','h'),
+     iter = Sampling,warmup = BurnIn,th .... [TRUNCATED]
    

DIAGNOSTIC(S) FROM PARSER:
Warning: integer division implicitly rounds to integer. Found int division: ((((2 * k) * N) - (k * k)) - k) / 2
Positive values rounded down, negative values rounded up or down in platform-dependent way.
WARNING: left-hand side variable (name=h) occurs on right-hand side of assignment, causing inefficient deep copy to avoid aliasing.

z[[1,-2147483648],[2,3]]

SAMPLING FOR MODEL ‘MSV2’ NOW (CHAIN 1).
h =[[0,0],[0,0],[0,0],[0,0],[0,0],[0,0],[0,0],[0,0],[0,0],[0,0]]
B = [[1,0],[0,1],[0,0]]
phi = [0,0]
mu = [0,0]
Seps2[[1,0,0],[0,1,0],[0,0,1]]
Seta2[[1,0],[0,1]]
[0,0,0][[1,0,0],[0,1,0],[0,0,0]]

Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: multi_normal_lpdf: LDLT_Factor of covariance parameter is not positive definite. last conditional variance is 0. (in 'model3318d81138e_MSV2' at line 76)

[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"

res = summary(fit,probs = c(0.025,0.5,0.975))$summary
Stan model ‘MSV2’ does not contain samples.

capture.output(res,file = sprintf("intermediate_results_%s.txt",startingdate), append = TRUE)

write(i,file=sprintf("progress_%s.txt",startin … [TRUNCATED]

The same message appears if I supply user-specified initial values. With random initial values, it appears 100 times.
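
For what it's worth, the last matrix printed before the rejection, [[1,0,0],[0,1,0],[0,0,0]], can be tested for positive definiteness outside Stan. A minimal sketch in base R, assuming (the model file is not shown in the thread) that this matrix is the covariance handed to multi_normal at line 76:

```r
# Matrix copied from the last line printed before "Rejecting initial value".
# Assumption: this is the covariance argument of multi_normal in the model.
Sigma <- matrix(c(1, 0, 0,
                  0, 1, 0,
                  0, 0, 0), nrow = 3, byrow = TRUE)

# A valid covariance for multi_normal must have strictly positive eigenvalues.
ev <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values
is_pd <- all(ev > 1e-10)

print(ev)
print(is_pd)
```

If `is_pd` comes out FALSE, no choice of initial values can help as long as the matrix keeps that structure, which would fit with the error not depending on how the chain is initialized.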

The most common cause of something like this is local variables in the transformed parameters or model block that haven’t been initialized (they default to NaN, or to arrays/matrices of NaNs). That causes problems with basically everything downstream: the target lp__ gets set to NaN and the initial value is rejected.

So I’d look for those if I were you. Commenting out sections of code is an effective way of hunting these types of bugs down.

Also make sure your data doesn’t have any NaNs in it – that’ll cause it too.
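
A quick way to run that data check before calling stan(), sketched in base R. The `stan_data` list here is a made-up stand-in with a deliberately planted NA, not the actual simulated data from the script above:

```r
# Hypothetical data list shaped like the one passed to stan(); the NA is planted.
stan_data <- list(
  y  = matrix(c(0.1, -0.3, NA, 0.7, 0.2, 1.1), nrow = 3),
  TT = 2, N = 3, k = 2
)

# Flag every element that contains NA, NaN or +/-Inf.
has_bad <- vapply(stan_data,
                  function(x) any(!is.finite(unlist(x))),
                  logical(1))
print(has_bad)
```

Any element flagged TRUE should be fixed before looking for problems in the model itself.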

So I tried what you suggested. I printed out all of the transformed parameters and transformed data, plus all the parameters. Everything looks fine.
There is, however, one thing that I suspect. In the transformed parameters block I build up the matrix B, which has dimensions (N x k) and is therefore not necessarily square. This matrix should have ones on the diagonal and zeros above it. That is what I use the auxiliary matrix z for, which I generate in the transformed data block. Initially I wanted to do something like this:
int tmp ;
tmp = 1;
for (i in 1:k) {
  for (j in i:k) {
    B[1:i-1,i] = rep_vector(0,i-1);
  }
  B[i,i] = 1;
}
for (i in 1:k) {
  for (j in i:N-1) {
    B[j+1,i] = b[tmp] ;
    tmp = tmp + 1;
  }
}

but Stan said it is not possible to declare integers in the parameters or transformed parameters block. So I tried with a real, but then tmp is not accepted as a vector index. Declaring tmp in another block did not work either. So I came up with the following idea:

transformed data {
  int Z ;
  int z[N-1,k] ;
  int tmp ;
  // number of free loadings below the unit diagonal of the N x k matrix B
  Z = (2*k*N - k*k - k)/2 ;
  tmp = 1;
  for (i in 1:k) {
    print("tmp", tmp);
    for (j in i:N-1) {
      z[j,i] = tmp ;
      tmp = tmp + 1 ;
    }
  }
}
which works. However, I get an (N-1) x k array of integers, and I do not need all of them.
This is what is printed for z, given that N = 3 and k = 2:
z[[1,-2147483648],[2,3]]
I only use the entries of z that were assigned, so the -2147483648 entry is not used later on.

Could this cause the error? If yes, any suggestions on how to increase the index of a vector in each loop iteration by 1?

Best and thanks a lot
Chris

You can declare the int variable in a local block inside the transformed parameters block. I’m not certain about this, but I’m pretty sure it works. Something like this partial sketch:

transformed parameters {

vector[3] x blah;

{
int tmp;
for (i in ...
  tmp = tmp + 1;

}
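
The counter logic itself can be checked in R before porting it into such a local block. This is only a sketch: `fill_loadings` is a hypothetical helper name, and `b` is assumed to hold the free loadings column by column, as in the loop posted above.

```r
# Build the N x k loading matrix B: ones on the diagonal, zeros above it,
# and the free elements of b filled in below the diagonal, column by column.
fill_loadings <- function(b, N, k) {
  # Number of free elements, same formula as Z in the transformed data block.
  stopifnot(length(b) == (2 * k * N - k * k - k) / 2)
  B <- matrix(0, nrow = N, ncol = k)
  tmp <- 1L                          # running index into b
  for (i in seq_len(k)) {
    B[i, i] <- 1
    for (j in seq_len(N - i)) {      # rows i + 1, ..., N of column i
      B[i + j, i] <- b[tmp]
      tmp <- tmp + 1L
    }
  }
  B
}

B <- fill_loadings(c(1, 1, 1.2), N = 3, k = 2)
print(B)
```

With these inputs the result should match the B used to simulate the data. In Stan the same two loops would sit inside the local braces, with tmp declared as a local int and set to 1 before the loops.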

So your suggestion worked. I do not need the auxiliary matrix z anymore. However I still get the same error.

Bummer. The next step is probably just to comment out sections of the likelihood until things do work, and then trace back from there.

Problem solved. My variance-covariance matrix was missing a part :( Sorry for causing you guys the trouble. Now the model runs. I had to add Seps2 to the variance-covariance matrix of y.
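
A sketch of why adding Seps2 matters, in base R. It assumes the conditional covariance of y_t has the factor form B diag(exp(h_t)) B' + Seps2, which is what the simulation code implies (y = B f + eps with f = exp(h/2) * u); the values of B, h_t and Seps2 are the initial values printed earlier in the thread.

```r
N <- 3; k <- 2
# Initial values as printed in the sampler output.
B <- matrix(c(1, 0,
              0, 1,
              0, 0), nrow = N, byrow = TRUE)
h_t <- c(0, 0)
Seps2 <- diag(N)

# Factor part only: rank is at most k < N, so it cannot be positive definite.
factor_part <- B %*% diag(exp(h_t), nrow = k) %*% t(B)
# Adding the idiosyncratic variances restores full rank.
full_cov <- factor_part + Seps2

min_eig <- function(S) {
  min(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
}
print(min_eig(factor_part))
print(min_eig(full_cov))
```

The factor part alone reproduces the singular matrix from the error output, while the sum has all eigenvalues bounded away from zero.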
