MCMC issue with rejecting initial value

I am currently trying to approximate the Poisson distribution with Normal for my prior in the Bayesian model. The code has no error, but then I keep encountering this error, which I think is due to the very sparse data feed into the model.

data {
  int<lower=1> ncol;
  int<lower=1> nrow;
  vector[ncol] yH;
  vector[nrow] x;
  #int x[nrow];
  matrix[nrow, ncol] A;
  vector[nrow] sigma_x;
  vector[ncol] sigma_y;
  #matrix[nrow,ncol] sigma_x;
  #matrix[nrow,ncol] sigma_y;
  vector[nrow] epsilon;
}

parameters {
   vector<lower = 0>[ncol] yT;
}

model {
  yT ~ normal(yH, sqrt(yH)); //to approximate the fact that yT ~ poisson(yH)

   x ~ normal(A*yT + epsilon, sigma_x);
   //x ~ multi_normal(A*yT + epsilon, sigma_x);
}

I used the following R code to feed in the data and run:

foreach(i = 1:20) %dopar% {
nrow = nrow(rte_m[[i]]);
ncol = ncol(rte_m[[i]]);
A <- as.matrix(rte_m[[i]]);
sigma_x <- as.vector(sample.int(10, nrow(kf_vect[[i]]), replace=TRUE))
sigma_y <- as.vector(eps_vect[[i]])
yH <- as.vector(dh_vect[[i]]$X);
yT <- yH + as.vector(eps_vect[[i]]);
epsilon <- sample.int(15, nrow(kf_vect[[i]]), replace=TRUE)
x <- round(as.vector(as.matrix(rte_m[[i]])%*%yT) + epsilon)
iterations = 200;
#input it into our Stan model file “stamodeling.stan”
stanmodel1 <- stan_model(file = “poissnorm.stan”,
model_name = “stanmodel1”);

#NUTS (No U-Turn) sampler to generate posterior distribution
stanfit <- sampling(stanmodel1, cores = parallel::detectCores(), data = list(ncol = ncol,nrow = nrow,
yH = yH,
x=x, epsilon = epsilon,
A = A, sigma_y = sigma_y)
,iter=iterations, chains = 3, control = list(max_treedepth=13));
print(i)

When I run, it starts very fast, but then it gave me this error message that I cannot fix:

Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: normal_lpdf: Random variable[1] is nan, but must not be nan! (in ‘model23728163056_stanmodel1’ at line 24)

I remember I could add something to change the initial starting point of MCMC, but I am not sure if that even helps in this case. Can someone please help me on this issue? I encountered it twice, but failed to overcome it:((

It’s time to add some print statements on the calculations feeding into line 24 and figure out what turns into a NaN.

Thank you for your help. I checked that too, and nothing was NaN before it was put into the blackbox sampling(). My guess is because the data is too “weird” that when it was put into the blackbox, the Jacobian’s calculation got screwed (please give it a run yourself if you could). It is quite heavy though, but I am surprised the above does not work because when I change \sqrt{yH} into \sigma_y, the code works (but very slow).

Data Link

Add print to the Stan program.

1 Like

I just tried it. Nothing came out as “outlier.” No NaN. So weird…

Is sqrt(yH) equal to sigma_y? If any of the yH is negative, you’ll run into problems.

I don’t see where the X came from defining yH in the R code.

1 Like

\sqrt{yH} is not the same as sigma_y. The reason is I want to approximate Poisson(yH) by Normal(yH, \sqrt{yH}) (as I cannot achieve a successful marginalization with the actual Poisson for the parameter yT). The x came from this line:

x ← round(as.vector(as.matrix(rte_m[[i]])%*%yT) + epsilon)

One line above shows the link between yT and yH. So X does come from yH, just not directly based on my R code:)

All components of vector yH are greater than or equal to 0. Lots of 0 though, so this might cause the trouble? If that’s the case, I plan to add a uniform number between (0,5) into \sqrt{yH}, although that makes the variance of N(yH, \sqrt{yH}) not equal to that of Poisson(yH)