I encountered an issue when trying to fit a geometric mixture model with censored data, so I set up a simpler version and ran into the following problem: when the probability used to generate my random geometric sample is below 0.05, Stan cannot find valid initial values. I tried initializing at zero, but only limiting the number of censored values to a maximum of 5 leads to a fit. Setting a strong prior on the Intercept did not help either. I would be very grateful for any help here!
It throws this error (many times):
Chain 1: Rejecting initial value:
Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
Chain 1: Stan can't start sampling from this initial value.
The data is right censored: for x_cens it is known that the uncensored value is at least as large as the censored value. In principle it is the same situation as in the Stan manual, except that each x_cens has an individual lower bound.
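To make the censoring setup concrete, here is a minimal Python sketch of the likelihood being described. It assumes a geometric distribution counting failures (support 0, 1, 2, ...), and the function names are hypothetical, not taken from the model in question. Each censored observation contributes log P(X >= c) with its own bound c. Note that a Stan `_lccdf` function gives log P(X > c), so an "at least c" bound would correspond to passing c - 1 there.

```python
import math

def geometric_logpmf(k, p):
    """log P(X = k) for a geometric count of failures, k = 0, 1, 2, ..."""
    return math.log(p) + k * math.log1p(-p)

def geometric_log_survival(c, p):
    """log P(X >= c): the first c trials must all be failures."""
    return c * math.log1p(-p)

def censored_loglik(x_obs, x_cens, p):
    """Observed values use the pmf; each censored value uses its own lower bound."""
    ll = sum(geometric_logpmf(k, p) for k in x_obs)
    ll += sum(geometric_log_survival(c, p) for c in x_cens)
    return ll
```

Because the survival term is available in closed form for the geometric case, it stays finite for any 0 < p < 1, which makes it a useful reference when checking a more general implementation.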
There must be a difficulty evaluating the likelihood.
The question would be figuring out what combination of x_cens and Intercept causes problems.
Would you mind investigating this a bit more and reporting back?
I think you could use "expose_stan_functions" to expose this function to R and then figure out where it is breaking (expose_stan_functions is documented here: https://cran.r-project.org/web/packages/rstan/rstan.pdf; you might have to write a wrapper around neg_binomial_2_lccdf to get it to export, I'm not sure).
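As a rough illustration of that kind of investigation, the Python sketch below scans a grid of means and censoring bounds and flags the combinations where a log CCDF comes out as negative infinity. It is a stand-in for the R approach, not a replacement for it. The "naive" function forms the tail as 1 - CDF in double precision, which is one way log(0) can arise. Whether that is what actually happens inside neg_binomial_2_lccdf is an assumption, not something established here. The geometric parametrisation (mean `mu`, failures counted from 0) and all names are hypothetical.

```python
import math

def naive_log_ccdf(c, mu):
    """log P(X >= c) computed as log(1 - CDF); fragile by construction."""
    p = 1.0 / (1.0 + mu)          # geometric on 0, 1, 2, ... with mean mu
    cdf = 1.0 - (1.0 - p) ** c    # P(X < c)
    tail = 1.0 - cdf              # rounds to exactly 0 once cdf rounds to 1
    return math.log(tail) if tail > 0.0 else float("-inf")

def stable_log_ccdf(c, mu):
    """Same quantity in closed form, c * log(1 - p), which stays finite."""
    p = 1.0 / (1.0 + mu)
    return c * math.log1p(-p)

def find_failures(mus, bounds):
    """Return every (mu, bound) pair where the naive evaluation gives -inf."""
    return [(mu, c) for mu in mus for c in bounds
            if math.isinf(naive_log_ccdf(c, mu))]
```

Calling `find_failures([0.5, 100.0], [50, 500])` flags only the pairs with the small mean, which matches the pattern of a small location parameter combined with large censoring bounds.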
Thanks to your explanation I was able to solve the issue, although in a presumably cheap way...
Since the algorithm only fails when the data is generated with a fairly small p for the geometric distribution, the model must have had problems with small values for "Intercept" (the neg_binomial distribution is parametrised as 1/p which is a location parameter).
All I had to do to fit the thing was to pass initial values larger than ~ 20 (the true location parameter that generated the data is 100).
In terms of providing custom initial conditions, the goal is to make sure they are overdispersed so the Rhat diagnostics work right.
The goal is to have a bunch of chains start in a bunch of different locations, take a bunch of different paths, and end up in the same place.
What you want to avoid is chains that start in different locations but are all far from the solution in roughly the same way, so they take the same paths and trick you into thinking they have converged to one thing before they actually reach the solution.
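For readers who want to see what the diagnostic is comparing, here is a minimal sketch of the classic (non-split) Rhat for a single parameter: it contrasts the variance between chain means with the average variance within chains. Stan's own diagnostic is more refined than this, so treat the function below as an illustration of the idea rather than a reimplementation.

```python
import math
from statistics import mean, variance

def rhat(chains):
    """Classic potential scale reduction for equal-length chains of draws."""
    n = len(chains[0])
    # W: average of the within-chain sample variances.
    within = mean([variance(chain) for chain in chains])
    # B: n times the sample variance of the chain means.
    between = n * variance([mean(chain) for chain in chains])
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)
```

Chains that sit in different places give a value well above 1. Chains that all start close together and move along the same path can give a value near 1 even though none of them has reached the solution, which is why overdispersed starting points matter.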
So initial conditions normal(100, 20) would probably be fine (you'd want that 20 to be larger than what you'd expect the standard deviation of the posterior of the Intercept to be). normal(20, 1) means everything basically starts in the same place.
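A small sketch of drawing such starting values, one per chain, from normal(100, 20). The parameter name "Intercept" comes from this thread. The function name, the one-dictionary-per-chain layout, and the redraw of non-positive values are assumptions made for illustration. How the values are handed to Stan depends on the interface and is not shown.

```python
import random

def overdispersed_inits(n_chains, center=100.0, spread=20.0, rng=None):
    """Draw one starting value per chain from normal(center, spread)."""
    rng = rng or random.Random()  # left unseeded unless the caller supplies one
    inits = []
    for _ in range(n_chains):
        value = rng.gauss(center, spread)
        while value <= 0.0:       # assumption: the Intercept must be positive
            value = rng.gauss(center, spread)
        inits.append({"Intercept": value})
    return inits
```

For example, `overdispersed_inits(4)` returns four dictionaries whose values differ from run to run. Swapping in `spread=1.0` reproduces the situation where every chain starts in essentially the same place.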
It very likely doesn't matter (it's not like we're doing anything but guessing with the defaults, and we probably get it wrong a lot of the time), just lettin' ya know.
Also I recommend you don't fix seeds. Even in development work. It leaves you vulnerable to weird seed-specific bugs, and I honestly don't think it gives you much (unless you're hunting very specific C++ segfaults or something weird).