Hi,
Before I start: the model is long and not straightforward, so to be clear, I’m not expecting anyone to understand and debug it for me. I have some experience with Stan, but I’m completely lost about what’s happening here.
I have a model of sentence processing that struggles to start on different subsets of my data (reaction times of words in ~200 sentences). The likelihood always seems to exist at the initial value, but depending on the subset of data, the model fails to start, reporting “Gradient evaluated at the initial value is not finite.”
I’m attaching the model here, ez_hmm_1.stan (7.6 KB), for reference. I simulated the data, fixed all the parameters except one, and identified some sentences that the model can’t deal with. That is, for some of the sentences the model runs fine, but the following one, for example, is problematic. As far as I can tell, there is nothing special about it.
```r
list(lnfreq = c(-3.21959716340167, -10.6723729927226, -3.96878841691886,
-3.78636690441316, -5.15911824237188, -3.52508933292279, -11.7190576704579,
-9.32291639942472, -3.21959716340167, -6.41942791246034, -6.75261349382341,
-5.55571242934684, -4.62808966562588, -8.76201613003949, -3.78636690441316,
-8.02507079950312, -3.52508933292279, -9.40283110709808, -8.24746545540302,
-8.94316117193497), wrapup_f = c(FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE), RT = c(166.490950141974,
300.90493554587, 181.674439595697, 221.628223694201, 152.920867070158,
308.119471128247, 125.048953278296, 295.45315025367, 172.506950010066,
125.644470180417, 317.8413699003, 163.524005409949, 180.510772105012,
131.258886202743, 381.342819843197, 150.222320717369, 206.504485197978,
162.884834661625, 384.054060772294, 164.271852972529), word_pos = c(1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20), N_obs = 20L, verbose = 0, onlyprior = 0)
```
I’m using it to try to debug the model: here I use one iteration, an initial value of 0, and a print statement for the likelihood.
```r
fit_ez_hmm <- stan("ez_hmm_1.stan",
                   data = ls_pred, init = 0, iter = 1, chains = 1)
fit_ez_hmm
```
And I get:
```
The NEXT version of Stan will not be able to parse your Stan program.
Please open an issue at
https://github.com/stan-dev/stanc3/issues
if you can share or at least describe your Stan program. This will help ensure that Stan
continues to work on your Stan programs in the future. Thank you!
This message can be avoided by wrapping your function call inside suppressMessages().
c("0", "\nSyntax error in 'string', line 117, column 16 to column 17, parsing error:\n\nExpected a \";\" after \"print(...)\".\n\n\n")
hash mismatch so recompiling; make sure Stan code ends with a blank line
SAMPLING FOR MODEL 'ez_hmm_1' NOW (CHAIN 1).
Chain 1: [-4.28026,-5.06026,-4.54302,-6.03513,-4.54653,-5.75279,-5.25075,-5.90587,-4.54448,-5.43742,-5.7158,-4.53386,-4.67623,-5.08949,-6.39984,-4.67007,-5.29535,-4.64996,-6.53748,-4.59851]
Chain 1: [-4.28026,-5.06026,-4.54302,-6.03513,-4.54653,-5.75279,-5.25075,-5.90587,-4.54448,-5.43742,-5.7158,-4.53386,-4.67623,-5.08949,-6.39984,-4.67007,-5.29535,-4.64996,-6.53748,-4.59851]
Chain 1: Rejecting initial value:
Chain 1: Gradient evaluated at the initial value is not finite.
Chain 1: Stan can't start sampling from this initial value.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
error occurred during calling the sampler; sampling not done
```
Notice that the printed values in [] are the log-likelihoods of each word in the sentence. None of them is NaN or infinite, so why can’t the model start?
I exposed the likelihood function and tried it with different values, and it seems fine. In fact, I plotted it against possible values of alpha (the true value is 2; it is constrained to be positive, and it is the only parameter in my model in these tests), and it looks nice and smooth.
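A plot of the value being smooth does not by itself show that the derivative is finite at the initial point, so a numerical gradient check is a natural complement. The following is only a minimal sketch in base R: `toy_loglik` is a hypothetical stand-in (not the function in ez_hmm_1.stan), and `unconstrained_lp` and `fd_gradient` are illustrative names. It assumes alpha is declared with a lower bound of 0, so Stan works with u = log(alpha) and `init = 0` corresponds to alpha = 1.

```r
# Hypothetical stand-in for the exposed likelihood function; NOT the real model.
# Its value at alpha = 1 is finite (0), but its slope there is unbounded.
toy_loglik <- function(alpha) -sqrt(abs(alpha - 1))

# Log density on the unconstrained scale u = log(alpha).
# The "+ u" term is the log Jacobian of the exp() transform.
unconstrained_lp <- function(u, loglik) loglik(exp(u)) + u

# Forward finite-difference slope at u for a range of shrinking step sizes.
# If the results keep growing as the step shrinks, the gradient is suspect.
fd_gradient <- function(u, loglik, steps = 10^(-(2:6))) {
  sapply(steps, function(h) {
    (unconstrained_lp(u + h, loglik) - unconstrained_lp(u, loglik)) / h
  })
}

# At init = 0 (alpha = 1): the value is finite, the slope estimates diverge.
print(unconstrained_lp(0, toy_loglik))
print(fd_gradient(0, toy_loglik))

# A smooth comparison case, where the estimates settle on one number.
print(fd_gradient(0, function(alpha) -(alpha - 2)^2))
```

Swapping `toy_loglik` for the exposed function (with the data fixed) would show whether the slope at alpha = 1 behaves for the problematic sentence. If a stanfit object is available, rstan's `log_prob()` and `grad_log_prob()` could be compared in the same way.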
What can be happening here?
I’m using rstan_2.21.1, and I’ve also tried with cmdstanr_0.0.0.9008 using cmdstan-2.23.0.
Thanks!
Bruno