In my generated quantities section I have the code below. Notice that the first three vectors are of size n_trials. I incorrectly assigned n_patients zeros to these vectors. The number of patients is much larger than the number of trials here.
Stan compiles this and does not report any errors (I guess at compile time it doesn’t know the values of those size variables). At runtime, I get the inexplicable error below:
Storage capacity [37401] exceeded while writing value of size [211] from position [37395]. This is an internal error, if you see it please report it as an issue on the Stan github repository. (in '/tmp/Rtmptz2aEN/model-70e1a0dcb08.stan', line 136, column 10 to column 38)
What I would expect is that an error would be reported at runtime, right where the incorrectly sized initialization happens.
Me, too. That was my original intent with the language design. If we had a do-over, that’s what we’d do. The problem we have is that many Stan programs now depend on this bug and fixing it would break a lot of programs. So we’ve decided not to fix it. Sorry if it’s making it harder to debug.
Thanks for your response @Bob_Carpenter. So how is this bug used? Just asking to see if I can also take advantage of it! :-) Does it by any chance allow things like
Using your code as a generated quantities block and creating arbitrary values for the number of patients and trials did yield an exception (i.e., not an internal error):
Exception: vector assign rows: assigning variable trial_K_obs_condition_number (37401) and right hand side rows (67401) must match in size (in '../../ml/stanc3/foo.stan', line 6, column 2 to column 75)
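For reference, a minimal program along the following lines should be enough to trigger this kind of size-mismatch exception when compiled without optimization. This is only a sketch: the variable name is taken from the error message and the size names from the original post, but the data block and overall layout are assumptions, not the original poster's actual model.

```stan
data {
  int<lower=0> n_trials;    // e.g. 37401 in the message above
  int<lower=0> n_patients;  // e.g. 67401 in the message above
}
generated quantities {
  // Declared with n_trials rows but initialized with n_patients rows.
  // Without O1, the assignment is size-checked and throws at runtime.
  vector[n_trials] trial_K_obs_condition_number = zeros_vector(n_patients);
}
```

Since the program has no parameters, it would need to be run with the fixed-parameter sampler.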
If this wasn’t occurring for you, it’s worth trying to understand why.
Are you using STANCFLAGS=--O1 by any chance? Removing some bounds checks is a known limitation of that setting that we haven’t had developer cycles to address properly yet.
(This is part of why it is generally a good idea to do model development without --O1, and to turn it on only after you’ve done at least a small run of the unoptimized model.)
As a matter of fact, yes, I do have stanc_options = list("O1")!
A follow-up question: I love the idea of being able to have ragged arrays/vectors. Does this “bug” allow us to have, for example, array[n] vector[m] x; and then set arbitrarily sized values for x[1] = zeros_vector(m + 1); x[2] = zeros_vector(m - 1);? When would I get the storage internal error then? Is there a risk of overwriting other variables? Does it re-allocate memory dynamically when I change the size?
When the sizes don’t match, Stan ends up inheriting “whatever C++ does”, which in this case isn’t even defined in the C++ spec. If the sizes are “close enough” you might observe the behavior you desire, or you might get crashes, or you might get demons flying out of your nose, or…
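If ragged structures are what you are after, a way to get them without relying on mismatched sizes is to store everything in one flat vector and slice it with segment. The sketch below assumes hypothetical names (K, sizes, x_flat, group_sum) that do not come from the model in this thread; it only illustrates the indexing pattern.

```stan
data {
  int<lower=1> K;                // number of ragged groups (hypothetical)
  array[K] int<lower=1> sizes;   // length of each group
}
transformed data {
  int<lower=1> N = sum(sizes);   // total length of the flattened storage
}
generated quantities {
  vector[N] x_flat = zeros_vector(N);  // all groups stored back to back
  vector[K] group_sum;
  {
    int pos = 1;                       // start index of the current group
    for (k in 1:K) {
      // segment(v, i, n) returns the n elements of v starting at index i
      group_sum[k] = sum(segment(x_flat, pos, sizes[k]));
      pos += sizes[k];
    }
  }
}
```

Every container here has a declared size that matches what is assigned to it, so the behavior should not depend on whether bounds checks are enabled.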