Strange error running rstan


#1

Hi,

I’m running a fairly simple model in rstan. The chains run, and no errors are reported during sampling. However, at the end of the run, I get what looks like an R error, and the object that should contain the samples is empty.

The error is:

Elapsed Time: 2811.41 seconds (Warm-up)
              2038.34 seconds (Sampling)
              4849.75 seconds (Total)

Error in FUN(X[[i]], ...) : 
 trying to get slot "mode" from an object (class "try-error") that is not an S4 object 
In addition: Warning message:
In parallel::mclapply(1:chains, FUN = callFun, mc.preschedule = FALSE,  :
 4 function calls resulted in an error.   

Software:
R version 3.4.0
rstan version 2.15.1

Command that caused error:
post <- stan("model.stan", data = datlist, chains = 4, save_warmup = FALSE, refresh = 1, iter = 1000, init_r = 0.1, control = list(adapt_delta = 0.9, max_treedepth = 8))

Is this something I may have caused, or is there a bug in rstan?

Thanks!


#2

Guessing undefined variable in generated quantities


#3

If you want help, posting the model makes it much easier for us.

@bgoodri, why would you think generated quantities? Wouldn’t that cause everything to stop earlier? How can we get an error after sampling anyway? Is there an error message from Stan getting swallowed somewhere? (@mitzimorris just patched the problem of rejections masking prints for 2.16.)


#4

@bgoodri

Ben is absolutely correct. In generated quantities, I had declared two vectors but only assigned values to one of them.

 generated quantities {
   vector[N] y_rep;
   vector[NG] ll;

   for (i in 1:N) {
     y_rep[i] = normal_rng(theta, sigma);
   }
 }

Deleting ll from the generated quantities block immediately solved the problem. My intuition is that rstan saw it declared and went looking for the corresponding samples; since none exist, we get the error.
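For anyone who would rather keep a quantity like ll than delete it, the alternative is to assign every element that is declared, since unassigned elements are left as NaN, which seems to be what rstan choked on here. A minimal sketch, assuming a simple normal model in which theta and sigma are scalar parameters and y is the observed vector. The data and model blocks, and sizing ll by N rather than NG, are assumptions for illustration and not the original program:

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real theta;
  real<lower=0> sigma;
}
model {
  y ~ normal(theta, sigma);
}
generated quantities {
  vector[N] y_rep;
  vector[N] ll;  // hypothetical: one pointwise log likelihood per observation

  for (i in 1:N) {
    y_rep[i] = normal_rng(theta, sigma);
    // every declared element is assigned, so nothing is left as NaN
    ll[i] = normal_lpdf(y[i] | theta, sigma);
  }
}
```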

Thanks Ben!


#5

Hi,

I’m getting the exact same error on a hierarchical model with multiple sets of random effects on a very large data set (~600K observations). I’ve been running the program in batch mode on a Linux server with plenty of memory. I didn’t get the error when running on a smaller fake data set, and it seems to do ok on a random subset of the actual data. If I run a single chain or not in parallel, the program just quits with no error. When I run in parallel, it posts the initial “SAMPLING FOR MODEL” statements and immediately quits with the error.

I’ve attached the stan program in case it is helpful, and I’d appreciate any ideas you have!

p.s. it seems like it could also be related to this google groups post, but I didn’t see a resolution on there: https://groups.google.com/forum/#!topic/stan-users/YjN-km1b_-E

civ_hlm.R (3.4 KB)


#6

You can write that program much more cleanly with vectorization. Rather than

transformed parameters {
  vector[N] mu;
  for (i in 1:N) {
    mu[i] = X[i, ] * beta +
      (intslp[1] + u[1,id[i]] + o[1,occ[i]] + f[1,fy[i]] + l[1,loc[i]] + d[1,disab[i]] + v[1,vet[i]] + intslp_fem[1]*s[i] + r[sex[i],1,race[i]]) +
      (intslp[2] + u[2,id[i]] + o[2,occ[i]] + f[2,fy[i]] + l[2,loc[i]] + d[2,disab[i]] + v[2,vet[i]] + intslp_fem[2]*s[i] + r[sex[i],2,race[i]]) * rt[i] +
      (intslp[3] + u[3,id[i]] + o[3,occ[i]] + f[3,fy[i]] + l[3,loc[i]] + d[3,disab[i]] + v[3,vet[i]] + intslp_fem[3]*s[i] + r[sex[i],3,race[i]]) * t[i];
  }
}

you can write

vector[N] mu = X * beta + intslp[1] + u[1, id] + o[1, occ] + f[1, fy] + ...

But it looks like you have multiple intercepts here, which doesn’t bode well for identifiability. You really just want a single global unconstrained intercept; otherwise everything is only being identified by the priors.

Also, in the random effects, you probably are going to want a non-centered parameterization (it applies to both the mean and the log scale).
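To show what a non-centered parameterization looks like, here is a minimal sketch on a deliberately small hierarchical model (the familiar eight-schools layout), not the program attached above. All names (J, y, sigma, mu, tau, eta, theta) and the priors are illustrative assumptions. The idea is to sample a standard-normal eta and build the group effects from it, instead of sampling theta ~ normal(mu, tau) directly:

```stan
data {
  int<lower=0> J;            // number of groups
  vector[J] y;               // observed group-level estimates
  vector<lower=0>[J] sigma;  // known standard errors
}
parameters {
  real mu;                   // population mean
  real<lower=0> tau;         // population scale
  vector[J] eta;             // standardized group effects
}
transformed parameters {
  // non-centered: theta is a deterministic shift and scale of eta
  vector[J] theta = mu + tau * eta;
}
model {
  mu ~ normal(0, 5);         // placeholder priors, chosen only for illustration
  tau ~ normal(0, 5);
  eta ~ normal(0, 1);        // implies theta ~ normal(mu, tau)
  y ~ normal(theta, sigma);
}
```

In a model with several sets of random effects, the same shift-and-scale construction would be applied to each set separately.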

@bgoodri, is the issue that RStan post-processing can’t handle NaN coming from anywhere? If so, the way to debug is to figure out where the NaNs are being produced, or to wait for us to fix the issue.


#7

Thanks for the tip on vectorization. I’m sure you can tell from my code that I’m new to Stan.

Where do you see the multiple intercepts? I’m intending to have one intercept (intslp[1]), and have the other effects be mean-zero and vary randomly around it. The X matrix doesn’t contain an intercept term, and s is a dummy variable.

Thanks again for the assistance on the error.


#8

Directly in the term I copied. You have intslp[1] + intslp[2] + intslp[3] in the definition of mu. That’s more efficient and cleanly coded as sum(intslp), for what it’s worth.


#9

This doesn’t have a generated quantities block, so it must be a different issue. Can’t tell much without the data.


#10

I see what you’re saying. I’m attempting to have intslp[2] and intslp[3] be coefficients on rt and t, so that the fixed part is intslp[1] + intslp[2] * rt + intslp[3] * t. The intercept and the two coefficients then each have random variation across the group levels.

I ran it last night on a random sample of 1000 observations and the chains ran to completion, but it gave an additional error at the end along with the original one. Here is the complete text:

Error in sendMaster(try(eval(expr, env), silent = TRUE)) :
  long vectors not supported yet: fork.c:376
Calls: stan ... -> lapply -> FUN -> mcparallel -> sendMaster
Error in FUN(X[[i]], ...) :
  trying to get slot "mode" from an object (class "try-error") that is not an S4 object
Calls: stan ... sampling -> sampling -> .local -> sapply -> lapply -> FUN
In addition: Warning message:
In parallel::mclapply(1:chains, FUN = callFun, mc.preschedule = FALSE, :
  4 function calls resulted in an error
Execution halted

Unfortunately, I can’t share the data. But let me know if you think of anything I can try or if the additional information gives you guys any leads. Thanks again,


#11

The “long vectors not supported yet” error was an R limitation that was fixed in R version 3.4.


#12

I found something that could be causing the original issue: I had defined one of my random effects (“o”) to be longer than the number of values in the index. I.e. o was a 3x23 matrix, but there were only 21 distinct occ values and 2 spaces in that matrix were empty. Would this cause the NaN issue you mentioned earlier, Bob? I fixed this, and it is now running without kicking out like it was before.

Good to know about the long vector fix–if I still get that error I can see about upgrading to 3.4 on the server.


#13

Little bit


#14

I hit the same error when I use a large dataset. After trying several changes, I found that the size of the data is what triggers it.
I get the same error as @David_Schulker:

Error in FUN(X[[i]], ...) : 
  trying to get slot "mode" from an object (class "try-error") that is not an S4 object 
In addition: Warning message:
In parallel::mclapply(1:chains, FUN = callFun, mc.preschedule = FALSE,  :
  2 function calls resulted in an error

And sometimes I get:
Segmentation fault (core dumped)

Any tips to get rid of this? I have enough memory, but I wasn’t successful in increasing R’s limits (if there are any).


#15

I would set save_warmup = FALSE if you are not doing so already.


#16

Thanks. I tried that, but it doesn’t help.


#17

One way or another, you have to save less. You can put fewer things into transformed parameters. You can use the pars and include arguments to only save the full output on parameters of interest. You can save nothing (with pars = '') and write the results to csv files (via the sample_file argument). And so on.
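As a sketch of the first suggestion only: a quantity declared in transformed parameters is written to the output on every draw, whereas a local variable in the model block is computed but never saved. The regression below is a hypothetical stand-in (N, K, X, y, alpha, beta, and sigma are assumed names, and priors are omitted for brevity), not anyone's actual program from this thread:

```stan
data {
  int<lower=0> N;
  int<lower=1> K;
  matrix[N, K] X;
  vector[N] y;
}
parameters {
  real alpha;
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  // Local variable: recomputed each iteration but not stored in the draws.
  // Declared in transformed parameters instead, it would add N values per draw.
  vector[N] mu = alpha + X * beta;
  y ~ normal(mu, sigma);
}
```

With several hundred thousand observations, a saved mu of length N would dominate the size of the fitted object, so this change alone can cut what has to be passed back from each chain substantially.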


#18

Thanks @bgoodri! I will try these options and update.