Error about NAs in data

Hi -

I’m fairly new to Stan - apologies if this is a dumb question.
I’m using rstan in R to run some models on an HPC system (because the models take many hours to run), but I’m running into a strange error that I can’t seem to figure out:

Error in FUN(X[[i]], …) :
Stan does not support NA (in transectID) in data
failed to preprocess the data; sampling not done

I’ve double-checked the data, and there are no NAs or missing values. When I run the same model on my laptop, there are no errors and the model seems to run fine. The only R packages I’m using are plyr and rstan. My laptop has R 3.6.1 with rstan 2.19.2 and plyr 1.8.4. The HPC has R 3.5.1 with rstan 2.18.1 and plyr 1.8.4. Could those differences in the versions of R/rstan be the source of my problem, or should I be looking into something else entirely?

Thanks in advance for any advice you might be able to share!

It might be trying to convert a factor called transectID and creating NAs as a result. If so, try doing as.integer(transectID) when you create the list of data to pass to Stan.
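To illustrate the suggestion above, here is a minimal sketch of how a factor can turn into NAs and how as.integer() on the factor avoids that. The transect labels and the names in stan_data are made up for illustration; they are not the original poster's data.

```r
# Hypothetical transect labels stored as a factor (not the poster's data).
transect <- factor(c("T10", "T2", "T10", "T7"))

# Coercing the labels themselves to integers fails: every value becomes NA.
bad <- suppressWarnings(as.integer(as.character(transect)))

# as.integer() on the factor returns the level codes 1..nlevels(transect),
# which is what a Stan index variable usually needs.
transectID <- as.integer(transect)

# Assumed names for the data list passed to Stan.
stan_data <- list(
  N = length(transectID),
  n_transect = nlevels(transect),
  transectID = transectID
)

# Fail early, before sampling, if any NA slipped through.
stopifnot(!anyNA(stan_data$transectID))
```

Note that the level codes follow the sorted order of the labels, so they will not necessarily match any numbers embedded in the labels.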

Thanks so much for the reply - I really appreciate it. Unfortunately, that didn’t seem to fix the issue and I got the same error. Any other thoughts?

Do you have NA as a level of transectID?

What’s confusing is that transectID is just a vector of integers that index elements in another object. There aren’t any NAs in the transectID vector.
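One way to confirm this on the machine where the error occurs is to scan every element of the data list right before it is passed to Stan. This is only a sketch; the list below is a made-up stand-in for the real data, with an NA planted in y so the check has something to find.

```r
# Made-up stand-in for the list passed to Stan; y contains a planted NA.
stan_data <- list(
  N = 4L,
  transectID = c(1L, 2L, 1L, 3L),
  y = c(0.3, 1.2, NA, 0.8)
)

# TRUE for each list element that contains at least one NA.
has_na <- vapply(stan_data, anyNA, logical(1))

# Names of the offending elements; character(0) means the list is clean.
names(has_na)[has_na]
```

Running the same check both on the laptop and inside the HPC job would show whether the data really differ between the two environments.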

Can you share the data and/or the R code for preparing the data for Stan?

Sorry for my delayed response - I’ve been swamped with other projects.

I was able to figure out that the issue isn’t related to the version of R or plyr/rstan being used - I get the same error with newer versions of everything.

Unfortunately I can’t share the data online, but I will try to post a version of the R code I’m using as well as the Stan model as soon as I can.

Thanks for your willingness to help me sort out the issue!

No worries. You can tag me when you post the code.

In case anyone was wondering, or had come across similar problems in their work, I was able to figure out the source of the problem that resulted in these strange NA errors.

The issue was solved by carefully specifying the number of cores needed to run MCMC chains in parallel on the HPC. I had to change a few things about how I submitted jobs to the HPC, and I also set cores=3 (or however many chains I wanted) in the call to stan() instead of using the detectCores() function. This seems like an obvious thing to do in hindsight, but it took a while to realize that this was the source of my issues since the error message was so odd.

Apologies for not posting this sooner, and thanks to those who tried to help with the issue!
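For anyone in the same situation, the fix described above might look roughly like the sketch below. It assumes the cluster uses Slurm and exports SLURM_CPUS_PER_TASK, which the thread does not state; pick_cores, "model.stan", and stan_data are hypothetical names. The idea is to never ask for more cores than the job was allocated, since detectCores() reports the cores on the node rather than the cores granted to the job.

```r
# Hypothetical helper: use no more cores than the scheduler allocated.
# Assumption: a Slurm cluster that sets SLURM_CPUS_PER_TASK; other
# schedulers use different variable names.
pick_cores <- function(n_chains, env_var = "SLURM_CPUS_PER_TASK") {
  allocated <- suppressWarnings(as.integer(Sys.getenv(env_var, unset = NA)))
  # Fall back to a single core when the variable is missing or malformed.
  if (is.na(allocated) || allocated < 1L) allocated <- 1L
  min(as.integer(n_chains), allocated)
}

n_chains <- 3
n_cores <- pick_cores(n_chains)

# The call to Stan then uses the explicit value instead of detectCores().
# "model.stan" and stan_data are placeholders:
# fit <- rstan::stan(file = "model.stan", data = stan_data,
#                    chains = n_chains, cores = n_cores)
```

The job submission script would also need to request a matching number of CPUs (for example with Slurm's --cpus-per-task option) so that the chains actually get the cores they are given here.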
