CMR_Multisite: generalisation of the model to have k sites

Hi Stan and Bayesian community,

I’m trying to adapt the capture-mark-recapture (CMR) model from the book “Bayesian Population Analysis using WinBUGS”.
My aim is to generalise the Stan translation of that model (here: https://github.com/stan-dev/example-models/blob/master/BPA/Ch.09/ms3_multinomlogit.stan ) to k sites instead of just 3.
But when I test my model, I run into two issues:
• parallel chains fail with: “Error in unserialize(socklist[[n]]) : error reading from connection”
• with a single chain, the R session crashes and a message about a transmission problem appears

I tried running my model on two different computers (one on OS X, the other on Ubuntu) with 16 GB of RAM and the latest versions of RStudio, R, and RStan.
I don’t know where the problem is.
Maybe it’s because I have too many indices, or objects with 4 dimensions such as “simplex[n_states] ps[n_states, N, n_occ_minus_1];”.

My script, which simulates the data and contains the model code, is here: https://github.com/ColinBouchard/CMR_Multisite/blob/master/T_Test_Script_ModelFit_Stan_CMR_MultiSites.Rmd

Thanks very much in advance for your help.
Colin

Operating System:
Interface Version:
Output of writeLines(readLines(file.path(Sys.getenv("HOME"), ".R/Makevars"))):
Output of devtools::session_info("rstan"):

This sounds like an installation / setup issue. However, we need the exact error message(s). We also need the actual output of the last two items in the template above: the contents of ~/.R/Makevars and devtools::session_info("rstan").

My other Stan models run without any issues on these two computers.
One error message is: “Error in unserialize(socklist[[n]]) : error reading from connection”, which appears when I try to run parallel chains.
The other message is: “Error occurred during transmission”. It is the same message I get when I kill the job of a running model from the terminal.

I don’t have a Makevars file on either computer.

Session info on my Mac:
 setting  value
 version  R version 3.4.1 (2017-06-30)
 system   x86_64, darwin15.6.0
 ui       RStudio (1.0.143)
 language (EN)
 collate  en_US.UTF-8
 tz       Europe/Paris
 date     2017-08-25

And on the other computer:
Session info ----------------------------------------------------------------------------------------
 setting  value
 version  R version 3.4.1 (2017-06-30)
 system   x86_64, linux-gnu
 ui       RStudio (1.0.153)
 language (EN)
 collate  en_US.UTF-8
 tz       Europe/Paris
 date     2017-08-25

It runs for me if I change line 513 from lpsi[k][j] ~ normal(0, sqrt(1000)); to lpsi[k][j] ~ normal(0, sqrt(1000.0));. It would also work if you declared that standard deviation in the data block and passed it in from R. This issue of passing integer literals to math functions in the C++ library pops up in two out of every three Stan releases, so it is best avoided.

I get the same error (“error occurred during transmission”) even after changing the line as you suggested.

Do the errors occur when it is called from a shell rather than from RStudio? If you google for RStudio "error occurred during transmission" there are a number of hits.

I tried directly in the shell with 2 chains in parallel and 50 iterations because the model runs very slowly…


 Elapsed Time: 445.05 seconds (Warm-up)
               1078.94 seconds (Sampling)
               1523.99 seconds (Total)

 Elapsed Time: 568.986 seconds (Warm-up)
               1122.6 seconds (Sampling)
               1691.58 seconds (Total)

And there was a new error message (probably the error behind the earlier errors in RStudio).
The message is in French, sorry (translation: attempt to obtain the “mode” slot of an object of an elementary class (“NULL”) without slots).

Error in FUN(X[[i]], ...) :
tentative d’obtenir le slot “mode” d’un objet d’une classe élémentaire (“NULL”) sans slots

The relevant translation is:

#: src/main/attrib.c:1302
#, c-format
msgid "trying to get slot \"mode\" from an object of a basic class (\"NULL\") with no slots"

The error message appears once sampling has finished, just after the “Elapsed Time…” message and before the stanfit object is created.

I think you have encountered a bug. For me, the model just dies when it starts to sample. Here is all I could get from GDB:

Thread 1 "R" received signal SIGSEGV, Segmentation fault.
0x00007ffff78eadd3 in RecursiveRelease (object=0x55555afd92f0, object@entry=<error reading variable: DWARF-2 expression error: Loop detected (257).>, list=0x55559c41f8f8) at memory.c:3281
3281    memory.c: No such file or directory.
(gdb) bt
#0  0x00007ffff78eadd3 in RecursiveRelease (object=0x55555afd92f0, object@entry=<error reading variable: DWARF-2 expression error: Loop detected (257).>, list=0x55559c41f8f8) at memory.c:3281
#1  0x00007ffff78eadd8 in RecursiveRelease (object=<error reading variable: DWARF-2 expression error: Loop detected (257).>, list=0x55559c41f8c0) at memory.c:3281
#2  0x00007ffff78eadd8 in RecursiveRelease (object=<error reading variable: DWARF-2 expression error: Loop detected (257).>, list=0x55559c41f888) at memory.c:3281
...
#? 0x00007ffff78eadd8 in RecursiveRelease (object=<error reading variable: DWARF-2 expression error: Loop detected (257).>, list=0x55559c41f888) at memory.c:3281

At first I thought the problem was that your first_capture function didn’t have a return statement outside the loop, which generates a compiler warning, but that is not it. Also odd is that somehow CmdStan runs without segfaulting. Maybe @Bob_Carpenter can see what is going wrong.

It’s not the first_capture function: I ran the equivalent function in R and passed the result in through the data block, but the run still fails, now with a different error message:

Error in FUN(X[[i]], ...) :
trying to get slot “mode” from an object (class “try-error”) that is not an S4 object
In addition: Warning message:
In parallel::mclapply(1:chains, FUN = callFun, mc.preschedule = FALSE, :
2 function calls resulted in an error

I found the solution. After adding pars = c('p', 'phi', 'psi') to the sampling call, the model works well. So the error was caused by the arrays po and ps, which are too big and trigger this error when they are saved.

Thank you so much for your help and time @bgoodri

Are the matrices too big and mostly empty or is this really a huge MR study?

I am simulating data to test whether this model could be used for an upcoming field monitoring programme. In my code I have two objects:
vector[n_states] ps[n_states, N, n_occ_minus_1];
vector[n_states] po[n_states, N, n_occ_minus_1];

I think the problem lies there. For example, 10 sites, 40 individuals and 60 occasions give two arrays, each with dimensions 10 × 10 × 40 × 60.
I’m a little surprised by this limitation, but when I add the argument pars = c('p', 'phi', 'psi') to the sampling call, the model works well, so …

That’s only 240k entries for each one, which is 100% fine for the Stan library. It is possible that rstan does some munging that’s not safe w.r.t. the memory footprint, or that this is one of the issues that pop up every once in a while with R/Rcpp memory management of R’s lists (which rstan uses). Can you put together the current version of the run script and Stan model so that it’s portable? I’m happy to run it to figure out what’s at fault so we can see if we need to file an issue, but I’m not willing to troubleshoot the script.

OTOH, you just want something that works so I suggest trying it in CmdStan and then just import the .csv file… this is why I just use CmdStan…
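
As a rough check of the sizes quoted above, here is a minimal Python sketch (not taken from the original scripts) that counts the entries in one 10 × 10 × 40 × 60 array and estimates the storage needed when both ps and po are saved. The 8 bytes per value and the figure of 100 stored draws (2 chains × 50 iterations, ignoring how warm-up draws are handled) are assumptions for illustration only.

```python
from math import prod

# Dimensions from the thread: n_states x n_states x N x n_occ_minus_1
dims = (10, 10, 40, 60)

entries_per_array = prod(dims)   # values in one array (ps or po)
n_arrays = 2                     # ps and po
bytes_per_value = 8              # assumption: stored as double precision
stored_draws = 2 * 50            # assumption: 2 chains x 50 iterations, all kept

# Storage for both arrays in a single draw, then across all stored draws
bytes_per_draw = entries_per_array * n_arrays * bytes_per_value
total_mb = bytes_per_draw * stored_draws / 1e6

print(f"entries per array: {entries_per_array}")
print(f"bytes per draw (ps + po): {bytes_per_draw}")
print(f"approx. total for stored draws: {total_mb:.0f} MB")
```

Under these assumptions the raw numbers are a few hundred megabytes at most, which is consistent with the point that the array sizes alone should not be a problem for Stan itself; any overhead added by the interface is not modelled here.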

You can check the code on my GitHub: https://github.com/ColinBouchard/CMR_Multisite
It’s an R Markdown file that includes my code to generate the data, so you can test any combination you want (do not hesitate to ask if you have a question about the code).
But if you change the number of sites, and thus the number of states, you also have to change it in the model, replacing the 101 here: //y_i: vector of observations of the individual i if (y_i[k] != 101).

If you delete pars = c('p', 'phi', 'psi') from the sampling call at the end of the file, the model will not work. So you can try it and see! :)

OK, I have never tried CmdStan, so I will check it out! Thanks!
Could you let me know if you find the true source of the “mode” error? I’m very interested in understanding it.

OK, that was straightforward enough to run. How long does it take to complete?

It’s pretty slow, so if it’s just to reproduce the error, I suggest trying 10 sites, 40 individuals and 60 occasions, with just 50 iterations and two chains. But remember to delete pars = c('p', 'phi', 'psi').

Wow, it’s really slow. I’m happy to check out the error but I don’t want to edit a script I’m not familiar with. If you can edit it so it runs in a better defined amount of time and post the full script here I’ll check it out.

For sure!