I have a program that, while syntactically correct, does not actually run in either rstan or CmdStan.
I took inspiration from the Mark-Recapture model in the Stan User Manual.
The specific error I receive is
SAMPLING FOR MODEL 'model' NOW (CHAIN 1).
R(55719,0x7fffa061a380) malloc: *** error for object 0x7f8eb43e7758: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Error in unserialize(socklist[[n]]) : error reading from connection
SAMPLING FOR MODEL 'model' NOW (CHAIN 2).
R(55747,0x7fffa061a380) malloc: *** error for object 0x7fadc8024768: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
The program is below.
data {
int<lower=2> N; // number of sampled specimens
int<lower=1> Hstar; // number of unique haplotypes
int<lower=2> perms; // number of permutations
int HAC_mat[perms]; // array of sampled haplotype labels
}
transformed parameters {
real<lower=N> Nstar;
vector<lower=1,upper=Hstar>[perms] H;
Nstar = (N * Hstar) / mean(H);
}
model {
HAC_mat ~ binomial(Hstar, N / Nstar);
}
I have removed the declaration for H (that was a mistake on my part) and would need instead Nstar to be
Nstar = (N * Hstar) / mean(HAC_mat);
However, the mean function does not work on integer arrays. I could change HAC_mat to be a real array (even though itās not), but then the binomial distribution in the model block would obviously need to be changed.
I havenāt come across any solution in the Userās Guide.
Your transformed parameters block doesnāt contain transformations of parameters but rather transformations of data. So you can move it to the transformed data block and it will run faster.
Not having a parameters block means you need to use the fixed_param version of the sampler. In RStan I think itās stan(..., algorithm="Fixed_param"). In CmdStanR itās $sample(fixed_param=TRUE).
Not having a parameters block or a generated quantities block means youāre not estimating anything or generating anything. What exactly are you trying to do here?
I am basing my model on the Lincoln-Peterson mark-recapture Stan code found in the User Guide.
Essentially, I have an estimate of Nstar from an R package of mine, but I donāt have a good idea as to its standard deviation (or other quantities of interest). Thus, I thought Stan might be a natural way to get this information.
Nstar is computed as
Nstar = (N * Hstar) / H
where N and Hstar are both integers, whereas H is a mean that takes on a real value between 1 and Hstar. H corresponds to the number of recaptures ( R ) in the mark-recapture model, except that H it is not an integer. HAC_mat contains āpermsā number integers {1, ā¦, Hstar}.
I place a binomial prior on the elements of HAC_mat. Next, I take the mean of these values. Finally, I plug all the information into the above equation for Nstar, No other probability distribution seems to work (due to type declarations required).
I would then get all required information from the Stan output.
Yeah exactly. The Stan program is syntactically correct but itās not estimating anything.
Here both N and Nstar are constants so thereās nothing for Stan to learn (Also, if Nstar = (N * Hstar) / mean(HAC_mat) then isnāt N/Nstar just mean(HAC_mat)/Hstar, so you donāt need to define Nstar at all?). Because everything in this Stan program is just fixed to constant values, thereās nothing for Stan to do here.
Your use case may indeed be suitable for Stan but the way the program is currently written you donāt need Stan at all, like @dlakelan said. So I think we might still be misunderstanding what you want to get out of Stan. If youāre willing to keep trying to explain it Iām willing to keep listening and trying to help!
For that you can just use a vector instead of an integer array and it will let you use other distributions. If you need the object as both real numbers and integers then you can pass it in twice, using both the vector type and the int type. But thatās not going to matter here unless thereās something for Stan to estimate.
After some thinking and revisting the mark-recapture example in the Stan User Guide, I realize that I may be thinking too much on developing the most realistic model, while ignoring perhaps the most basic one.
Turns out that everything āworksā if I adopt a poisson likelihood on N, even though itās fully observed.
I suppose Iām struggling then to see how the Lincoln-Peterson mark-recapture Stan model is useful (or even works as expected) despite the number of recaptures ( R ) also being fully specified as data.
Hereās the link to the mark-recapture model in the Stan Userās Guide:
Hereās my Stan program.
data {
int<lower=2> N;
int<lower=1> Hstar;
real<lower=0,upper=Hstar> H;
int K; // for posterior predictive check
}
parameters {
real<lower=N> Nstar;
}
model {
N ~ poisson((Nstar * H) * 1.0 / Hstar);
}
generated quantities {
real Nstar_tilde[K];
for (i in 1:K) {
Nstar_tilde[i] = poisson_rng((Nstar * H) * 1.0 / Hstar); // posterior predictive check
}
}
Note above the inclusion of a generated quantities block to check that the model is doing a good job. As it stands, I simply return the statistic of interest, but should probably plot the individual histograms to see that the model is indeed capturing the same data as seen in the observations.