Hello! I’d love some feedback on this very simple model which I hope to build on. This is my attempt at writing Stan code for one of ecology’s favourite curves, the so-called “Type II Functional response”:

\begin{align}
\text{attack} &\sim \text{Binomial}(\text{Density}, p) \\
p &= \frac{a}{1 + a \times h \times \text{Density}}
\end{align}

For the model to work and make sense, a has to be between 0 and 1, while h must be positive.
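For intuition, with 0 < a < 1 and h > 0 the attack probability always stays in (0, 1), and it declines as density grows. A minimal numeric check (the parameter values here are arbitrary, just for illustration):

```python
# Type II functional response: p = a / (1 + a * h * density)
# Arbitrary illustrative values: 0 < a < 1, h > 0
a, h = 0.3, 0.8

for density in [1, 10, 100, 1000]:
    p = a / (1 + a * h * density)
    # p is largest at low density and shrinks toward 0 as density grows,
    # and it can never exceed a (hence never exceeds 1)
    print(density, round(p, 4))
```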

data {
int<lower=0> N;
int attacks[N];
int densities[N];
}
parameters {
real<lower=0,upper=1> a;
real<lower=0> h;
}
transformed parameters{
real<lower=0, upper = 1> prob_attack[N];
for (n in 1:N) prob_attack[n] = a / (1 + a * (h+1) * densities[n]);
}
model {
a ~ beta(2,60);
h ~ lognormal(0, 1);
attacks ~ binomial(densities, prob_attack);
}

Here are some questions I’m struggling with, and I’d be so grateful for feedback on any or all of them:

Is this the right way to declare the types of the data (i.e. int NAME[size])? The manual uses a vector for predictors but array[size] int<lower=0> NAME for the response.

I thought I would be able to vectorize the expression for prob_attack but it seems impossible. Are scalar parameters recycled along a vector?

I consistently get “prob_attack[sym1__] is nan, but must be greater than or equal to 0.000000” errors. The sampler says these can be ignored if sporadic, but still: what causes this? Is it worth switching to a logit-normal model to avoid it?

Thank you!

edit to fix an error in the equation (in the code): the Density does not appear in the numerator when using a binomial likelihood

Integer data types cannot be stored in vectors, and so must be stored in arrays. int NAME[size] is the right way to declare an integer array in previous Stan versions, including the version of rstan that is currently on CRAN. However, it is deprecated (but still works) in the latest Stan versions, and is replaced by array[size] int NAME.

This should definitely be vectorizable. What goes wrong when you try, and what Stan code are you trying with?

I don’t immediately see where this error could come from. If some of your densities are truly huge, this could conceivably be the problem? Like if densities/(1 + a * (h + 1) * densities) turns into infinity divided by infinity. But I assume you don’t have densities that are indistinguishable from infinity?
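For what it’s worth, the inf-divided-by-inf failure mode is easy to reproduce in plain floating point (Python sketch, using the expression quoted above with arbitrary parameter values):

```python
a, h = 0.05, 0.5

# A density that has overflowed to infinity (1e308 * 10 exceeds
# the largest representable double)
d = 1e308 * 10
print(d)  # inf

# inf / inf is nan -- exactly the kind of value that trips
# Stan's lower=0 constraint check on prob_attack
p = d / (1 + a * (h + 1) * d)
print(p)  # nan
```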

Here is an updated version of the model, followed by code to fit it with dummy data. To vectorize the calculation I had to:

use a for-loop in transformed data to duplicate the densities as real numbers, and ALSO to store them in a vector, not an array. This was necessary (seemingly?) so that the * operator would work

use ./ rather than simply / for division. Otherwise I would get the error:

Ill-typed arguments supplied to infix operator /. Available signatures:
...
Instead supplied arguments of incompatible type: real, vector.

Should I be using another structure? Matrix perhaps?

data {
int<lower=0> N;
array[N] int<lower=0> attacks;
array[N] int<lower=0> densities;
}
transformed data{
vector[N] densreal;
for (n in 1:N) densreal[n] = densities[n] * 1.0;
}
parameters {
real<lower=0,upper=1> a;
real<lower=0> h;
}
transformed parameters{
vector<lower=0, upper = 1>[N] prob_attack;
prob_attack = a ./ (1 + a * h * densreal);
}
model {
a ~ beta(2,6);
h ~ lognormal(0, 1);
attacks ~ binomial(densities, prob_attack);
}

and here is R code for making test data (assuming the model is saved in a “simple_fr.stan” file)
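The data-generating process itself is simple; as a sketch (shown here in Python with arbitrary “true” values for a and h — the R code would do the equivalent with rbinom):

```python
import random

random.seed(42)

# "True" values, chosen arbitrarily for illustration
a_true, h_true = 0.1, 0.4

densities = [5, 10, 20, 40, 80, 160] * 5  # N = 30 trials
attacks = []
for d in densities:
    p = a_true / (1 + a_true * h_true * d)
    # Binomial(d, p): each of the d prey is attacked with probability p
    attacks.append(sum(random.random() < p for _ in range(d)))
```

Passing N, attacks, and densities in the data list is then all the model needs.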

Thanks @WardBrian! That works perfectly. Would you still create a new vector to store it in, as I did with densreal, or do it live in the function?
I find it works with prob_attack = a ./ (1 + a * h * to_vector(densities));
But I wonder if there is a best practice here.

on that topic: why is ./ necessary, when .* is not?

If you’re able to do it ahead of time in a place like transformed data, that has minor performance advantages over doing it in a place where it would be evaluated over and over.

The inconsistency between ./ and .* is a known issue. Generally the . prefix means elementwise, but for scalars everything is effectively elementwise anyway, so ideally you wouldn’t be required to write it; at the moment we have ended up with an inconsistent choice.
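As a point of comparison, NumPy broadcasts a scalar through both * and /, which is the behaviour Stan’s elementwise ./ spells out explicitly (sketch with arbitrary values):

```python
import numpy as np

a, h = 0.05, 0.5
dens = np.array([5.0, 20.0, 80.0])

# scalar * vector and scalar / vector both broadcast elementwise here;
# in Stan, dividing a scalar by a vector needs the explicit ./
p = a / (1 + a * h * dens)
print(p)
```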

Sorry to keep piling on here! The following model samples beautifully and doesn’t produce those “nan, but must be greater than or equal to 0.000000” errors. Is using inv like this therefore the right thing to do?

data {
int<lower=0> N;
array[N] int<lower=0> attacks;
array[N] int<lower=0> densities;
}
parameters {
real<lower=0,upper=1> a;
real<lower=0> h;
}
transformed parameters{
vector<lower=0, upper = 1>[N] prob_attack;
prob_attack = a * inv(1 + a * h * to_vector(densities));
}
model {
a ~ beta(2,6);
h ~ lognormal(0, 1);
attacks ~ binomial(densities, prob_attack);
}