Expanding population estimate from single site to multiple. Adding multiple sites gives error 'Ill-typed arguments supplied to infix operator /."

I am very new to Stan, and hopefully this is a simple newb error. As part of a larger hierarchical model, I am estimating population size (PE) at multiple sites. I have no way to track individual captures, so this is based on a multiple census mark recapture Schnabel estimate:
\hat{PE}=\frac{\sum{C_t*M_t}}{R_t}
where C_t is the number of captures during sampling event t, M_t is the number of marked individuals at large before sampling event t, and R_t is the number of recaptures in sampling event t.

I can successfully run population estimates for one lake at a time with this model:

data {
  // number of sampling events
  int<lower=0> T;
  // vector of catch at time t * marks out at time t
  array[T] int CtMt;
  // recaps at each sampling event t
  array[T] int Rt;
}

// The parameters accepted by the model. 
parameters {
  // PE can't be an integer because Stan, so real parameter with lower limit of sum of recaps
  real<lower=(sum(Rt))> PE;
}

// Rearranged Schnaebel as Poisson process
model {
  sum(Rt)~poisson(sum(CtMt)/PE);

  PE~gamma(0.001, 0.001);

}

When I try to expand to estimating PEs for multiple sites, the model looks like this and throws an error:


data {
  // total number of lakes
  int<lower=1> L;

  // array (because integer) of catch at sampling event t * marks at large at
  //time t
  // data now has one row per lake instead of one row per sampling event--
  //condensed into sum CtMt and sumRt
  array[L] int sumCtMt;
  // recaps at each sampling event t
  array[L] int sumRt;
  
}

parameters {
  // I want one estimate of PE for each lake L
  // where I'm stuck: How do I index lakes with multiple observations? 
  //Maybe just calculate it first--one row per lake with sum(CtMt), sum(Rt)
  // int parameters not allowed
  real<lower=sumRt[L]> PE[L];
}

// The model to be estimated. 
model {
  // both of these versions (looped and vectorized) get same 'ill-typed arguments' error
  
// for(i in 1:L){
//    PE[L]~gamma(0.001, 0.001);
//    sumRt[L]~poisson(to_vector(sumCtMt[L])/PE[L]);
//  }
    PE~gamma(0.0001, 0.0001);
    sumRt~poisson(to_vector(sumCtMt)/PE);
}

The error is “Ill-typed arguments supplied to infix operator /.”
I know that I can’t divide an array by a real, but I thought that to_vector would help with that. Does anyone know what’s going wrong here? I appreciate any help.

Hi

PE seems to be an array as well. For element wise operations use ./ operator. / is for matrix/vector division (i.e. similar to computing inverses). I am unsure if stan support element-wise array operations through the use of the dot operator but you can try this before converting to vector. An alternative would be to declare you current arrays as vector instead, hence eliminating the need to use the to_vector function.

1 Like

The relevant parts of your code are:

array[L] int sumCtMt;
...
real<lower=sumRt[L]> PE[L];
...
sumRt~poisson(to_vector(sumCtMt)/PE);

The problem here is that PE is also declared as an array. If you want to give it the elementwise product, that’d be:

to_vector(sumCtMt) ./ to_vector(PE)

where you see the ./ for elementwise division. You also mixed the old-style array syntax in declaring PE with the new syntax in defining the data variables. The one you used for PE is no longer supported and won’t compile in our latest versions.

A better solution is to just declare them as vectors to begin with:

vector[L] sumCtMt;
...
vector<lower=sumRt[L]>[L] PE;
...
sumRt ~ poisson(sumCtMt ./ PE);

Declaring sumCtMt will cast the integers to real numbers automatically so that everything’s the right type for the division.

Do you really want the lower bound to be sumRt[L], which is the last element of the sumRt array? Or do you want the sums to all be lower bounds/.

Also, why are you lower-bounding PE? It’s a parameter. If the data is consistent with a value near the lower bound, you’ll run into both computational and statistical issues.

1 Like

Hi Bob,
Thank you for your reply! I didn’t have a good reason for specifying a lower bound, I just thought that was what you were supposed to do. Thanks for setting me straight.

I made the changes you suggested. I think I am now running into a different problem, where poisson() wants an array of integers as its first argument and won’t accept a vector. I thought there might be a way to divide sumCtMt and PE outside of poisson() to skip around that issue, but I haven’t found a way to make that work.

Sorry if this is a frustratingly dumb series of questions.

data {
  // total number of lakes
  int<lower=1> L;
  // data now has one row per lake instead of one row per sampling event--
  //condensed into sum CtMt and sumRt
  vector[L] sumCtMt;
  // recaps at each sampling event t
  vector[L] sumRt;
  
}


parameters {
  // I want one estimate of PE for each lake L
  vector[L]  PE;

}

// The model to be estimated. 
model {
    PE~gamma(0.001, 0.001);
    sumRt~poisson(sumCtMt ./ PE);

}

The poisson distribution expects the random variable, sumRt in this case, to be an interger. The declaration of:

should hence be kept as array[L] int sumRt as previously (entries in vectors in stan are by default real data types, not integers)

1 Like

I appreciate the help from everyone in the thread! Here’s what ultimately ended up working:


data {
  // total number of lakes
  int<lower=1> L;
  // data now has one row per lake instead of one row per sampling event--
  //condensed into sum CtMt and sumRt
  vector[L] sumCtMt;
  // recaps at each sampling event t
 int sumRt[L];
}


parameters {
  // I want one estimate of PE for each lake L
  vector<lower=0>[L]  PE;

}

// The model to be estimated. 
model {
    PE~gamma(0.001, 0.001);
    sumRt~poisson(sumCtMt ./ PE);

}


I think I did need to keep the lower bound on PE in this case. That value really should be an integer (as it’s an estimate of population), but Stan won’t estimate integer parameters. In this case though, I was never going to have a PE value close to that lower bound because all of the populations have at least hundreds of individuals.