Improving state-space fishery population model

Glad to hear the blog post helps! I should update that thing someday. @vianeylb has been leading the charge on setting up a Stan ecology community that hopefully can host some examples along these lines. Creating a logistic population model example for that is on my to-do list for someday…

Along with the suggestions you put up, a few things that help are:

  1. Giving the model a gradient to push against when the population starts to crash, rather than just saying biomass must be greater than or equal to 0 (see the first sketch after this list).

  2. Estimating a vector of fishing mortality rates and using those to fit to the catch data, rather than treating the catch data as constants in the model (second sketch after this list). This ensures that biomass never goes below 0 (assuming F stays between 0 and 1). It does mean estimating a lot of fishing mortality rates, which can slow things down, and it requires specifying a prior on observation error for the catches and a prior on the fishing mortality rates themselves. If you’re including process error and observation error on the index of abundance, you’ll probably need to just fix the observation error of the catches at some low level. If you’re estimating process error, you also need to give some thought to the ratio of the process error to the variance of the fishing mortality rates, since letting both be totally free will cause problems. You can, for example, allow process error to be pretty large but constrain the fishing mortality rates to a random-walk-style process with minimal variation year to year.

  3. Considering prior predictive tuning (third sketch after this list). This has helped us a lot for batch running of surplus production models. Suppose you set some generic, wide lognormal prior on the carrying capacity of the population, k. Since a bunch of those values will be close to zero, it’s likely that a large part of the prior parameter space will result in the population crashing, making life difficult for the model. What you can do is generate a bunch of draws from the prior conditional on the model (think of it as the post-model-pre-data distribution), where the model is the functional form of the population dynamics (e.g. logistic growth) combined with the catch history, which is treated as a constant (i.e. it’s not in the likelihood). From there, you can adjust your priors toward values that don’t result in the population crashing, which is the one thing we know didn’t happen (assuming the population still exists and never stopped existing). For example, you can use this process to find growth rate r and carrying capacity k priors that, generally speaking, don’t crash the population. This won’t get rid of crashes entirely; there may still be combinations of r, k, and process error values that produce population levels smaller than the catches, but it will help the model spend more time in feasible parts of the parameter space. To take the example from the Stan documentation linked above: just as it wouldn’t make sense to have a prior on soccer scores that has the model spending a bunch of time around values like 1e6, we want to avoid priors in the population model that spend a bunch of time with biomass < 0.
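
For (1), here’s a bare-bones sketch of what I mean by giving the model a gradient to push against. It’s a surplus production skeleton with observation error only (no process error); the priors, the 1e-3 biomass floor, the penalty weight, and starting the stock at carrying capacity are all placeholder assumptions, and the names (`catches`, `cpue`, `pen`) are made up for illustration. The point is the soft penalty added to `target` instead of a hard `b >= 0` constraint:

```stan
data {
  int<lower=2> T;                    // number of years
  vector<lower=0>[T] catches;        // observed catches, treated as known removals
  vector<lower=0>[T] cpue;           // index of abundance
}
parameters {
  real<lower=0> r;                   // intrinsic growth rate
  real<lower=0> k;                   // carrying capacity
  real<lower=0> q;                   // catchability
  real<lower=0> sigma_obs;           // observation error on the index
}
transformed parameters {
  vector[T] b;                             // predicted biomass
  vector[T] pen = rep_vector(0, T);        // penalty for dipping below the floor
  b[1] = k;                                // placeholder: stock starts at carrying capacity
  for (t in 2:T) {
    real b_next = b[t - 1] + r * b[t - 1] * (1 - b[t - 1] / k) - catches[t - 1];
    if (b_next < 1e-3) {
      pen[t] = square(1e-3 - b_next);      // how far the trajectory overshot the floor
      b_next = 1e-3;                       // floor the state so the likelihood stays defined
    }
    b[t] = b_next;
  }
}
model {
  // placeholder priors
  r ~ lognormal(log(0.2), 0.5);
  k ~ lognormal(log(100), 1);
  q ~ lognormal(0, 1);
  sigma_obs ~ normal(0, 0.5);
  // soft penalty: the deeper the crash, the larger the penalty, so the sampler
  // gets a smooth gradient pushing it away from crashing trajectories instead
  // of slamming into a hard constraint (the 1000 weight is arbitrary)
  target += -1000 * sum(pen);
  cpue ~ lognormal(log(q * b), sigma_obs);
}
```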
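
For (2), a sketch of the fishing mortality parameterization under the same placeholder assumptions (no process error, made-up priors, stock starting at carrying capacity). `sigma_catch` comes in as data so the catch observation error can be fixed at some low level, and the random-walk prior on `f` is the kind of year-to-year constraint I mean above:

```stan
data {
  int<lower=2> T;                    // number of years
  vector<lower=0>[T] catch_obs;      // observed catches, now part of the likelihood
  vector<lower=0>[T] cpue;           // index of abundance
  real<lower=0> sigma_catch;         // catch observation error, fixed at a low level
}
parameters {
  real<lower=0> r;
  real<lower=0> k;
  real<lower=0> q;
  real<lower=0> sigma_obs;
  vector<lower=0, upper=1>[T] f;     // fishing mortality rates
  real<lower=0> sigma_f;             // year-to-year variation in f
}
transformed parameters {
  vector[T] b;                       // predicted biomass
  vector[T] catch_hat;               // predicted catches
  b[1] = k;                          // placeholder: stock starts at carrying capacity
  catch_hat[1] = f[1] * b[1];
  for (t in 2:T) {
    // removals are f * b rather than a fixed catch, so b stays positive
    // as long as f is between 0 and 1 (and r isn't huge)
    b[t] = b[t - 1] + r * b[t - 1] * (1 - b[t - 1] / k) - f[t - 1] * b[t - 1];
    catch_hat[t] = f[t] * b[t];
  }
}
model {
  // placeholder priors
  r ~ lognormal(log(0.2), 0.5);
  k ~ lognormal(log(100), 1);
  q ~ lognormal(0, 1);
  sigma_obs ~ normal(0, 0.5);
  f[1] ~ beta(1, 4);
  sigma_f ~ normal(0, 0.1);
  // random-walk style prior keeps f from changing too much year to year
  f[2:T] ~ normal(f[1:(T - 1)], sigma_f);
  // catches now enter the likelihood instead of being constants in the dynamics
  catch_obs ~ lognormal(log(catch_hat), sigma_catch);
  cpue ~ lognormal(log(q * b), sigma_obs);
}
```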
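
And for (3), one way to do the prior predictive tuning is a small Stan program that lives entirely in `generated quantities`: draw r and k from candidate priors (placeholder values again), push them through the logistic dynamics conditional on the catch history, and flag draws that crash:

```stan
data {
  int<lower=2> T;
  vector<lower=0>[T] catches;        // observed catch history, treated as constant
}
generated quantities {
  // candidate priors to tune (placeholder values)
  real r = lognormal_rng(log(0.2), 0.5);
  real k = lognormal_rng(log(100), 1);
  vector[T] b_sim;                   // prior predictive biomass trajectory
  int crashed = 0;
  b_sim[1] = k;
  for (t in 2:T) {
    b_sim[t] = b_sim[t - 1] + r * b_sim[t - 1] * (1 - b_sim[t - 1] / k)
               - catches[t - 1];
    if (b_sim[t] <= 0) {
      crashed = 1;                   // this (r, k) pair can't support the observed catches
      b_sim[t] = 1e-6;               // keep the rest of the trajectory defined
    }
  }
}
```

Run it with Stan’s fixed_param algorithm, look at the fraction of draws with `crashed == 1` and at the spread of `b_sim`, and then shift or tighten the r and k priors until most draws stay comfortably away from zero; that adjustment is the tuning part.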

But yeah, I agree we need more worked examples for ecological problems where broad default priors and model setups don’t work too well!