Incorporating survey-derived abundance indices (with posterior uncertainty) into a Bayesian SSM

Hi all,

I have a conceptual question about how to incorporate fisheries-independent survey indices of abundance (with full posterior uncertainty) as response data in a state-space model (SSM).


Here is the premise: each year, we conduct a fisheries-independent survey that samples ~1,500 stations using a stratified random design. The survey occurs in winter when the target species (crabs) enter torpor due to cold water temperatures, allowing us to use a relatively simple observation model to estimate annual indices of abundance.

I can provide the specific survey model if that’s helpful, but conceptually, the survey produces posterior draws of annual abundance indices for two stages (juveniles and adults) across ~30 years.


I’d like to use these survey-derived indices as the observation component in a stage-based population dynamics SSM (two stages: juveniles and adults).

The “ideal” setup, as I understand it, would be to fit a single, fully hierarchical Bayesian model that combines:

  1. the survey observation models, and
  2. the stage-structured SSM process model.

That way, uncertainty from the survey models would naturally propagate through to the population process.

However, fitting such a joint model would require embedding both survey models (each with ≈50,000 total observations) inside the SSM, which seems computationally infeasible.


Instead, I’m considering a two-stage approach:

  1. Fit each survey model separately to obtain posterior draws of the annual indices.
  2. Fit a parametric distribution (e.g., lognormal) to those posterior draws for each survey × year × stage combination.
  3. Use those fitted distributions as the SSM observation likelihood.

This would dramatically reduce computation time, while still allowing uncertainty in the indices to be passed to the population model.

Does this approach seem reasonable? Any help/guidance would be appreciated!

-Challen

2 Likes

This is valid as long as two conditions hold:

  1. The full joint model must be correctly specified.
  2. The parametric approximations to the observation-model posteriors are sufficiently good approximations to the likelihood function.

You can think of the joint model as a hierarchical one, in which the likelihood function conditional on the latent true abundance is given by the observation model. What you are doing is substituting your approximate posterior for that likelihood function. To the extent that the posterior and the likelihood function are the same, this is literally the same model.

Edit: I said it’s literally the same model, and this true for most intents and purposes. Note that we’ve dropped a bunch of constant terms in the likelihood by replacing the (unnormalized) likelihood terms with (normalized) posteriors. For fitting purposes this doesn’t matter. For model comparison purposes some care is warranted, most especially if comparing the full joint model to a two-stage model. Just make sure to formulate the likelihoods based on the full joint model for both models!

1 Like

Thanks for the fast response @jsocolar.

You can think of the joint model as a hierarchical one, in which the likelihood function conditional on the latent true abundance is given by the observation model. What you are doing is substituting your approximate posterior for that likelihood function. To the extent that the posterior and the likelihood function are the same, this is literally the same model.

This was also my interpretation. I was just hoping it was correct, as I hadn’t seen something like this been done before in literature. The parametric approximations appear to match the posterior predictive distributions for the annual indices quite well. Thank you for confirming!