Help with unsual state space model


Hello, I am new to stan and state space models and have a conceptual and Stan modelling question.

Model Setting

I am trying to measure the decline of immature shrimp individuals in growth tanks so no reproduction just survival. At day 3 and 5 of each week I pull all live shrimp with a net and record their total mass. Then I calculate the number(N) of shrimp as:


N_t=\frac{(Shrimp ~total~Biomass)_t}{(mean~weight~shrimp)_t}


where mean weight of a shrimp is the average weight of a sample of n shrimps so the number of shrimp becomes a continuous variable. The process part of the model would be:


N_{t+\Delta}=N_{t}\theta_{t}^{\Delta}



where \theta_{t} is the daily survival bounded to [0,1]


\theta_{t} \sim Beta(\alpha,\beta)


and the measurement error:


Y_{t+\Delta} \sim N(N_{t+\Delta},\sigma^2)


Generated R Data

 set.seed(1000) 
 days=90
  phi=0.98;lambda=10
  alpha = lambda*phi; 
  beta = lambda*(1-phi)
  theta=rbeta(days-1,alpha,beta)
  N=rep(NA,90)
  N[1]=1000
  for(i in 1:89){
    N[i+1]=N[i]*theta[i]
  }
  y=N+rnorm(N,0,30)
  day_sampled=c(1,sort(c(seq(3,84,7),seq(7,84,7))))
  dt=diff(day_sampled) 
  y=y[day_sampled]

Stan Model

Have not thought about the prior but the model for one tank to be extended to several would be:


data { 
      int<lower=0> N; 
      real y[N];
      int NO;
      real dt[N-1];
    } 
    parameters { 
    real<lower=0,upper=1> phi; 
    real<lower=0, upper=100> lambda; // maybe model lambda as function of time ellapsed
    real <lower=0,upper=200> sigma_obs;
    vector<lower=0,upper=1>[NO] theta;
    real<lower=0,upper=4000> N_est1; // Initial N
  } 
 transformed parameters { 
    vector[N] N_est;
    N_est[1]=N_est1;
      for (k in 1:NO) {
        N_est[k+1]=N_est[k]*theta[k]^dt[k];
      }
  } 
  model { 
    theta ~ beta(lambda*phi,lambda*(1-phi));
    y ~ normal(N_est,sigma_obs);
  }  

 stsp_beta<-
    rstan::stan(model_code=model_beta,
                data = list(N=25,dt=dt,y=y,NO=24),
                chains = 4,cores=4,iter=30000,verbose = TRUE)

Is this model sensible from a conceptual point of view? and correctly specified in stan(wo looking at the priors)?

It looks it can estimate phi and sigma_obs, however lambda is overestimated, not sure if this is a identifiability problem or a syntax issue.

I think in addition the fact that there are different time intervals should impact
the process error, not sure if this could be modelled by modelling the scale parameter
as function of time elapsed.

Thank you for any help.

2 Likes

In the parameter block, I would not put upper limits on parameters that are not bounded naturally by the parameter space. For instance, I would change:

real<lower=0, upper=0> lambda;

to

real<lower=0> lambda;

Removing the upper bounds likely means you need to place a prior on those parameters. There’s some discussion here: Stan pedantic mode « Statistical Modeling, Causal Inference, and Social Science

Are the values of y discrete or continuous? I’ve tried to fit continuous models to somewhat discrete data (recorded at 0.5 intervals) and have had loads of issues fitting the models. Just a heads up.

There’s an overview of state-space models with Stan here: 9.1 Overview | Applied Time Series Analysis for Fisheries and Environmental Sciences I agree with the process error part, just not exactly sure what to do in this moment. Will try to come back once I’ve looked through a few things. Also not exactly sure why lambda might be overestimated. Perhaps a prior distribution that limits the amount of probability mass of say lambda > 50 or something could help?

3 Likes

Thank you very much for the help vianeylb. The y’s are continuous, the problem is that the measurement error causes sometimes to have Nt+1>Nt and hence theta>1. The time intervals on the other hand are discrete more less an approximation. I’ve tried a gamma(1,0.1) prior for lambda to have most of the mass to the left of 50, and I used simulations at dt=1 as I was concerned that not addressing the lambda parameter for longer dt could impact the parameter recovery. Given these conditions, I get back a lambda(bar)=18.24 vs. 10. Look forward to any other suggestions you may have.

1 Like

P.D. Actually given dt=1, it seems possible to recover lambda around 10, 18.24 (above) seems to be just by chance of the simulated values. However, given larger dt’s the retrieved lambda seems bigger.

1 Like

Hi vianeylb,

I’m really interested in the link you supplied on time series analysis for fisheries, but the link is broken. Is there another resource for fisheries SSMs in Stan?

Looks like it got move here: 13.5 Univariate state-space models | Applied Time Series Analysis for Fisheries and Environmental Sciences

State-space models is a really broad term when it comes to fisheries. I’ve seen mark-recapture models (i.e. CJS), integrated population models, and MARSS (which can also be described as dynamic linear models) all described as state-space models. But fitting each of these in Stan is quite different.

I have written a few state-space models in Stan. If you have a more specific model structure in mind, I might be able to direct you to some examples.

1 Like

Hey Dalton,

This is my problem. If you can assist, I would be immensely appreciative. All of my attempts to correct have failed, and I’m not sure how else to re-parameterize my model to capture the true values.

Thanks again,

Challen

https://atsa-es.github.io/atsa-labs/

If I remember correctly this is based in a time series and fisheries course in Washington state university

And if you search on YouTube you can find the lectures

Hey Fred,

Thank you so much. Im looking now.