Hello, I am new to stan and state space models and have a conceptual and Stan modelling question.
Model Setting
I am trying to measure the decline of immature shrimp individuals in growth tanks so no reproduction just survival. At day 3 and 5 of each week I pull all live shrimp with a net and record their total mass. Then I calculate the number(N) of shrimp as:
N_t=\frac{(Shrimp ~total~Biomass)_t}{(mean~weight~shrimp)_t}
where mean weight of a shrimp is the average weight of a sample of n shrimps so the number of shrimp becomes a continuous variable. The process part of the model would be:
N_{t+\Delta}=N_{t}\theta_{t}^{\Delta}
where \theta_{t} is the daily survival bounded to [0,1]
\theta_{t} \sim Beta(\alpha,\beta)
and the measurement error:
Y_{t+\Delta} \sim N(N_{t+\Delta},\sigma^2)
Generated R Data
set.seed(1000)
days=90
phi=0.98;lambda=10
alpha = lambda*phi;
beta = lambda*(1-phi)
theta=rbeta(days-1,alpha,beta)
N=rep(NA,90)
N[1]=1000
for(i in 1:89){
N[i+1]=N[i]*theta[i]
}
y=N+rnorm(N,0,30)
day_sampled=c(1,sort(c(seq(3,84,7),seq(7,84,7))))
dt=diff(day_sampled)
y=y[day_sampled]
Stan Model
Have not thought about the prior but the model for one tank to be extended to several would be:
data {
int<lower=0> N;
real y[N];
int NO;
real dt[N-1];
}
parameters {
real<lower=0,upper=1> phi;
real<lower=0, upper=100> lambda; // maybe model lambda as function of time ellapsed
real <lower=0,upper=200> sigma_obs;
vector<lower=0,upper=1>[NO] theta;
real<lower=0,upper=4000> N_est1; // Initial N
}
transformed parameters {
vector[N] N_est;
N_est[1]=N_est1;
for (k in 1:NO) {
N_est[k+1]=N_est[k]*theta[k]^dt[k];
}
}
model {
theta ~ beta(lambda*phi,lambda*(1-phi));
y ~ normal(N_est,sigma_obs);
}
stsp_beta<-
rstan::stan(model_code=model_beta,
data = list(N=25,dt=dt,y=y,NO=24),
chains = 4,cores=4,iter=30000,verbose = TRUE)
Is this model sensible from a conceptual point of view? and correctly specified in stan(wo looking at the priors)?
It looks it can estimate phi and sigma_obs, however lambda is overestimated, not sure if this is a identifiability problem or a syntax issue.
I think in addition the fact that there are different time intervals should impact
the process error, not sure if this could be modelled by modelling the scale parameter
as function of time elapsed.
Thank you for any help.