I am trying to figure out how to code up the log_lik variable in the “generated quantities” block for use with the loo package, so that I can choose between different model structures. The model I have is a state-space model in which the states follow a random walk, with a Gaussian observation term on top of it.
Now, I would like to use the PSIS-LOO approach to choose between different versions of this model (specifically, how the observation process is modelled). But I’m very unsure what the correct way to write the generated quantities block and the log_lik term is. Should it include just the observation component of the likelihood, or the random walk component as well? (I’m not sure how that would be coded up, though.)
More specifically, there are two elements to the model. Firstly, the underlying “state” of the system (S) is represented by a first-order random walk
S_{i+1} = S_i + \epsilon_i
\epsilon_i \sim N(0, \sigma_{RW})
Secondly, there is the observation layer, where the observations Y are linearly related to the state S via a Gaussian observation model
\hat{Y}_i = \alpha S_i
Y_i \sim N(\hat{Y}_i, \sigma_{ObsErr})
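To make the two layers concrete, here is a minimal simulation sketch of the model as written above. It is in plain Python rather than Stan, purely for illustration. The function name, the starting state s0, the seed and the parameter values are all made up, and it assumes one observation per year and that the second argument of N(., .) is a standard deviation (as in Stan's normal distribution).

```python
import random


def simulate_state_space(n_years, alpha, sigma_rw, sigma_obs, s0=0.0, seed=1):
    """Simulate states S (first-order random walk) and observations Y.

    Assumptions for illustration only: one observation per year,
    a fixed starting state s0, and sigmas given as standard deviations.
    """
    rng = random.Random(seed)
    states = [s0]
    for _ in range(1, n_years):
        # S_{i+1} = S_i + epsilon_i, with epsilon_i ~ N(0, sigma_rw)
        states.append(states[-1] + rng.gauss(0.0, sigma_rw))
    # Y_i ~ N(alpha * S_i, sigma_obs)
    observations = [rng.gauss(alpha * s, sigma_obs) for s in states]
    return states, observations


# Hypothetical parameter values, chosen only to produce an example series
states, observations = simulate_state_space(
    n_years=20, alpha=2.0, sigma_rw=0.5, sigma_obs=0.3
)
```

The random walk only links consecutive states to each other. Each observation depends on the state for its own year alone.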
The Stan code I am using is:
model {
  // Observation component of likelihood
  target += normal_lpdf(Y | Y_hat, sigmaObsErr);
  // Random walk component
  for (i in 2:n_years) {
    target += normal_lpdf(S[i] | S[i - 1], sigmaRW);
  }
}
I’m guessing that the log_lik that LOO needs should only contain the observation term, so that the generated quantities block then becomes:
generated quantities {
  vector[n_obs] log_lik;
  for (j in 1:n_obs) {
    log_lik[j] = normal_lpdf(Y[j] | Y_hat[j], sigmaObsErr);
  }
}
But it’s only a guess!
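To spell out what that loop computes for a single posterior draw, here is a small plain-Python sketch (not Stan). The helper names and the numbers are hypothetical. It only mirrors the arithmetic of the observation term, one log density per data point, and says nothing about whether that is the right quantity to hand to LOO.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of a normal with mean mu and standard deviation sigma at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def pointwise_log_lik(y, y_hat, sigma_obs):
    """One log-density value per observation, mirroring the log_lik loop."""
    return [normal_lpdf(y_j, mu_j, sigma_obs) for y_j, mu_j in zip(y, y_hat)]


# Made-up values standing in for Y, Y_hat and sigmaObsErr in one posterior draw
y = [1.0, 2.5, 2.0]
y_hat = [1.2, 2.0, 2.1]
log_lik = pointwise_log_lik(y, y_hat, sigma_obs=0.5)

# Summing the pointwise terms gives the same value as the vectorised
# observation statement in the model block for this draw
total = sum(log_lik)
```

Across all posterior draws this gives a draws-by-n_obs matrix of pointwise values, while the random walk terms do not appear anywhere in it.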