LOO model selection for a state-space model

I am trying to figure out how to code up the log_lik variable in the “generated quantities” block for use with the loo package, and therefore to choose between different model structures. The model that I have is a state-space model, where the states are represented by a random walk, with a Gaussian observation term on top of it.

Now, I would like to use the PSIS-LOO approach to choose between different versions of this model (specifically, how the observation process is modelled). But I’m very unsure what the correct way to write the generated quantities and the log_lik term is. Should it include just the observation component of the likelihood, or the random walk component as well (I’m not sure how that would be coded up, though)?

More specifically, there are two elements to the model. Firstly, the underlying “state” of the system (S) is represented by a first-order random walk

S_{i+1} = S_{i} + \epsilon

\epsilon \sim N(0, \sigma_{RW})

Secondly, there is the observation layer, where the observations Y are linearly related to the state S via a Gaussian observation model:

\hat{Y}_i = \alpha S_i

Y_i \sim N(\hat{Y}_i, \sigma_{ObsErr})

The Stan code I am using is:

model {
  //Observation component of likelihood
  target += normal_lpdf(Y | Y_hat , sigmaObsErr);
  
  //Random walk component
  for(i in 2:n_years){
    target += normal_lpdf(S[i] | S[i-1] ,sigmaRW);
  }
}
 

I’m guessing that the log_lik that LOO needs should only contain the observation term, so that the generated quantities block then becomes:

generated quantities {
  vector[n_obs] log_lik;
  for(j in 1:n_obs) {
    log_lik[j] = normal_lpdf(Y[j] | Y_hat[j] , sigmaObsErr);
  }
}

But it’s only a guess!

Correct.
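
For reference, a minimal sketch of how that log_lik might then be used with the loo package in R for the model comparison (fit1 and fit2 are hypothetical stanfit objects for two versions of the observation model):

library(loo)

# Extract the log_lik draws produced by the generated quantities block above
log_lik1 <- extract_log_lik(fit1, parameter_name = "log_lik", merge_chains = FALSE)
loo1 <- loo(log_lik1, r_eff = relative_eff(exp(log_lik1)))

log_lik2 <- extract_log_lik(fit2, parameter_name = "log_lik", merge_chains = FALSE)
loo2 <- loo(log_lik2, r_eff = relative_eff(exp(log_lik2)))

# Compare expected log pointwise predictive densities of the two models
loo_compare(loo1, loo2)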

You might be interested also in checking the papers and the code in

Great! Thanks for the clarification.

Mark

While we’re on the topic, I wonder if @avehtari has any thoughts on loo with a state-space model with multiple observations per time point, and where the log-likelihood function filters the observations such that the log-likelihood is 0 for a time point with no observations. Just dropping the time point should be fine for the case where all obs are missing, but I couldn’t see an obvious solution when only some obs are missing.

Are you considering leave-one-obs-out or leave-one-time-point-out?

leave one time point out

And how do you compute the likelihood now?

Filter the observation vector, prediction vector, and covariance matrix for missingness, then do multivariate-normal-type stuff.

Are you modeling the missing observations (i.e., sampling them)?

Can you post the relevant code lines for the likelihood?

No, missing observations are being filtered out. Code is complex, but the basic idea is that only the relevant indices from the predicted mean and covariance matrix are used, and this may vary from time point to time point. If it doesn’t make sense I’ll post some simplified code another day.

Code could be useful as I can’t figure out from this how many dimensions in the covariance matrix and how many observations, and is the multivariate just for one time point or all etc.

Let’s assume the prediction vector (ypred) and covariance matrix (covm) for each time point are some function of the parameters and time. The rowwise log-likelihood looks something like this, where whichobs[t] is an array of integers specifying the non-missing observations at time t, and y[t] holds the observations at time t:

for (t in 1:maxT) {
  // log-likelihood of the observed (non-missing) values at time t
  llrow[t] = multi_normal_lpdf(y[t, whichobs[t]] | ypred[t, whichobs[t]], covm[t, whichobs[t], whichobs[t]]);
}

If whichobs[t] in most cases specifies 10 non-missing variables, but occasionally there are only one or two non-missing, then I expect leave-one-time-point-out will break for the cases with one or two. Not sure if there’s any straightforward fix; I sure can’t think of one.

You can use llrow with the loo package and it’s then leave-one-time-point-out. The loo package has diagnostics to tell whether this works or not, and soon it will have an iterative moment-matching approach which makes it work in more cases.
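
For reference, a minimal sketch of checking those diagnostics in R, assuming llrow is saved in generated quantities and extracted into a hypothetical draws-by-time-points matrix ll_time (the chain layout below is also assumed):

library(loo)

# ll_time: hypothetical (posterior draws) x (time points) matrix built from llrow
chain_ids <- rep(1:4, each = 1000)  # assumed: 4 chains, 1000 post-warmup draws each
r_eff <- relative_eff(exp(ll_time), chain_id = chain_ids)

loo_t <- loo(ll_time, r_eff = r_eff)
print(loo_t)           # elpd estimate for leave-one-time-point-out
pareto_k_table(loo_t)  # flags time points where the PSIS approximation is unreliable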