Difficulties with Hierarchical State Space Model

Hello community,

I’ve been having some trouble with this model, so I’m hoping someone can point out where I’ve gone wrong. I’ve suspected it has something to do with how I’ve set the priors, as in some cases (using print statements to observe values) it seems to be ignoring the priors entirely (e.g., h[j,1] all being given values ~1 when they should be ~13000 given the prior distribution). I’m now having trouble even getting it to initialize, receiving an “Error evaluating the log probability at the initial value” for values of sigma_h being negative, even though I’ve already set the constraints for all the relevant objects to <lower=0>.

I’ve used arrays of row vectors here for efficiency, but from testing the alternatives I seem to have the same issues with matrices and two-dimensional arrays.

Also, I’ve already implemented non-centered parameterizations of b_hh and sigma_h here, as you’ll see, and have played around with doing so for h as well, but that introduces more issues so I tried to keep the example simple. If someone could specify how they would implement the log-normal non-centered parameterization of h in this case, I would greatly appreciate being able to check my attempt against it.

data {
  int<lower=0> T;                                  // number of time points
  int<lower=0> J;                                  // number of individuals
  array[J] row_vector<lower=0>[T] h_obs;           // observed h
  array[J] row_vector<lower=0>[T] h_sd;            // h measurement error
}

parameters {
  array[J] row_vector<lower=0>[T] h;               // h true state value
  vector[J] b_hh_raw;                              // effect of prior state, by individual
  array[J] row_vector[T] sigma_h_raw;              // process error, by individual and time point
  
  // hyperparameters
  real mu_b_hh;                                    // mean effect of prior state
  real<lower=0> sigma_b_hh;                        // sd effect of prior state
  vector<lower=0>[J] mu_sigma_h;                   // mean process error, by individual
  vector<lower=0>[J] sigma_sigma_h;                // sd process error, by individual
}

transformed parameters {
  vector[J] b_hh;
  array[J] row_vector<lower=0>[T] sigma_h;
  
  for (j in 1:J) {
    b_hh[j] = mu_b_hh + b_hh_raw[j] * sigma_b_hh;
    for (t in 1:T) {
      sigma_h[j,t] = mu_sigma_h[j] + sigma_h_raw[j,t] * sigma_sigma_h[j];
    }
  }
}

model {
  // priors
  for (j in 1:J) {
    b_hh_raw[j] ~ std_normal();
    for (t in 1:T) {
      sigma_h_raw[j,t] ~ std_normal();
    }
  }
  
  //    hyperpriors
  mu_b_hh ~ normal(1, 0.4);
  sigma_b_hh ~ exponential(6);
  for (j in 1:J) {
    mu_sigma_h[j] ~ lognormal(5, 1);
    sigma_sigma_h[j] ~ lognormal(3, 1);
  }
  
  // likelihoods
  //   initial state
  for (j in 1:J) {
    h[j,1] ~ lognormal(9, 1);
  }
  
  //    propagate state
  for (j in 1:J) {
    for (t in 2:T) {
      h[j,t] ~ lognormal(b_hh[j] * log(h[j,t-1]), log(sigma_h[j,t]));
    }
  }
  
  //    observations
  for (j in 1:J) {
    for (t in 1:T) {
      h_obs[j,t] ~ normal(h[j,t], h_sd[j,t]);
    }
  }
}

Thanks! I greatly appreciate any help.

@Charles_Driver

I always find this kind of thing way harder to work with when things are not ‘normal’. Easy to get something wrong. The state propagation looks pretty weird to me, if I was struggling with this model I would make the states normal and deal with the non normality on the measurement layer.

1 Like

Thanks for your reply. I have played around with the states being normal as well. Changing the state propagation line to:

h[j,t] ~ normal(b_hh[j] * h[j,t-1], sigma_h[j,t]);

does not seem to help. I’m still getting initial errors that sigma_h is negative when it must be positive. Given the <lower=0> restrictions on it and its mu and sigma hyperparameters, and the lognormal priors set on them, they really should not be anywhere near negative.

Could you elaborate on what else about the propagation looks weird to you? It may help me understand the issue. Thanks again for any help you can provide.

Don’t have time to really make sense of it but sigma_h looks like a disturbance term but it’s being used as a standard deviation…

1 Like

lognormal priors are typically very wide. The 95% central interval is roughly (1000, 60000) and a value of 1 is 9 standard deviations out from the mean. That can get dominated with lots of data.

You probably don’t want to be applying log to sigma_h as the result has to be positive. The only way to ensure that is constrain sigma_h > 1. Instead, I think you want to apply exp, which will convert your unconstrained value (it’s a regression but only constrained to be > 0).

You can also make it shorter and more efficient by vectorizing. That’d replace this:

with

h[ , 1] ~ lognormal(9, 1);
for (t in 2:T)
  h[j] ~ lognormal(b_hh .* log(h[ , t - 1]), log(sigma_h[ , t]));
to_vector(h_obs) ~ normal(to_vector(h), to_vector(h_sd));