I’ve been working on the model below and talked to Ben recently as a couple of problems arose. Andrew also suggested simplifying the model to isolate the problem but I also didn’t find the problem that way (see below for some solutions I tried previously). I figured that it may not hurt to ask here as well.
The model is a Markov Switching Model with Time Varying Transition Probabilities, i.e. I assume that the parameters of the time series vary according to a random variable that follows a Markov process of order one whose transition probabilities are a function of an exogenous covariate (z) and an intercept (gamma).
Without TVTPs, the model can identify gamma. When the intercept and the exogenous covariate are included, the model can identity neither gamma nor lambda. Alpha and phi are on average within the 50% HDI but are still biased for N(simulations) = 50. Sometimes gamma[1]/[2] and lambda[1]/[2] have the same standard deviation or same absolute mean value even if the true values diverge widely. Here is just one model output when running the model with some simulated data for N = 1 and T = 300. The true values are alpha (2, 6), phi (0.2, 0.7), gamma (-2, 2), lambda (1, 0.2)
N sim name mean sd q25 q50 q75
1 1 1 alpha[1] 1.023479004 0.52433014 0.68470262 1.033403681 1.37288909
2 1 1 alpha[2] 5.836420795 0.24794741 5.67351637 5.834325272 5.99803102
3 1 1 phi[1] 0.225243761 0.03070192 0.20518439 0.224699188 0.24562940
4 1 1 phi[2] 0.715086473 0.01346963 0.70643604 0.715159144 0.72381077
5 1 1 gamma[1] -0.664868689 0.31815327 -0.87492839 -0.646834929 -0.45570295
6 1 1 gamma[2] 1.670333349 0.12871636 1.58098185 1.668498695 1.75781119
7 1 1 lambda[1] -0.010225889 1.04397066 -0.68672621 0.003543484 0.66562922
8 1 1 lambda[2] -0.002082519 0.10865527 -0.07782131 -0.000808923 0.07186009
My problem is that I don’t understand why this happens and that it is unclear how the model can provide reasonable inference for alpha and phi in the linear part of the model if it does not know in which state it currently is.
This problem does not disappear when only including lambda for one state, removing the cross-sectional part or varying the size of the parameters or number of time periods. I ran some other versions that removed phi and/or alpha to no avail. I am also currently running a simulation varying N between 1 and 100 but as the model is again very slow I don’t know yet whether that would improve the situation. I can provide the R code for the simulated data if necessary.
Thanks in advance for any ideas regarding this (or how to speed the model up).
Model
data {
int<lower = 0> T; // number of observed time periods - 1
int<lower = 0> N; // number of observed units
matrix[N, T] y_head; // head(y, T - 1)
matrix[N, T] y_tail; // tail(y, T - 1)
matrix[N, T] z; // exogenous random variable affecting transition probabilities
// real<lower = 0> sigma; // variance; fixed for now
}
parameters {
ordered[2] alpha;
vector<lower= 0,upper=1>[2] phi; //AR parameter
real gamma[2]; // intercepts in TPs
real lambda[2];
real<lower = 0> sigma; // variance term
}
transformed parameters {
real p11[N,T];
real p12[N,T];
real p21[N,T];
real p22[N,T];
real p_sprev_1_givenprev[N, T]; // P(S[t - 1] = 1 | omega[t - 1], theta)
real p_scur_1_givenprev[N, T]; // P(S[t] = 1 | omega[t - 1], theta)
real p_scur_1_givencur[N, T]; // P(S[t] = 1 | omega[t], theta)
vector[2] s[N, T];
real fy[N, T];
for (i in 1:N) {// looping over all observations
// 1st inner loop for initial values
for (t in 1:T) {
p11[i,t] = normal_cdf(gamma[1] + lambda[1] * z[i, t], 0, 1);
p12[i,t] = 1 - p11[i,t];
p22[i,t] = normal_cdf(gamma[2] + lambda[2] * z[i, t], 0, 1);
p21[i,t] = 1 - p22[i,t];
}
// t = 1
// Initial transition probabilities (Piger eq. 14)
// Can also treat initial transition probabilities as a parameter to be estimated
p_sprev_1_givenprev[i,1] = (1 - p22[i, 1]) / (2 - p11[i, 1] - p22[i,1]);
p_scur_1_givenprev[i, 1] = p11[i, 1] * p_sprev_1_givenprev[i, 1] +
p21[i, 1] * (1 - p_sprev_1_givenprev[i, 1]);
for (j in 1:2){
s[i, 1, j] = normal_lpdf(y_tail[i,1] | alpha[j] + phi[j] * y_head[i,1], sigma);
}
// Piger eq. 11
fy[i,1] = log_mix(p_scur_1_givenprev[i,1], s[i,1, 1], s[i,1, 2]);
// Piger eq. 13
p_scur_1_givencur[i,1] = exp(s[i,1, 1] + log(p_scur_1_givenprev[i,1]) - fy[i,1]);
// second inner loop for subsequent values
for (t in 2:T){
p_sprev_1_givenprev[i,t] = p_scur_1_givencur[i,(t-1)];
// Piger eq. 10
p_scur_1_givenprev[i,t] = p11[i, t] * p_sprev_1_givenprev[i ,t] +
p21[i, t] * (1 - p_sprev_1_givenprev[i,t]);
for (j in 1:2){
s[i,t,j] = normal_lpdf(y_tail[i, t] | alpha[j] + phi[j] * y_head[i,t], sigma);
}
// Piger eq. 11
fy[i,t] = log_mix(p_scur_1_givenprev[i,t], s[i,t, 1], s[i,t, 2]);
// Piger eq. 13
p_scur_1_givencur[i, t] = exp(s[i,t, 1] + log(p_scur_1_givenprev[i,t]) - fy[i,t]);
}
}
}
model {
// likelihood
for (i in 1:N) {
for (t in 1:T) {
target += fy[i,t];
}
}
// priors
gamma ~ normal(0, 1);
phi ~ uniform(0, 1);
alpha ~ normal(0,1);
lambda ~ normal(0, 1);
}