Hi all,
Without getting too specific about my model (which is added at the end of this post) I am getting a result that I never had before. If I test-run my model on 1 chain, the MC error is very low, looking like it converges. However, when I do an actual run over multiple chains, the parameters do not converge. Looking at the traceplots, I can see that - indeed - per chain the MC error is very low again. Thus, this implies my model finds multiple solutions. See the attached picture;
These results thus imply that the results depend on my starting values. My question is thus very broad. What does this mean? Is it a lack of identification still? I don’t think throwing more iterations against it will help, right? Do you think this has to do with rotation? Even though I am not literally running a PCM, there are some similarities. And if so, would you now of an rotation example in stan (like Varimax) that I can easily implement.
Any help is welcome,
Alex
model_string = "
data {
int<lower=0> N; // number of observations
int<lower=0> Nt; // number of time periods
int<lower=0> Np; // number of explanatory variables
int<lower=0> Nh; // number of knots (only 1 layer)
int<lower=0> Ntest; // size of testing dataset
vector[N] yVar; // explained variable
int<lower=0,upper=Nt> sell[N]; // period sold, sell = 1, ..., Nt
matrix[N,Np] x; // explanatory variables
vector[Ntest] yTest; // test y variable
int<lower=0,upper=Nt> sTest[Ntest]; // test period sold
matrix[Ntest,Np] xTest; // testing dataset
}
transformed data {
matrix [Np,Np] A = diag_matrix(rep_vector(1,Np));
matrix [Np,Np-1] A_qr;
for(i in 1:Np-1)
A[Np,i] = -1;
A[Np,Np] = 0;
A_qr = qr_Q(A)[,1:(Np-1)];
}
parameters {
vector[Nt-1] innovMu; // time series parameter
real fMu; // first Mu, i.e. the constant.
ordered[Nh] beta; // bias, CONSTRAINT #1
matrix<lower=-10,upper=10>[Np-1,Nh] omega_clean; // weights in layer
vector[Nh] lambda; // estimates measurement Eq.
real<lower=0,upper=2> sigMu; // variance of innovations (mu)
real<lower=0,upper=2> sigEps; // RMSE of measurement Eq.
}
transformed parameters {
vector[N] yHat;
vector[Nt] mu;
matrix[N,Nh] h;
matrix[Np,Nh] omega;
for(k in 1:Nh)
omega[,k] = A_qr * omega_clean[,k];
mu[1] = fMu;
for(t in 2:Nt)
mu[t] = mu[t-1] + innovMu[t-1]*sigMu;
for(k in 1:Nh)
h[,k] = x*omega[,k] + beta[k];
yHat = mu[sell] + (inv_logit(h)*100)*lambda; // inv_logit()*100;
}
model {
lambda ~ normal(0,1);
beta ~ normal(0,1);
innovMu ~ normal(0,0.1);
fMu ~ normal(0,10);
sigEps ~ normal(0,0.5);
sigMu ~ normal(0,0.5);
for(k in 1:Nh)
sum(omega_clean[,k]) ~ normal(0,1/sqrt(1-inv(Np)));
yVar ~ normal(yHat, sigEps);
}
generated quantities {
vector[Ntest] yPred;
matrix[Ntest,Nh] hTest;
for(k in 1:Nh)
hTest[,k] = xTest*omega[,k] + beta[k];
yPred = yTest - (mu[sTest] + (inv_logit(hTest)*100)*lambda);
}"