I am trying to model poll estimates for multi-party elections with K parties over R elections. When I assume the poll estimates to be normally distributed, the trace plots look completely fine. But as soon as I account for the dependency between poll estimates by using a multivariate normal distribution, the trace plots look very strange and end in a straight line. Below are an explanation of the model, the Stan code and a screenshot of some trace plots. (I only fit one chain to make the model run faster, but when I specify more chains, they all look like this and do not mix.)
The poll estimates for poll i are assumed to follow a multivariate normal distribution with mean p_i and covariance matrix \Sigma:
poll_i \sim MVN(p_i, \Sigma).
poll_i and p_i are both vectors of length K, the number of parties. The mean p_i is modeled as the sum of the election outcome and a bias term \alpha_{r[i]}, which varies over parties and elections r. To ensure that the estimated poll means p_i sum to one, I apply the softmax() transformation. The mean estimates are therefore on the log scale, which is why I also take the log of the election outcome, log(vote_{r[i]}). I know that it is not perfectly accurate to model variables on the log scale as normally distributed, but this allows me to separate the mean from the variance equation.
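Written out as a formula, the mean model described above looks roughly as follows. This is a sketch that mirrors the Stan code further down; the symbols \mu_\alpha, \sigma_\alpha and \tilde{\alpha}_r are shorthand introduced here for mu_alpha, sig_alpha and alpha_sc[r], and \mathcal{N}^{+} denotes a normal distribution truncated at zero.

```latex
\begin{aligned}
p_i &= \operatorname{softmax}\bigl(\log(\mathrm{vote}_{r[i]}) + \alpha_{r[i]}\bigr),
\qquad
\operatorname{softmax}(x)_k = \frac{\exp(x_k)}{\sum_{j=1}^{K} \exp(x_j)}, \\
\alpha_{r} &= \mu_\alpha + \sigma_\alpha \, \tilde{\alpha}_{r},
\qquad \tilde{\alpha}_{r,k} \sim \mathcal{N}(0, 1), \\
\mu_\alpha &\sim \mathcal{N}(0, 0.2),
\qquad \sigma_\alpha \sim \mathcal{N}^{+}(0, 0.2).
\end{aligned}
```

If the K vote shares of an election sum to one, then softmax(log(vote_r)) = vote_r, so with \alpha_r = 0 the poll mean equals the election result.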
For modelling the KxK covariance matrix \Sigma, I use the Cholesky decomposition. As the prior for the Cholesky factor of the correlation matrix I specify an LKJ prior with shape parameter 2 (which seems to be a common choice), and the standard deviations \tau are given a half-normal distribution.
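In formulas, the covariance model described above can be sketched as follows. This is read off the Stan code further down; \Omega (the correlation matrix), L_\Omega (its Cholesky factor, lkj_corr in the code), \sigma_\tau (sig_tau) and \tilde{\tau}_k (tau_sc[k]) are notation introduced here for illustration, and \mathcal{N}^{+} denotes a normal distribution truncated at zero.

```latex
\begin{aligned}
\Sigma &= \operatorname{diag}(\tau)\, \Omega \, \operatorname{diag}(\tau),
\qquad \Omega = L_\Omega L_\Omega^{\top}, \\
L_\Sigma &= \operatorname{diag}(\tau)\, L_\Omega
\quad\Longrightarrow\quad L_\Sigma L_\Sigma^{\top} = \Sigma, \\
L_\Omega &\sim \operatorname{LKJCholesky}(2), \\
\tau_k &= \sigma_\tau \, \tilde{\tau}_k,
\qquad \sigma_\tau \sim \mathcal{N}^{+}(0, 0.2),
\qquad \tilde{\tau}_k \sim \mathcal{N}^{+}(0, 1).
\end{aligned}
```

Here L_\Sigma corresponds to the expression diag_pre_multiply(tau, lkj_corr) in the code, which is the Cholesky factor of \Sigma that multi_normal_cholesky expects as its second argument.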
So my question is whether I am doing something wrong in modelling \Sigma or whether something else is going on. I would be very thankful for any ideas or help!
data {
  int<lower=1> N;                        // number of party-poll estimates
  int<lower=1> K;                        // number of unique parties
  int<lower=1> R;                        // number of unique elections
  int<lower=1> KR;                       // number of unique party-election combinations
  int<lower=1> P;                        // number of unique polls
  vector<lower=0, upper=1>[K] poll[P];   // poll support estimates for each poll and party
  vector<lower=0, upper=1>[KR] vote;     // election results for each election and party
  int<lower=1, upper=R> r_id[P];         // election id for each poll
}
transformed data {
  // convert election results to the log scale, one K-vector per election
  vector[K] log_vote[R];
  for (i in 1:R) {
    log_vote[i] = log(vote[((i - 1) * K + 1):(i * K)]);
  }
}
parameters {
  // hyperparameters of the mean model
  real mu_alpha;                         // mean alpha (non-centered)
  real<lower=0> sig_alpha;               // sd alpha (non-centered)
  vector[K] alpha_sc[R];                 // scale alpha (non-centered)
  // hyperparameters of the variance model
  cholesky_factor_corr[K] lkj_corr;      // Cholesky factor of correlation matrix
  real<lower=0> sig_tau;                 // sd tau (non-centered)
  vector<lower=0>[K] tau_sc;             // scale tau (non-centered)
}
transformed parameters {
  // parameters
  vector[K] alpha[R];
  vector<lower=0>[K] tau;                // party standard deviation
  // non-centered parameterization of estimated parameters
  for (r in 1:R) {
    alpha[r] = mu_alpha + sig_alpha * alpha_sc[r];
  }
  tau = sig_tau * tau_sc;
}
model {
  // mean and Cholesky factor of the covariance matrix
  // for the multivariate normal distribution
  vector[K] p[P];
  matrix[K, K] L_Sigma;
  // hyperpriors
  mu_alpha ~ normal(0, 0.2);
  sig_alpha ~ normal(0, 0.2) T[0, ];
  sig_tau ~ normal(0, 0.2) T[0, ];
  for (i in 1:R) {
    alpha_sc[i] ~ std_normal();
  }
  for (j in 1:K) {
    tau_sc[j] ~ normal(0, 1) T[0, ];
  }
  lkj_corr ~ lkj_corr_cholesky(2);       // Cholesky factor of correlation matrix
  // Cholesky factor of covariance matrix: diag(tau) * lkj_corr
  L_Sigma = diag_pre_multiply(tau, lkj_corr);
  // mean model
  for (i in 1:P) {
    p[i] = softmax(log_vote[r_id[i]] + alpha[r_id[i]]);
  }
  // poll estimates
  for (i in 1:P) {
    poll[i] ~ multi_normal_cholesky(p[i], L_Sigma);
  }
}