High dimension of multivariate normal -- general question and machine precision issue

Hello everyone,

I am new to the Stan but I am loving it so far. This is also my first serious attempt at Bayesian model fitting but I have to say, Stan User Manual is excellent!

I am working on estimating relatively complicated structural economic model that boils down to a state space model. The way the model works is that we have large number of observations per period that follow multivariate normal. However, the covariance matrix that depends on relatively few ``structural" parameters of interest. I have tried to implement it naively in Stan: (1) sample the structural parameters (2) calculate variance-covariance matrix (3) sample the measurement equation.

Now, when I restrict the number of observations to about 60 per period the model estimates. However, with higher numbers there seems to be a machine precision issue. The model will not start. Below is my current code. I also recreated the matrix in generated quantities, extracted it to R and and checked that the deviations from symmetry are on the order of magnitude of e-17.

I am wondering if there is something I am doing wrong or could change. Or is more than 60 dimension of multivariate normal just not possible to work with? I would appreciate any feedback!

I believe my question is related to this thread

Code:

data {
  int T; // number of time periods
  int N; // number of members
  int K; // number of topics
  matrix[T,N*K] Y; // observations
}


transformed data{
vector[N*K] Y_array[T]; // transform to array for sampling purposes
for(t in 1:T){
    Y_array[t] = Y[t]';
  }
}

// For now equal psi coefs are enforced

parameters {
  vector[T] theta; // yesterday's state
  real<lower = 0> sigma_theta; // sd of state innovations
  real<lower = 0, upper = 1> rho; // AR coefficient on states
  vector[K] psi; // parameters means
  vector<lower = 0>[N] sigma_i;
  real<lower = 0> sigma_hat;
}


model{
vector[N*K] Psi;
matrix[N*K,N*K] R; // var-cov matrix
vector[N] lambda;
matrix[N,N] Gamma; // helper matrix
vector[N*K] nu[T]; // T-dim array of vectors of length N*K (contribution of each state to the mean)

sigma_theta ~ gamma(1, 1);
sigma_theta ~ gamma(1, 1.5);
for(i in 1:N) sigma_i[i] ~ gamma(1,0.5);
rho ~ uniform(0,1);
for(i in 1:K) psi[i] ~ normal(0,2);


for(i in 1:N){
  lambda[i] = (sigma_hat^2+ sigma_theta^2 + sigma_i[i]^2)/(sigma_hat^2+ sigma_theta^2);
}

for(i in 0:(N-1)){
      Psi[(1+K*i):(K+K*i)] = psi;
  }

for(i in 1:N){
  for(j in 1:N){
    Gamma[j,i] = (1-lambda[i])*(1-lambda[j])*rho*sigma_hat^2 + lambda[i]*lambda[j]*sigma_theta^2;
    if(j == i)
      Gamma[j,i] +=   lambda[i]^2*(sigma_i[i]^2);
    for(k in 1:K){
      for(k_prime in 1:K){
        R[(j-1)*K + k,(i-1)*K + k_prime] = psi[k]*psi[k_prime]*Gamma[j,i];
      }
    }
  }
}

theta[1] ~ normal(0,sigma_theta);
theta[2:T] ~ normal(rho*theta[1:(T-1)],sigma_theta);
for(t in 1:T){
  nu[t] = Psi*theta[t];
}
Y_array ~  multi_normal(nu, R);
}

Any moderate sized covariance matrix has a good chance to become numerically problematic for some values of the structural parameters that Stan lands on during the leapfrom process. Basically, the advice is

  1. Parameterize things in terms of a Cholesky factor of a covariance matrix when possible
  2. If the error is due to numerical asymmetry, you can do something like R = 0.5 * (R + transpose(R)) after filling it.
  3. If the error is due to numerical indefiniteness, you can add a tiny positive constant to the diagonal of R after filling it or do something more complicated .
1 Like