Flatten the Covariance Matrix

To implement the idea of filling the sigmas_e vector by flattening the covariance matrices (sigmas_t), I followed the structure in the Stan code below, but I get this error:

Error in stanc(filename, allow_undefined = TRUE) : 0
Semantic error in 'string', line 49, column 4 to column 76:
   -------------------------------------------------
    47:  
    48:      // Flatten sigmas_t into sigmas_e
    49:      sigmas_e[cov_idx:(cov_idx + nst[t] * nst[t] - 1)] = to_vector(sigmas_t);
             ^
    50:      cov_idx += nst[t] * nst[t];  // Update cov_idx to the new position
    51:  
   -------------------------------------------------

Cannot assign to global variable 'sigmas_e' declared in previous blocks.

Stan code:

data {
  int<lower=0> nb;               // Number of observations (non-missing)
  int<lower=0> nt;               // Number of time steps (years)
  int<lower=0> np;               // Number of predictors (tree-ring chronologies)
  int<lower=0> ns;               // Number of streamflow sites
  int<lower=0> ncvm;             // Size of the covariance matrix vector
  
  matrix[nt, np] x;              // Predictor matrix (time x predictors)
  vector[nb] y;                  // Response vector (non-missing values)
  
  int<lower=1, upper=ns> site[nb];      // Streamflow station indices
  int<lower=1> nst[nt];                 // Number of stations per year
  int<lower=1, upper=nb> start_yrs[nt]; // Start year indices
  int<lower=1, upper=nb> end_yrs[nt];   // End year indices
}

transformed data {
  vector<lower=0>[nt] df_e;  // Degrees of freedom for inverse Wishart
  for (t in 1:nt) {
    df_e[t] = nst[t] + 1;    // Degrees of freedom for covariance matrices
  }
}

parameters {
  vector[ns] alphas;             // Intercepts for each site
  matrix[ns, np] betas;          // Coefficients for predictors
  vector[ncvm] sigmas_e;         // Flattened covariance matrices
}

model {
  // Priors
  alphas ~ normal(0, 100);
  to_vector(betas) ~ normal(0, 100);
  

  // Initialize index for sigmas_e
  int cov_idx = 1;  // Start index for sigmas_e
  
  // Loop over time steps (years)
  for (t in 1:nt) {
    int start_idx = start_yrs[t];
    int end_idx = end_yrs[t];
    int obs_count = end_idx - start_idx + 1;
    vector[nst[t] * nst[t]] sigmas_v;
    
    // Create covariance matrix for current year
    matrix[nst[t], nst[t]] sigmas_t;
    
    // Priors for the covariance matrices
    sigmas_t ~ inv_wishart(df_e[t], diag_matrix(rep_vector(1.0, nst[t])));  // Year-specific covariance prior
    
    // Flatten sigmas_t into sigmas_e
    sigmas_e[cov_idx:(cov_idx + nst[t] * nst[t] - 1)] = to_vector(sigmas_t);
    cov_idx += nst[t] * nst[t];  // Update cov_idx to the new position

    // Cholesky decomposition
    matrix[nst[t], nst[t]] L_e = cholesky_decompose(sigmas_t);

    // Build the mean vector for the current year
    vector[obs_count] mu;
    for (i in start_idx:end_idx) {
      int s = site[i];
      mu[i - start_idx + 1] = alphas[s] + dot_product(betas[s], x[t]);
    }

    // Multivariate normal likelihood for the current year
    y[start_idx:end_idx] ~ multi_normal_cholesky(mu, L_e);
  }
}

The issue is that you’re declaring sigmas_e in the parameters block. In Stan, everything declared there is sampled and so cannot be overwritten. You can declare it at the beginning of the model block instead to solve the issue.

Thank you so much!
Can you suggest how to solve this issue?

Yes, this is what @simonbrauer was recommending:

model {
  vector[ncvm] sigmas_e;  // local variable, so assignment is allowed
  int cov_idx = 1;
  for (t in 1:nt) {
    sigmas_e[cov_idx:(cov_idx + nst[t] * nst[t] - 1)] = to_vector(sigmas_t);
    cov_idx += nst[t] * nst[t];
  }
  ...

Thank you so much, Bob.

I corrected the Stan code, but it has not converged; the maximum Rhat is greater than 3:

data {
  int<lower=0> nb;               // Number of observations (non-missing)
  int<lower=0> nt;               // Number of time steps (years)
  int<lower=0> np;               // Number of predictors (tree-ring chronologies)
  int<lower=0> ns;               // Number of streamflow sites
 
  matrix[nt, np] x;              // Predictor matrix (time x predictors)
  vector[nb] y;                  // Response vector (non-missing values)
  
  matrix[ns, ns] Wp_e; // Prior covariance matrix
  
  int<lower=1, upper=ns> site[nb];      // Streamflow station indices
  int<lower=1> nst[nt];                 // Number of stations per year
  int<lower=1, upper=nb> start_yrs[nt]; // Start year indices
  int<lower=1, upper=nb> end_yrs[nt];   // End year indices
}

transformed data {
  real <lower=0> df_e = ns + 1;
}

parameters {
  vector[ns] alphas;             // Intercepts for each site
  matrix[ns, np] betas;          // Coefficients for predictors
  cov_matrix [ns] sigmas_e;      // Covariance matrix for streamflow sites
}

model {
 // matrix [nt, ns] mu;
  
  // Priors
  alphas ~ normal(0, 100);
  to_vector(betas) ~ normal(0, 10);
  sigmas_e ~ inv_wishart(df_e, Wp_e);
  

  // Loop over time steps (years)
  for (t in 1:nt) {
    int start_idx = start_yrs[t];
    int end_idx = end_yrs[t];
    int obs_count = end_idx - start_idx + 1;
    int nst_s = nst[t];
    
    // Extract the relevant subset of the covariance matrix for the current year
    matrix[nst_s, nst_s] sigmas_t = sigmas_e[site[start_idx:end_idx], site[start_idx:end_idx]];

    // Cholesky decomposition
    matrix[nst_s, nst_s] L_e = cholesky_decompose(sigmas_t);

    // Build the mean vector for the current year
    vector[obs_count] mu = alphas[site[start_idx:end_idx]]
                           + rows_dot_product(betas[site[start_idx:end_idx]], rep_matrix(x[t], nst_s));

    // Multivariate normal likelihood for the current year
    y[start_idx:end_idx] ~ multi_normal_cholesky(mu, L_e);
  }
}

However, max(summary(fit)$summary[, "Rhat"]) is greater than 3.
Would you please give me your suggestion?

I’m afraid I don’t have time to really dive in and debug your code. Nothing obviously wrong jumps out at me at first glance.

I would try two things, depending on what’s possible:

  1. Start with the simplest possible working model (e.g., no hierarchical priors, no covariance modeling, etc.), get that fitting, then build up incrementally to a more complex model. That way, you will see where things go off the rails. (See the sketch after this list.)

  2. Look at the posterior and see which variables are failing. Are you sure, with all of that indexing, that you’re putting priors on all the continuous parameters? If not, HMC is going to fail. You can add print("variable A = ", A)-style statements to the code to inspect what’s happening as the log density is being built up (also illustrated in the sketch below).
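
As an illustration of both points, here is a minimal sketch of a stripped-down starting model. It assumes independent normal errors with a single shared scale sigma (a simplifying assumption that drops the covariance structure entirely; the prior choices are illustrative) and includes a print statement for inspecting the mean as the log density is built up. The data block is the same as in the model above.

parameters {
  vector[ns] alphas;             // Intercepts for each site
  matrix[ns, np] betas;          // Coefficients for predictors
  real<lower=0> sigma;           // Single residual scale (simplifying assumption)
}

model {
  // Priors (illustrative choices)
  alphas ~ normal(0, 100);
  to_vector(betas) ~ normal(0, 10);
  sigma ~ normal(0, 5);

  // Independent normal likelihood, one observation at a time
  for (t in 1:nt) {
    for (i in start_yrs[t]:end_yrs[t]) {
      int s = site[i];
      real mu_i = alphas[s] + dot_product(betas[s], x[t]);
      print("t = ", t, ", i = ", i, ", mu_i = ", mu_i);  // debugging output
      y[i] ~ normal(mu_i, sigma);
    }
  }
}

Once this version fits with Rhat close to 1, the year-specific covariance structure can be reintroduced one piece at a time.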

Also, the way you’re coding the multivariate normal isn’t helpful. You can just use a regular multi_normal, which will do the decompositions internally. There’s nothing to be gained doing it yourself. Also, it’s going to be hard to do anything efficiently if each observation gets its own covariance matrix. It can also be very hard to fit if they don’t share a lot of parameters—it winds up being very prior driven.
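
In this model, that would mean dropping the explicit decomposition inside the loop and writing, for example:

    // multi_normal factors the covariance matrix internally
    y[start_idx:end_idx] ~ multi_normal(mu, sigmas_t);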

Unless you really believe in your Wishart prior, I’d recommend coding with Cholesky factors and using a combination of LKJ priors on the correlations and independent priors on the scales (see the regression chapter of the User’s Guide for instructions). But again, I wouldn’t do that until I had a simpler version of the model working.
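
A minimal sketch of that parameterization, assuming a single shared covariance across years (the LKJ shape of 2 and the scale prior are illustrative assumptions, not recommendations from this thread):

parameters {
  cholesky_factor_corr[ns] L_corr;   // Cholesky factor of the correlation matrix
  vector<lower=0>[ns] tau;           // Per-site scales
}

model {
  L_corr ~ lkj_corr_cholesky(2);     // Weakly informative prior on correlations
  tau ~ normal(0, 5);                // Independent priors on the scales

  // Cholesky factor of the implied covariance matrix
  matrix[ns, ns] L_Sigma = diag_pre_multiply(tau, L_corr);

  // then, e.g., y[start_idx:end_idx] ~ multi_normal_cholesky(mu, L_Sigma);
}

Note that a submatrix of a Cholesky factor is not in general the Cholesky factor of the corresponding submatrix, so for year-specific subsets of sites you would first recover the full covariance with multiply_lower_tri_self_transpose(L_Sigma), then extract and re-factor the relevant block.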

There are probably better ways to do the linear algebra than rows_dot_product along with a replication of a bunch of rows. But again, I wouldn’t worry too much about efficiency until you have a simpler model fitting well, and then this model fitting well.
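
For instance, one possible simplification (an assumption about intent, not something tested here) replaces rows_dot_product plus rep_matrix with a single matrix-vector product:

    // betas[site[start_idx:end_idx]] is matrix[obs_count, np] and x[t]' is
    // vector[np], so the product is directly the vector of per-site means.
    vector[obs_count] mu = alphas[site[start_idx:end_idx]]
                           + betas[site[start_idx:end_idx]] * x[t]';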

Thank you so much, Bob, for your explanation.