To implement the idea of filling the sigmas_e vector by flattening the covariance matrices (sigmas_t), I structured my Stan code as shown below. But I get this error:
Error in stanc(filename, allow_undefined = TRUE) : 0
Semantic error in 'string', line 49, column 4 to column 76:
-------------------------------------------------
47:
48: // Flatten sigmas_t into sigmas_e
49: sigmas_e[cov_idx:(cov_idx + nst[t] * nst[t] - 1)] = to_vector(sigmas_t);
^
50: cov_idx += nst[t] * nst[t]; // Update cov_idx to the new position
51:
-------------------------------------------------
Cannot assign to global variable 'sigmas_e' declared in previous blocks.
Stan code:
data {
  int<lower=0> nb;   // Number of observations (non-missing)
  int<lower=0> nt;   // Number of time steps (years)
  int<lower=0> np;   // Number of predictors (tree-ring chronologies)
  int<lower=0> ns;   // Number of streamflow sites
  int<lower=0> ncvm; // Size of the covariance matrix vector
  matrix[nt, np] x;  // Predictor matrix (time x predictors)
  vector[nb] y;      // Response vector (non-missing values)
  int<lower=1, upper=ns> site[nb];      // Streamflow station indices
  int<lower=1> nst[nt];                 // Number of stations per year
  int<lower=1, upper=nb> start_yrs[nt]; // Start year indices
  int<lower=1, upper=nb> end_yrs[nt];   // End year indices
}
transformed data {
  vector<lower=0>[nt] df_e; // Degrees of freedom for inverse Wishart
  for (t in 1:nt) {
    df_e[t] = nst[t] + 1; // Degrees of freedom for covariance matrices
  }
}
parameters {
  vector[ns] alphas;     // Intercepts for each site
  matrix[ns, np] betas;  // Coefficients for predictors
  vector[ncvm] sigmas_e; // Flattened covariance matrices
}
model {
  // Priors
  alphas ~ normal(0, 100);
  to_vector(betas) ~ normal(0, 100);
  // Initialize index for sigmas_e
  int cov_idx = 1; // Start index for sigmas_e
  // Loop over time steps (years)
  for (t in 1:nt) {
    int start_idx = start_yrs[t];
    int end_idx = end_yrs[t];
    int obs_count = end_idx - start_idx + 1;
    vector[nst[t] * nst[t]] sigmas_v;
    // Create covariance matrix for current year
    matrix[nst[t], nst[t]] sigmas_t;
    // Priors for the covariance matrices
    sigmas_t ~ inv_wishart(df_e[t], diag_matrix(rep_vector(1.0, nst[t]))); // Year-specific covariance prior
    // Flatten sigmas_t into sigmas_e
    sigmas_e[cov_idx:(cov_idx + nst[t] * nst[t] - 1)] = to_vector(sigmas_t);
    cov_idx += nst[t] * nst[t]; // Update cov_idx to the new position
    // Cholesky decomposition
    matrix[ns, ns] L_e = cholesky_decompose(sigmas_t);
    // Build the mean vector for the current year
    vector[obs_count] mu;
    for (i in start_idx:end_idx) {
      int s = site[i];
      mu[i - start_idx + 1] = alphas[s] + dot_product(betas[s], x[t]);
    }
    // Multivariate normal likelihood for the current year
    y[start_idx:end_idx] ~ multi_normal_cholesky(mu, L_e);
  }
}
The issue is that you're declaring sigmas_e in the parameters block. In Stan, everything declared there is a sampled parameter and so cannot be assigned to. You can declare it at the beginning of the model block instead to solve the issue.
Thank you so much!
Can you suggest how to solve this issue?
Yes, this is what @simonbrauer was recommending:
model {
  vector[ncvm] sigmas_e;
  int cov_idx = 1;
  for (t in 1:nt) {
    ...
    sigmas_e[cov_idx:(cov_idx + nst[t] * nst[t] - 1)] = to_vector(sigmas_t);
    cov_idx += nst[t] * nst[t];
  }
  ...
}
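(Note that a local variable declared in the model block is not saved in the posterior output; if you want sigmas_e in the draws, declare and assign it in the transformed parameters block instead.)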
I corrected the Stan code, but now it is not converging (Rhat greater than 3):
data {
  int<lower=0> nb;     // Number of observations (non-missing)
  int<lower=0> nt;     // Number of time steps (years)
  int<lower=0> np;     // Number of predictors (tree-ring chronologies)
  int<lower=0> ns;     // Number of streamflow sites
  matrix[nt, np] x;    // Predictor matrix (time x predictors)
  vector[nb] y;        // Response vector (non-missing values)
  matrix[ns, ns] Wp_e; // Prior covariance matrix
  int<lower=1, upper=ns> site[nb];      // Streamflow station indices
  int<lower=1> nst[nt];                 // Number of stations per year
  int<lower=1, upper=nb> start_yrs[nt]; // Start year indices
  int<lower=1, upper=nb> end_yrs[nt];   // End year indices
}
transformed data {
  real<lower=0> df_e = ns + 1; // Degrees of freedom for inverse Wishart
}
parameters {
  vector[ns] alphas;       // Intercepts for each site
  matrix[ns, np] betas;    // Coefficients for predictors
  cov_matrix[ns] sigmas_e; // Covariance matrix for streamflow sites
}
model {
  // Priors
  alphas ~ normal(0, 100);
  to_vector(betas) ~ normal(0, 10);
  sigmas_e ~ inv_wishart(df_e, Wp_e);
  // Loop over time steps (years)
  for (t in 1:nt) {
    int start_idx = start_yrs[t];
    int end_idx = end_yrs[t];
    int obs_count = end_idx - start_idx + 1;
    int nst_s = nst[t];
    // Extract the relevant subset of the covariance matrix for the current year
    matrix[nst_s, nst_s] sigmas_t = sigmas_e[site[start_idx:end_idx], site[start_idx:end_idx]];
    // Cholesky decomposition
    matrix[nst_s, nst_s] L_e = cholesky_decompose(sigmas_t);
    // Build the mean vector for the current year
    vector[obs_count] mu = alphas[site[start_idx:end_idx]]
                           + rows_dot_product(betas[site[start_idx:end_idx]], rep_matrix(x[t], nst_s));
    // Equivalent loop form:
    // vector[obs_count] mu;
    // for (i in start_idx:end_idx) {
    //   int s = site[i]; // Get the station index
    //   mu[i - start_idx + 1] = alphas[s] + dot_product(betas[s], x[t]); // Compute the mean for site s
    // }
    // Multivariate normal likelihood for the current year
    y[start_idx:end_idx] ~ multi_normal_cholesky(mu, L_e);
  }
}
The maximum Rhat is above 3: max(summary(fit)$summary[, "Rhat"]) > 3.
Would you please give me your suggestion?
I'm afraid I don't have time to really dive in and debug your code. Nothing obviously wrong jumps out at me at first glance.
I would try two things, depending on what's possible:
- Start with the simplest possible working model (e.g., no hierarchical priors, no covariance modeling, etc.), get that fitting, then build up incrementally to a more complex model. That way, you will see where things go off the rails; see the minimal sketch after this list.
- Look at the posterior and see which variables are failing. Are you sure, with all of that indexing, that you're putting priors on all the continuous parameters? If not, HMC is going to fail. You can add print("variable A = ", A) type statements to the code to inspect what's happening as the log density is being built up.
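As an illustration of the first suggestion, here is a minimal sketch of a stripped-down starting model. It assumes independent normal errors with a single shared residual scale sigma, a simplification introduced only for illustration, not part of the original model:
data {
  int<lower=0> nb;
  int<lower=0> nt;
  int<lower=0> np;
  int<lower=0> ns;
  matrix[nt, np] x;
  vector[nb] y;
  int<lower=1, upper=ns> site[nb];
  int<lower=1, upper=nb> start_yrs[nt];
  int<lower=1, upper=nb> end_yrs[nt];
}
parameters {
  vector[ns] alphas;
  matrix[ns, np] betas;
  real<lower=0> sigma; // single shared residual scale (simplifying assumption)
}
model {
  alphas ~ normal(0, 10);
  to_vector(betas) ~ normal(0, 10);
  sigma ~ normal(0, 5); // half-normal via the <lower=0> constraint
  for (t in 1:nt) {
    for (i in start_yrs[t]:end_yrs[t]) {
      y[i] ~ normal(alphas[site[i]] + dot_product(betas[site[i]], x[t]), sigma);
    }
  }
}
Once this fits cleanly with good Rhat values, reintroduce the covariance structure one piece at a time.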
Also, the way you're coding the multivariate normal isn't helpful. You can just use a regular multi_normal, which will do the decompositions internally; there's nothing to be gained doing it yourself. It's also going to be hard to do anything efficiently if each observation gets its own covariance matrix, and it can be very hard to fit if they don't share a lot of parameters; it winds up being very prior driven.
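For example, a sketch of that suggestion using the names from the model above: the likelihood in the loop reduces to
// multi_normal factorizes the covariance matrix internally,
// so the explicit cholesky_decompose step can be dropped
y[start_idx:end_idx] ~ multi_normal(mu, sigmas_t);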
Unless you really believe in your Wishart prior, I'd recommend coding with Cholesky factors and using a combination of LKJ priors on the correlations and independent priors on the scales (see the regression chapter of the User's Guide for instructions). But again, I wouldn't do that until I had a simpler version of the model working.
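A minimal sketch of that parameterization, assuming a single ns x ns covariance shared across years (the LKJ shape and the scale prior below are placeholders, not recommendations):
parameters {
  cholesky_factor_corr[ns] L_corr; // Cholesky factor of the correlation matrix
  vector<lower=0>[ns] tau;         // per-site scales
}
model {
  L_corr ~ lkj_corr_cholesky(2);   // shape > 1 shrinks toward independence
  tau ~ student_t(3, 0, 1);        // half-Student-t via the <lower=0> constraint
  {
    // Combine scales and correlations into a covariance Cholesky factor
    matrix[ns, ns] L_e = diag_pre_multiply(tau, L_corr);
    // ... build mu as before, then:
    // y[start_idx:end_idx] ~ multi_normal_cholesky(mu, L_e);
  }
}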
There are probably better ways to do the linear algebra than rows_dot_product along with replicating a bunch of rows. But again, I wouldn't worry too much about efficiency until you have a simpler model fitting well, and then this model fitting well.
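One such simplification, a sketch equivalent to the rows_dot_product line in the model above, is a plain matrix-vector multiply:
// betas[site[...]] is obs_count x np and x[t]' is a length-np column vector,
// so their product is the length-obs_count vector of linear predictors
vector[obs_count] mu = alphas[site[start_idx:end_idx]]
                       + betas[site[start_idx:end_idx]] * x[t]';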
Thank you so much, Bob, for your explanation.