Can anyone suggest what I could change in the code below to make it more stable?
Problem: I have n = 1800 units; n_obs = 69 of them fail and n_cens = 1731 are still working (and are therefore right-censored). I have simulated data and have obtained estimates of the true parameter values via maximum likelihood. I would now like to obtain these estimates using a Bayesian approach.
I have tried many times with vague priors, with no success. The warning I keep getting is "x divergent transitions after warmup". I have now placed a very narrow prior on sig (around its true value), but I still get divergent transitions.
I have tried increasing adapt_delta, lowering the stepsize, and increasing max_treedepth, but I still get divergent transitions.
rho, sig1, sig2, beta, eta, mu0, and sigma0 are all unknown parameters; however, I passed them in as data to try to isolate the source of the divergent transitions.
When running the code with rho, sig1, sig2, beta, eta, mu0, and sigma0 (and sig) treated as unknown parameters, it took 4 days to run with control = list(adapt_delta = 0.999, stepsize = 0.001, max_treedepth = 20) and 2000 iterations on one chain. I still had divergent transitions.
I have come to the conclusion (from reading similar posts) that my model must have something fundamentally wrong with it, but I just cannot see what it is. Could it be that I am defining my own matrix and not using cov_matrix or something? The matrix I am entering is positive definite so I doubt that is the case.
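To make that question concrete, here is a minimal sketch (not the model I am running) of the alternative I have in mind: build the 2x2 Cholesky factor of Sigma analytically from sig1, sig2, and rho and let multi_normal_cholesky handle it, rather than filling Sigma by hand and scaling w myself. The surrounding data and parameter declarations are assumed to be the same as in my code below.

model {
  matrix[2,2] L;
  // Analytic Cholesky factor of my Sigma, so that
  // L * L' = [[sig1^2, rho*sig1*sig2], [rho*sig1*sig2, sig2^2]]
  L[1,1] = sig1;
  L[1,2] = 0;
  L[2,1] = rho * sig2;
  L[2,2] = sig2 * sqrt(1 - rho^2);
  // Centered version of my "w2[i] = L * w[i]" construction;
  // here w itself would play the role of w2 downstream
  w ~ multi_normal_cholesky(w_M, L);
  // ... rest of the model unchanged ...
}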
Perhaps the parameterization of the Weibull distribution that I am using is causing this to be unstable? But I have used this parameterization before and it has been fine (though not in a hierarchical model).
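For reference, the parameterization in my likelihood is Stan's shape/scale Weibull with shape 1/sigma0 and scale exp(mu0):

$$
f(t \mid \mu_0, \sigma_0) = \frac{1/\sigma_0}{e^{\mu_0}} \left(\frac{t}{e^{\mu_0}}\right)^{1/\sigma_0 - 1} \exp\!\left[-\left(\frac{t}{e^{\mu_0}}\right)^{1/\sigma_0}\right],
$$

which (if I have it right) is the usual log-location-scale form, i.e. log(T) has a smallest extreme value distribution with location mu0 and scale sigma0.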
I have 1800 units and 1-70 observations per unit; for example, unit 1 has 35 observations and unit 2 has 26. I enter the data in the following form: tt = 1, 2, …, 35, 1, 2, …, 26; w_ind = 1, 1, …, 1, 2, 2, …, 2 (35 ones followed by 26 twos); y_tt = y_1,1, y_1,2, y_1,3, …, y_1,35, y_2,1, …. (I added this paragraph to show that the data are entered correctly and are not the issue.)
data {
  int<lower=1> n;               // Sample size
  int<lower=1> n_cens;          // Right-censored sample size
  int<lower=1> n_obs;           // Failed-unit sample size
  int<lower=1> N;               // Total number of times considered (all times for all units)
  real<lower=0> tt[N];          // All times for all units
  real y_tt[N];                 // All of the use-rates
  int w_ind[N];                 // Indices for w
  vector[2] w_M;                // Mean vector for w
  vector[n_obs] x_DFD;          // Use rate at failure times for the units that failed
  real<lower=0> sig1;
  real<lower=0> sig2;
  real<lower=-1, upper=1> rho;
  real<lower=0> mu0;
  real<lower=0> sigma0;
  real eta;
  real Beta;
  vector[n_obs] u_obs;
  vector[n_cens] u_cens;
  matrix[2,2] I2;               // Covariance matrix for w
}
parameters {
  real<lower=0> sig;
  vector[2] w[n];               // A length-n array of length-2 vectors
}
model {
  // Covariance matrix, Cholesky decomposition, and the mean and variance of the mixed-effects model
  matrix[2,2] Sigma;
  matrix[2,2] L;
  real Mu[N];
  real Sig[N];
  vector[2] w2[n];              // A length-n array of length-2 vectors

  // Covariance matrix
  Sigma[1,1] = sig1^2;
  Sigma[1,2] = rho * sig1 * sig2;
  Sigma[2,1] = rho * sig1 * sig2;
  Sigma[2,2] = sig2^2;

  // Cholesky decomposition
  L = cholesky_decompose(Sigma);
  for (i in 1:n) {
    w2[i] = L * w[i];
  }

  // Parameters used in the mixed-effects model
  for (i in 1:N) {
    Mu[i] = eta + w2[w_ind[i]][1] + w2[w_ind[i]][2] * log(tt[i]);
    Sig[i] = sig1^2 + log(tt[i]) * (2 * rho * sig1 * sig2 + sig2^2 * log(tt[i])) + sig^2;
  }

  // Likelihood
  target += weibull_lpdf(u_obs | 1 / sigma0, exp(mu0));
  target += Beta * x_DFD;
  target += weibull_lccdf(u_cens | 1 / sigma0, exp(mu0));
  target += normal_lpdf(y_tt | Mu, Sig);

  // Priors
  w ~ multi_normal(w_M, I2);
  sig ~ gamma(0.05, 1);
}
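To spell out what the model block computes: for the observation at time t_ij on unit i (with w2_i = L w_i, and writing sig1, sig2, rho, sig as σ1, σ2, ρ, σ),

$$
\mathrm{Mu}_{ij} = \eta + w_{2,i,1} + w_{2,i,2}\,\log t_{ij}, \qquad
\mathrm{Sig}_{ij} = \sigma_1^2 + 2\rho\sigma_1\sigma_2 \log t_{ij} + \sigma_2^2 (\log t_{ij})^2 + \sigma^2,
$$

and these are the Mu and Sig passed to normal_lpdf(y_tt | Mu, Sig) in the likelihood.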
Diagnostic plots for when I ran the full model with all parameters (i.e. when I did not input rho, sig1, sig2, beta, eta, mu0, and sigma0): p.pdf (32.6 KB), p2.pdf (36.7 KB), p3.pdf (37.5 KB).