A matrix of parameters in a Stan integral

The Stan code below works.

functions{
  real cumulative_exposure(real x,       // Function argument
                           real xc,      // Complement of function argument
                                         // on the domain (defined later)
                           real[] theta, // parameters
                           real[] x_r,   // data (real)
                           int[] x_i) { // data (integer)
                      
    real Beta = theta[1];
    real eta = theta[2];                  
    return exp(Beta*(eta + log(x)));
    }
}
data {
  int<lower=1> n; // Sample size
  int<lower=1> n_cens; // Right censored sample size
  int<lower=1> n_obs; // Failed unit sample size
  int<lower=1> N; // Total number of times considered (all times for all units)
  real<lower=0> tt[N]; //All times for all units
  real y_tt[N]; //All of the use-rates
  int w_ind[N]; //Indices for w
  real<lower=0> tt_fail[n_obs]; //Failure times for units that failed
  int w_ind_fail[n_obs]; //Indices for w for units that failed
  real<lower=0> tt_cens[n_cens];
}
transformed data {
  real x_r[0];
  int x_i[0];
}
parameters {
  real <lower = 0> mu0;
  real <lower = 0> sigma0;
  real <lower = 0> sig;
  real <lower = 0> sig1;
  real <lower = 0> sig2;
  real <lower = -1, upper = 1> rho;
  real eta;
  real Beta;
  matrix[2, n] w;
}
model {
  //cumulative exposure for failed units and for censored units
  vector[n_obs] u_obs;
  vector[n_cens] u_cens;
  //Defining covariance matrix, cholesky decomposition and mean and variance of mixed effects model
  matrix[2,2] Sigma;
  matrix[2,2] L;
  real Mu[N];
  vector[n_obs] Mu_DFD;
  matrix[2, n] w2;
  //Covariance matrix
  Sigma[1,1] = sig1^2;
  Sigma[1,2] = rho*sig1*sig2;
  Sigma[2,1] = rho*sig1*sig2;
  Sigma[2,2] = sig2^2;
  //Cholesky decomposition
  L = cholesky_decompose(Sigma);
  w2 = L*w;
  //Parameters used in mixed effects model
  for(i in 1:N){
    Mu[i] = eta + w2[1,w_ind[i]] + w2[2,w_ind[i]]*log(tt[i]);
  }
  //Calculating mu_DFD 
  for(i in 1:n_obs){
    Mu_DFD[i] = eta + w2[1,w_ind_fail[i]] + w2[2,w_ind_fail[i]]*log(tt_fail[i]);
    u_obs[i] = integrate_1d(cumulative_exposure, 0, tt_fail[i], {Beta, eta}, x_r, x_i, 1e-8);
  }
  for(i in 1:n_cens){
    u_cens[i] = integrate_1d(cumulative_exposure, 0, tt_cens[i], {Beta, eta}, x_r, x_i, 1e-8);
  }
  //Likelihood
    target += weibull_lpdf(u_obs| 1/sigma0, exp(mu0));
    target += Beta*Mu_DFD;
    target += weibull_lccdf(u_cens|1/sigma0, exp(mu0));
    target += normal_lpdf(y_tt|Mu, sig);
    // Prior:
    Beta ~ normal(0, 10);
    mu0 ~ normal(0, 10);
    sigma0 ~ normal(0, 10);
    eta ~ normal(0, 10);
    to_vector(w) ~ normal(0, 1);
    sig ~ normal(0, 10);
    sig1 ~ normal(0, 10);
    sig2 ~ normal(0, 10);
    rho ~ normal(0, 10);
}

But I would like to change the return statement in the cumulative_exposure function to

return exp(Beta*(eta + w2[1,w_ind_fail[i]] + w2[2,w_ind_fail[i]]*log(x)));

cf. the definition of Mu[i] in the model block.

In order to include the w's as parameters in the cumulative_exposure function, I attempted a reparameterization of my model, which I would like to avoid because it is the opposite of what is usually recommended: I am going from a non-centered parameterization to a centered parameterization.

I tried the following Stan code

functions{
  real cumulative_exposure(real x,       // Function argument
                           real xc,      // Complement of function argument
                                         // on the domain (defined later)
                           real[] theta, // parameters
                           real[] x_r,   // data (real)
                           int[] x_i) { // data (integer)
                      
    real Beta = theta[1];
    real eta = theta[2];
    real w2_1 = theta[3];
    real w2_2 = theta[4];
    
    return exp(Beta*(eta + w2_1 + w2_2*log(x)));
    }
}
data {
  int<lower=1> n; // Sample size
  int<lower=1> n_cens; // Right censored sample size
  int<lower=1> n_obs; // Failed unit sample size
  int<lower=1> N; // Total number of times considered (all times for all units)
  real<lower=0> tt[N]; //All times for all units
  real y_tt[N]; //All of the use-rates
  int w_ind[N]; //Indices for w
  real<lower=0> tt_fail[n_obs]; //Failure times for units that failed
  int w_ind_fail[n_obs]; //Indices for w for units that failed
  int w_ind_cens[n_cens];
  real<lower=0> tt_cens[n_cens];
  vector[2] w_M;
}
transformed data {
  real x_r[0];
  //int x_i[0];
}
parameters {
  real <lower = 0> mu0;
  real <lower = 0> sigma0;
  real <lower = 0> sig;
  real <lower = 0> sig1;
  real <lower = 0> sig2;
  real <lower = -1, upper = 1> rho;
  real eta;
  real Beta;
  vector[2] w2[n];
}
model {
  //cumulative exposure for failed units and for censored units
  vector[n_obs] u_obs;
  vector[n_cens] u_cens;
  //Defining covariance matrix, cholesky decomposition and mean and variance of mixed effects model
  matrix[2,2] Sigma;
  real Mu[N];
  vector[n_obs] Mu_DFD;
  //Covariance matrix
  Sigma[1,1] = sig1^2;
  Sigma[1,2] = rho*sig1*sig2;
  Sigma[2,1] = rho*sig1*sig2;
  Sigma[2,2] = sig2^2;
  //Parameters used in mixed effects model
  for(i in 1:N){
    Mu[i] = eta + w2[w_ind[i]][1] + w2[w_ind[i]][2]*log(tt[i]);
  }
  //Calculating mu_DFD 
  for(i in 1:n_obs){
    Mu_DFD[i] = eta + w2[w_ind_fail[i]][1] + w2[w_ind_fail[i]][2]*log(tt_fail[i]);
    u_obs[i] = integrate_1d(cumulative_exposure, 0, tt_fail[i],
    {Beta, eta, w2[w_ind_fail[i]][1], w2[w_ind_fail[i]][2]},
    x_r, w_ind_fail[i], 1e-8);
  }
  for(i in 1:n_cens){
    u_cens[i] = integrate_1d(cumulative_exposure, 0, tt_cens[i],
    {Beta, eta, w2[w_ind_cens[i]][1], w2[w_ind_cens[i]][2]},
    x_r, w_ind_cens[i], 1e-8);
  }
  //Likelihood
    target += weibull_lpdf(u_obs| 1/sigma0, exp(mu0));
    target += Beta*Mu_DFD;
    target += weibull_lccdf(u_cens|1/sigma0, exp(mu0));
    target += normal_lpdf(y_tt|Mu, sig);
    // Prior:
    Beta ~ normal(0, 10);
    mu0 ~ normal(0, 10);
    sigma0 ~ normal(0, 10);
    eta ~ normal(0, 10);
    w2 ~ multi_normal(w_M, Sigma);
    sig ~ normal(0, 10);
    sig1 ~ normal(0, 10);
    sig2 ~ normal(0, 10);
    rho ~ normal(0, 10);
}

The idea is that w2_1 and w2_2 in the cumulative_exposure function are passed in as w2[w_ind_fail[i]][1] and w2[w_ind_fail[i]][2] (see the for loops in the model block).

From what I can diagnose, the issue seems to be that w_ind_fail[i] and w_ind_cens[i] are int rather than int[]. I am not sure what to do now, as I think I am already overcomplicating this. I would also like to stay with the non-centered parameterization, but in that code the w2 are not parameters. I could put w2 in the transformed parameters block, but then I would have to put Sigma there too, and I have found that having a covariance matrix in the transformed parameters block increases the run time drastically.

Error message:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
sixth argument to integrate_1d, the integer data, must have type int[]; found type = int.

To help clarify, I have a group of units, and I would like to calculate the integral

\int^{T_i}_0 \exp\{\beta \times [\eta + w_{0i} + w_{1i}\log(t)]\}\,dt

for each unit. Here, tt_fail and tt_cens are vectors containing the upper limits of the integrals, and w_ind_fail and w_ind_cens are vectors of indices that subset the w2 matrix (or array, depending on which version of the code) for each unit.

Does anyone know a simpler way that I can achieve this goal?

You can turn an int into an int[] by wrapping it with curly braces.

So { 1 } is an int[] (a one-element integer array), whereas plain 1 is an int. (See the "Array expressions" section of the reference manual, https://mc-stan.org/docs/2_21/reference-manual/vector-matrix-and-array-expressions.html; it is not a widely discussed feature.)
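
Concretely, the failing call would become something like this (a sketch against your second model):

u_obs[i] = integrate_1d(cumulative_exposure, 0, tt_fail[i],
                        {Beta, eta, w2[w_ind_fail[i]][1], w2[w_ind_fail[i]][2]},
                        x_r, { w_ind_fail[i] }, 1e-8);
// { w_ind_fail[i] } has type int[1], which matches the int[] data slot;
// w_ind_fail[i] on its own is a bare int, hence the parser error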

Is that integral with respect to t? If so I think you can just do it in closed form:

\begin{aligned}
\int_0^{T} e^{\beta [\eta + w_0 + w_1 \log(t)]}\,dt
&= e^{\beta [\eta + w_0]} \int_0^{T} e^{\beta w_1 \log(t)}\,dt \\
&= e^{\beta [\eta + w_0]} \int_0^{T} e^{\log(t^{\beta w_1})}\,dt \\
&= e^{\beta [\eta + w_0]} \int_0^{T} t^{\beta w_1}\,dt \\
&= e^{\beta [\eta + w_0]} \frac{1}{\beta w_1 + 1} t^{\beta w_1 + 1} \bigg\rvert^{t = T}_{t = 0}
\end{aligned}

So I guess as long as \beta w_1 > -1 this exists?
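
If that holds, you could compute u in closed form and skip integrate_1d entirely. A sketch in the model block, reusing the names from your first model and assuming Beta * w2[2, j] + 1 > 0 for every unit j (bw1p1 is just a helper name):

for (i in 1:n_obs) {
  // closed form of int_0^T exp(Beta*(eta + w0 + w1*log(t))) dt
  real bw1p1 = Beta * w2[2, w_ind_fail[i]] + 1;
  u_obs[i] = exp(Beta * (eta + w2[1, w_ind_fail[i]]))
             * tt_fail[i]^bw1p1 / bw1p1;
}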

I will try this. I was about to ask about this issue because I tried writing

int[1] w_ind_fail[n_obs];
int[1] w_ind_cens[n_cens];

and this did not work.

Yes! It is with respect to time. I will try writing it analytically.

Out of curiosity, suppose the integral in question could not be written down analytically, and also suppose one of the parameters you are interested in is a matrix. Would I have to put w2 in the transformed parameters block and treat the parameters separately, i.e. pass w2[i][1] and w2[i][2] into the integrate_1d function (and also use the curly braces to pass an int[] rather than an int)?

To get it into the integrate_1d function you would have to flatten it and append it to whatever other parameters you’re passing in through the theta array. It’s really annoying.

Something like this:

{ w[1, 1], w[1, 2], w[2, 1], w[2, 2] }

That’d be an array that contains the elements of w. You could reconstruct w inside cumulative_exposure with something like:

matrix[2, 2] w = [ [theta[1], theta[2]],
                   [theta[3], theta[4]] ];
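
If you were working with the whole 2 x n matrix w2 from the first model, you could also let Stan do the flattening, something like this (a sketch; theta_full is a made-up name, and to_array_1d flattens a matrix in column-major order):

// call site: scalar parameters first, then the flattened matrix
real theta_full[2 + 2 * n] = append_array({Beta, eta}, to_array_1d(w2));
// inside the integrand, element w2[r, c] then sits at theta[2 + 2 * (c - 1) + r]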

I am expecting some of the integrals to come out negative, and so the only concern for me would be when \beta w_1 = -1, where the denominator in the closed form is zero.

If my understanding of Stan is correct, if I have n iterations after warm-up I will have n samples of my parameter vectors, and it will be unlikely that \beta w_1 is exactly -1 for any of these samples, so it should be fine to calculate the integral (even though it has a singularity)? Although values of \beta w_1 \approx -1 could be an issue.

Update: The suggestion of using {} around the integers worked.

functions{
  real cumulative_exposure(real x,       // Function argument
                           real xc,      // Complement of function argument
                                         // on the domain (defined later)
                           real[] theta, // parameters
                           real[] x_r,   // data (real)
                           int[] x_i) { // data (integer)
                      
    real Beta = theta[1];
    real eta = theta[2];
    real w2_1 = theta[3];
    real w2_2 = theta[4];
    
    return exp(Beta*(eta + w2_1 + w2_2*log(x)));
    }
}
data {
  int<lower=1> n; // Sample size
  int<lower=1> n_cens; // Right censored sample size
  int<lower=1> n_obs; // Failed unit sample size
  int<lower=1> N; // Total number of times considered (all times for all units)
  real<lower=0> tt[N]; //All times for all units
  real y_tt[N]; //All of the use-rates
  int w_ind[N]; //Indices for w
  real<lower=0> tt_fail[n_obs]; //Failure times for units that failed
  int w_ind_fail[n_obs]; //Indices for w for units that failed
  int w_ind_cens[n_cens];
  real<lower=0> tt_cens[n_cens];
  vector[2] w_M;
}
transformed data {
  real x_r[0];
}
parameters {
  real <lower = 0> mu0;
  real <lower = 0> sigma0;
  real <lower = 0> sig;
  real <lower = 0> sig1;
  real <lower = 0> sig2;
  real <lower = -1, upper = 1> rho;
  real eta;
  real Beta;
  vector[2] w2[n];
}
model {
  //cumulative exposure for failed units and for censored units
  vector[n_obs] u_obs;
  vector[n_cens] u_cens;
  //Defining covariance matrix, cholesky decomposition and mean and variance of mixed effects model
  matrix[2,2] Sigma;
  real Mu[N];
  vector[n_obs] Mu_DFD;
  //Covariance matrix
  Sigma[1,1] = sig1^2;
  Sigma[1,2] = rho*sig1*sig2;
  Sigma[2,1] = rho*sig1*sig2;
  Sigma[2,2] = sig2^2;
  //Parameters used in mixed effects model
  for(i in 1:N){
    Mu[i] = eta + w2[w_ind[i]][1] + w2[w_ind[i]][2]*log(tt[i]);
  }
  //Calculating mu_DFD 
  for(i in 1:n_obs){
    Mu_DFD[i] = eta + w2[w_ind_fail[i]][1] + w2[w_ind_fail[i]][2]*log(tt_fail[i]);
    u_obs[i] = integrate_1d(cumulative_exposure, 0, tt_fail[i],
    {Beta, eta, w2[w_ind_fail[i]][1], w2[w_ind_fail[i]][2]},
    x_r, {w_ind_fail[i]}, 1e-8);
  }
  for(i in 1:n_cens){
    u_cens[i] = integrate_1d(cumulative_exposure, 0, tt_cens[i],
    {Beta, eta, w2[w_ind_cens[i]][1], w2[w_ind_cens[i]][2]},
    x_r, {w_ind_cens[i]}, 1e-8);
  }
  //Likelihood
    target += weibull_lpdf(u_obs| 1/sigma0, exp(mu0));
    target += Beta*Mu_DFD;
    target += weibull_lccdf(u_cens|1/sigma0, exp(mu0));
    target += normal_lpdf(y_tt|Mu, sig);
    // Prior:
    Beta ~ normal(0, 10);
    mu0 ~ normal(0, 10);
    sigma0 ~ normal(0, 10);
    eta ~ normal(0, 10);
    w2 ~ multi_normal(w_M, Sigma);
    sig ~ normal(0, 10);
    sig1 ~ normal(0, 10);
    sig2 ~ normal(0, 10);
    rho ~ normal(0, 10);
}

But, I receive the following error messages:

SAMPLING FOR MODEL 'joint_model_int' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: integrate: error estimate of integral 174455 exceeds the given relative tolerance times norm of integral  (in 'model320832e7666f_joint_model_int' at line 83)

Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: Error in function tanh_sinh<double>::integrate: The tanh_sinh quadrature evaluated your function at a singular point and got inf. Please narrow the bounds of integration or check your function for singularities.  (in 'model320832e7666f_joint_model_int' at line 83)

[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                                                                                                                                                                                                                     
[2] "  Exception: Error in function tanh_sinh<double>::integrate: The tanh_sinh quadrature evaluated your function at a singular point and got inf. Please narrow the bounds of integration or check your function for singularities.  (in 'model320832e7666f_joint_model_int' at line 83)"
[3] "In addition: Warning message:"                                                                                                                                                                                                                                                        
[4] "In readLines(file, warn = TRUE) :"                                                                                                                                                                                                                                                    

This error:

The tanh_sinh quadrature evaluated your function at a singular point and got inf. Please narrow the bounds of integration or check your function for singularities.

is the one to pay attention to. That's just the integral failing to compute. It could be because the integral doesn't exist, or it could be a numeric issue.

Either way, probably the thing to do is figure out exactly which input values are going into the specific integral that is failing.

You can do this by printing all the values you're passing to integrate_1d and then looking at the output. The values printed immediately before the error are presumably the ones that led to it.
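
Something like this right before each call (a sketch against your loop):

print("i = ", i, " tt_fail = ", tt_fail[i], " Beta = ", Beta, " eta = ", eta,
      " w = ", w2[w_ind_fail[i]]);
u_obs[i] = integrate_1d(cumulative_exposure, 0, tt_fail[i],
                        {Beta, eta, w2[w_ind_fail[i]][1], w2[w_ind_fail[i]][2]},
                        x_r, {w_ind_fail[i]}, 1e-8);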

The problem at the moment is that the script doesn't get to the point of sampling. I tried adding print commands to the model block, but nothing is printed; I just get the same error.

The code runs if I restrict the range of Beta and the w's. For example,

real<lower = 1> Beta;
vector<lower = 1>[2] w2[n];

The singularity occurs when \beta \times w_1 = -1. So perhaps the integrand is being evaluated at the singularity straight away when I do not constrain the values.

If I restrict only one of the parameters, then I receive the mentioned error message. If I restrict w2 and not Beta, but start the initial value of Beta at 15, then the code seems to run fine. But, if I start the initial value of Beta at 1.5 (the true value of Beta) I receive the mentioned error message.

Oh right, sorry, I think Stan might just eat all the print statements before sampling right now.

I'd assume it's when \beta w_1 < -1. You can see that you can't evaluate the closed-form integral at t = 0 in that case. Does \beta w_1 < -1 have any significance in your model?

Ah yes, I just noticed. Maybe I could change the lower limit to something slightly bigger than zero, or perhaps a change of variables could help here.

I do not think so. I think it is safe to say there is no significance.

Then I guess we gotta think in terms of that integral. What’s it predicting? Why is it e^something? Why is it log(t) and not t? What are the units on w_0 and w_1, etc.

I tried changing the lower limit of the integral, and that allows the code to run fine. If this integral is a good approximation of the actual integral (with lower limit equal to zero), then this could be a solution.

If I am not mistaken, the approximation error will be proportional to b^{\beta w_1 + 1}, where b is the new lower limit.
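
Writing out the dropped piece using the closed form from earlier (and assuming \beta w_1 + 1 > 0 so that it is finite):

\int^{b}_0 e^{\beta(\eta + w_0)}\, t^{\beta w_1}\,dt = \frac{e^{\beta(\eta + w_0)}}{\beta w_1 + 1}\, b^{\beta w_1 + 1}.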

The model is

X_i(t_{ij}) = \eta + Z_i(t_{ij})w_i + \epsilon_{ij},
where \eta is the mean, Z_i(t_{ij}) = [1, \log(t_{ij})], w_i = (w_{0i}, w_{1i})' \sim N(0,\Sigma_{w}), \epsilon_{ij} \sim N(0, \sigma^2), and

\Sigma_{w} = \begin{pmatrix} \sigma^2_1& \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma^2_2 \end{pmatrix}.

so the w’s are just random effects. The relationship between X_i and t_{ij} is exponential, and so \log(t) is used.

The other part of the model is

u = \int^{T_i}_0 \exp\{\beta \times (\eta + Z_i(t)w_i)\}\,dt. The larger the value of u, the more likely the unit is to fail. \beta is an acceleration coefficient which controls how quickly u grows.

No it won’t be. The issue is the integral goes to infinity if you include that little bit.

This is some sorta hazard model? I don’t think it’s avoidable. I think in this case we’ve got to figure out what \beta w_1 < -1 means. If the u blows up, does that mean failure is guaranteed? Is that what \beta w_1 < -1 implies?

Yes, it is.

Yes. u is assumed to follow a Weibull distribution. Once u is large enough the probability of failure will be 1.

I am not so sure. I will think about this. But, on first thought, I would have thought not.

In the problem I am considering, if w_1 = -2 or so and \beta = 1.5 (its true value), then \beta w_1 < -1 and the integral diverges. But a negative w_1 means that the unit is being used less often over time, so this unit should be less likely to fail than a unit with a higher value of w_1.

How does Stan perform integration?

For the integral:

\int^T_0 t^{\beta w_1}dt

If I choose n iterations, will it evaluate the integral for each pair (\beta, w_1) and then return the mean of the n integrals?

Also, how does Stan actually evaluate the integrand?

The reason I ask is that I am curious whether Stan ever evaluates the integrand at t=0.

For the integral

\int^T_0 t^{\beta w_1}dt

I enter the lower and upper limits, but I have n samples of \beta and w_1. Does Stan sample t from \text{U}(0,T), and hence have n vectors (\beta, w_1, t), and then perform Monte Carlo integration?

So Stan (although it is saying 1D integration) is actually evaluating

\int_{W}\int_{B}\int^T_0 t^{\beta w_1}\,dt\,d\beta\,dw_1,

where B and W are the parameter spaces for \beta and W, respectively.

If so, then the probability that t=0 is drawn should be zero and the singularities should not be an issue. But, I am not sure what Stan is doing behind the scenes.

Update: Stan appears to be able to take transformed parameters (from the transformed parameter block and from the model block) as inputs to integrate_1d. This allowed me to use a non-centered parameterization.

Warning messages:
2: There were 184 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
3: There were 66 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded 
4: Examine the pairs() plot to diagnose sampling problems
 
5: The largest R-hat is 11.14, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
6: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
7: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess 

I seem to have received every warning message that you can expect from Stan.

functions{
  real cumulative_exposure(real x,       // Function argument
                           real xc,      // Complement of function argument
                                         // on the domain (defined later)
                           real[] theta, // parameters
                           real[] x_r,   // data (real)
                           int[] x_i) { // data (integer)
                      
    real Beta = theta[1];
    real eta = theta[2];
    real w2_1 = theta[3];
    real w2_2 = theta[4];
    
    return exp(Beta*(eta + w2_1 + w2_2*log(x)));
    }
}
data {
  int<lower=1> n; // Sample size
  int<lower=1> n_cens; // Right censored sample size
  int<lower=1> n_obs; // Failed unit sample size
  int<lower=1> N; // Total number of times considered (all times for all units)
  real<lower=0> tt[N]; //All times for all units
  real y_tt[N]; //All of the use-rates
  int w_ind[N]; //Indices for w
  real<lower=0> tt_fail[n_obs]; //Failure times for units that failed
  int w_ind_fail[n_obs]; //Indices for w for units that failed
  int w_ind_cens[n_cens];
  real<lower=0> tt_cens[n_cens];
}
transformed data {
  real x_r[0];
}
parameters {
  real <lower = 0> mu0;
  real <lower = 0> sigma0;
  real <lower = 0> sig;
  real <lower = 0> sig1;
  real <lower = 0> sig2;
  real <lower = -1, upper = 1> rho;
  real eta;
  real Beta;
  matrix[2, n] w;
}
model {
  //cumulative exposure for failed units and for censored units
  vector[n_obs] u_obs;
  vector[n_cens] u_cens;
  //Defining covariance matrix, cholesky decomposition and mean and variance of mixed effects model
  matrix[2,2] Sigma;
  real Mu[N];
  vector[n_obs] Mu_DFD;
  matrix[2,2] L;
  matrix[2, n] w2;
  //Covariance matrix
  Sigma[1,1] = sig1^2;
  Sigma[1,2] = rho*sig1*sig2;
  Sigma[2,1] = rho*sig1*sig2;
  Sigma[2,2] = sig2^2;
  //Cholesky decomposition
  L = cholesky_decompose(Sigma);
  w2 = L*w;
  //Parameters used in mixed effects model
  for(i in 1:N){
    Mu[i] = eta + w2[1,w_ind[i]] + w2[2,w_ind[i]]*log(tt[i]);
  }
  //Calculating mu_DFD 
  for(i in 1:n_obs){
    Mu_DFD[i] = eta + w2[1,w_ind_fail[i]] + w2[2,w_ind_fail[i]]*log(tt_fail[i]);
    u_obs[i] = integrate_1d(cumulative_exposure, 0.01, tt_fail[i],
    {Beta, eta, w2[1,w_ind_fail[i]], w2[2,w_ind_fail[i]]},
    x_r, {w_ind_fail[i]}, 1e-8);
  }
  for(i in 1:n_cens){
    u_cens[i] = integrate_1d(cumulative_exposure, 0.01, tt_cens[i],
    {Beta, eta, w2[1,w_ind_cens[i]], w2[2,w_ind_cens[i]]},
    x_r, {w_ind_cens[i]}, 1e-8);
  }
  //Likelihood
    target += weibull_lpdf(u_obs| 1/sigma0, exp(mu0));
    target += Beta*Mu_DFD;
    target += weibull_lccdf(u_cens|1/sigma0, exp(mu0));
    target += normal_lpdf(y_tt|Mu, sig);
    // Prior:
    Beta ~ normal(0, 10);
    mu0 ~ normal(0, 10);
    sigma0 ~ normal(0, 10);
    eta ~ normal(0, 10);
    to_vector(w) ~ normal(0, 1);
    sig ~ normal(0, 10);
    sig1 ~ normal(0, 10);
    sig2 ~ normal(0, 10);
    rho ~ normal(0, 10);
}

The code took over 9 hours to run with 500 iterations. I have a feeling that increasing adapt_delta and max_treedepth will still produce the same warnings (just with fewer divergent transitions and fewer transitions exceeding the maximum treedepth).

Changing the lower limit from 0 to 0.01 allows the Stan code to run, and the error introduced by doing this is:

\frac{e^{\beta[\eta + w_0]}}{\beta w_1 + 1}0.01^{\beta w_1 + 1},

which is very small for the true parameter values.

There is something in the Stan user guide about using xc to avoid precision loss in definite integrals. I could perhaps use this when t gets close to the lower limit (0.01).

I could also approximate the integral in Stan by using the trapezium rule (or some other technique). I tried this and had to stop my code because it was taking too long.
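
For reference, a fixed-grid version would look something like this in the functions block (a sketch with a made-up helper name and K grid points, not the code I actually timed):

real trap_exposure(real Beta, real eta, real w0, real w1,
                   real lo, real T, int K) {
  // trapezium rule on K equally spaced points between lo and T
  real h = (T - lo) / (K - 1);
  real s = 0;
  for (k in 1:K) {
    real t = lo + (k - 1) * h;
    real f = exp(Beta * (eta + w0 + w1 * log(t)));
    s += (k == 1 || k == K) ? 0.5 * f : f;
  }
  return h * s;
}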

Those are my ideas. But I am curious to know what could be causing all of the warning messages. It seems that Stan is no longer affected by a singularity (there is only one possible singularity left now that I have changed the lower limit); Stan would have stopped itself if it were. For this reason I do not think reformulating the statistical model (so that the integral has no singularities) would help. I cannot reparameterize the Stan code any further, either.

Haha, it seems you pretty much did.

Why use the integrate_1d functionality if we know the integral in closed form?

I am still skeptical that removing the singularity in this way is a good idea. What does it mean to cut off [0, 0.01]? I think the argument here is that the integral on [0.01, upper limit] is going to be a good approximation to the one on [0.0, upper limit], which is not true when \beta w_1 < -1, because the integral over [0.0, 0.01] is then infinite. In that case [0.0, 0.01] is where most of the stuff is, so cutting it off is a bad approximation.

But you're right that we really don't want this integral to go to infinity, because that implies there is no uncertainty in our outcome, and (I believe) we don't really believe that! So somehow there is a problem with the model and we're using this integral limit as a way to work around it? We're somehow putting a limit on the hazard function? I'd rather constrain the parameters, but if that isn't possible then I guess we gotta do this.

I have just attempted to do this. I find the following error:

Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1: 
Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

I have tried setting the initial values to the true values, but the same errors are shown.

The issue is that some units that fail very quickly have very high values of u. This means that \text{Pr}(\text{Unit Fails}) \rightarrow 1, but some of these u values have close to zero density, and hence their log-density tends to minus infinity.
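
For reference, Stan's Weibull log density with shape \alpha = 1/\sigma_0 and scale \sigma = e^{\mu_0} is

\text{weibull\_lpdf}(u \mid \alpha, \sigma) = \log\alpha - \log\sigma + (\alpha - 1)(\log u - \log\sigma) - (u/\sigma)^{\alpha},

so the -(u/\sigma)^{\alpha} term sends the log density to -\infty as u grows, which is why these initial values are rejected.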

I can understand why Stan does not like this, but a large u value is possible, meaning that the probability of failure is certain.

Is there some technique to get around this?