Involving a distance function: "error occurred during calling the sampler; sampling not done"

Hey everyone––I’m having trouble fitting a Stan model using RStan. I feel like the issue is with a function called “weight” that I created in STAN that involves the Euclidian distance. Function that is applied to a random matrix ZZ generated from bivariate normal (function “weight” computes the Euclidian distance of ZZ rows and some other computation).

//Code of the model in  STAN called  "stancode"

 matrix weight(matrix C, int nrows) {
    matrix[nrows, nrows] dd;
    matrix[nrows, nrows] W;
    vector[nrows] Wf;
    for(i in 1:nrows){
      for(j in 1:nrows){
        dd[i,j]=distance (to_vector (C[i,]), to_vector (C[j,])); // Euclidian distance
        dd[i,j]=inv(dd[i,j]);// invers
      dd[i,i] = 0;
      for (k in 1:nrows){
  Wf[k] = sum(dd[k,]);
for (l in 1:nrows){
    return W;


    int<lower=1> n; // number of observations 
     vector[n] y; // dep_data
     int x_p;       // nr of covariates 
     matrix[n,x_p] x; //covariates
    int<lower=1> J; // dimension of observations
    vector[J] Zero; // a vector of Zeros (fixed means of observations)
    corr_matrix[J] I;
parameters {
  vector<lower=0>[J] tau;
    real<lower=0> sigmasq;
    matrix[n,J] ZZ; // latent-observations
     vector[x_p+1]  beta;//  model parameter

transformed parameters{
  cov_matrix[J] Tau; 
  Tau =quad_form_diag(I, tau);


    vector[n] mu;  
    matrix[n,n] W; // prepare data
    W = weight(ZZ, n);//involve weight function 
    vector [n] E;
    E=W *x[,2];
    matrix [n,3] DD;
    DD=append_col(x,E); // prepare data
    mu = DD*beta;  // model

  for (i in 1:n){
       ZZ[i,] ~ multi_normal(Zero, Tau); // generate ZZ
     sigmasq ~ student_t(4,0,5)T[0, ]; //prior variance 
     tau ~  student_t(4,0,5)T[0, ]; ///prior variance 
      for (r in 1:x_p){
  beta[r] ~ normal(0, 10); // prior regression parameters
    for(i in 1:n) y[i] ~ normal(mu[i],sigmasq); //run model



x_p<- ncol(x)

fit ← stan(file=‘stancode.stan’,
data = list(n=n, n1=n1, x_p=x_p, y=y,x=x, J=J,Zero=rep(0, J),I=I), chains = 2, iter = 3000, warmup = 1000)

#Here’s what the error code reads:

Chain 1: Rejecting initial value:
Chain 1: Gradient evaluated at the initial value is not finite.
Chain 1: Stan can’t start sampling from this initial value.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error : Initialization failed.”
[1] “error occurred during calling the sampler; sampling not done”

Also, I assigned some initial values but still got the same error.


Stan’s telling you where it captures an error:

Gradient evaluated at the initial value is not finite.

My guess is that some of the functions are going wrong here. You a use Stan’s print to print out the weight function you generate and E to make sure they’r doing what you want.

The top of the model block should be rewritten as:

model {
    vector[n] mu;  
    matrix[n,n] W = weight(ZZ, n); 
    vector [n] E = W * x[ , 2];
    matrix[n, 3] DD = append_col(x,E);
    vector[n] mu = DD * beta;

This can also go on one line, and it’d be nice to get rid of the rest of the stray vertical space so I can see the whole program at once.

If I is just the identity matrix, you don’t need to create a multinormal distribution—you can just use a normal.

Also, with a multi normal distribution, you don’t want to do it in a loop as it has to solve the covariance matrix each iteration. Instead, make ZZ an array of vectors and call ZZ ~ multi_normal(Zero, Tau). And then Cholesky factor the correlation matrix I and scale that one sided and use multi_normal_cholesky. There’s an example in the regression chapter of the User’s Guide.

You may be having problems because of the truncated Student-t. That is trying to autodiff through the Student-t cdf. The thing is, you don’t need the T[0, ] here because the arguments to student_t are all data and we don’t need the normalization constants for MCMC or optimization or VI.

You also want to vectorize, so that’d be just

beta ~ normal(0, 10);
y ~ normal(mu, sigmasq);

But keep in mind that Stan uses a location/scale parameterization of normal, so that should probably be sigma rather than sigmasq there. What you document as “model” is the likelihood (or what a Bayesian might call the “sampling distribution”). The model consists of both the prior and the likelihood.

1 Like

Thank you, @Bob_Carpenter, for all the advice. Apparently, vectorized forms of parameters or variables are important when working in STAN. Is this for the runtime of the model? Do you have any advice for me that would help to speed up the running time?