# Involving a distance function: "error occurred during calling the sampler; sampling not done"

Hey everyone––I’m having trouble fitting a Stan model using RStan. I feel like the issue is with a function called “weight” that I created in STAN that involves the Euclidian distance. Function that is applied to a random matrix ZZ generated from bivariate normal (function “weight” computes the Euclidian distance of ZZ rows and some other computation).

//Code of the model in  STAN called  "stancode"

functions{
//weight
matrix weight(matrix C, int nrows) {
matrix[nrows, nrows] dd;
matrix[nrows, nrows] W;
vector[nrows] Wf;
for(i in 1:nrows){
for(j in 1:nrows){
dd[i,j]=distance (to_vector (C[i,]), to_vector (C[j,])); // Euclidian distance
dd[i,j]=inv(dd[i,j]);// invers
}
dd[i,i] = 0;
for (k in 1:nrows){
Wf[k] = sum(dd[k,]);
}
for (l in 1:nrows){
W[l,]=dd[l,]/Wf[l];
}
}
return W;
}

}

data{
int<lower=1> n; // number of observations
vector[n] y; // dep_data
int x_p;       // nr of covariates
matrix[n,x_p] x; //covariates
int<lower=1> J; // dimension of observations
vector[J] Zero; // a vector of Zeros (fixed means of observations)
corr_matrix[J] I;
}

parameters {
vector<lower=0>[J] tau;
real<lower=0> sigmasq;
matrix[n,J] ZZ; // latent-observations
vector[x_p+1]  beta;//  model parameter
}

transformed parameters{
cov_matrix[J] Tau;

}

model{
vector[n] mu;
matrix[n,n] W; // prepare data
W = weight(ZZ, n);//involve weight function
vector [n] E;
E=W *x[,2];
matrix [n,3] DD;
DD=append_col(x,E); // prepare data
mu = DD*beta;  // model

for (i in 1:n){
ZZ[i,] ~ multi_normal(Zero, Tau); // generate ZZ
}

#prior
sigmasq ~ student_t(4,0,5)T[0, ]; //prior variance
tau ~  student_t(4,0,5)T[0, ]; ///prior variance

for (r in 1:x_p){
beta[r] ~ normal(0, 10); // prior regression parameters
}

#model
for(i in 1:n) y[i] ~ normal(mu[i],sigmasq); //run model

}

# data

J<-2
I<-diag(2)

y<-vector[n,1]
x<-matrix[n,2]
x_p<- ncol(x)

fit ← stan(file=‘stancode.stan’,
data = list(n=n, n1=n1, x_p=x_p, y=y,x=x, J=J,Zero=rep(0, J),I=I), chains = 2, iter = 3000, warmup = 1000)

#Here’s what the error code reads:

Chain 1: Rejecting initial value:
Chain 1: Gradient evaluated at the initial value is not finite.
Chain 1: Stan can’t start sampling from this initial value.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error : Initialization failed.”
[1] “error occurred during calling the sampler; sampling not done”

# Also, I assigned some initial values but still got the same error.

Thanks

Stan’s telling you where it captures an error:

Gradient evaluated at the initial value is not finite.

My guess is that some of the functions are going wrong here. You a use Stan’s print to print out the weight function you generate and E to make sure they’r doing what you want.

The top of the model block should be rewritten as:

model {
vector[n] mu;
matrix[n,n] W = weight(ZZ, n);
vector [n] E = W * x[ , 2];
matrix[n, 3] DD = append_col(x,E);
vector[n] mu = DD * beta;

This can also go on one line, and it’d be nice to get rid of the rest of the stray vertical space so I can see the whole program at once.

If I is just the identity matrix, you don’t need to create a multinormal distribution—you can just use a normal.

Also, with a multi normal distribution, you don’t want to do it in a loop as it has to solve the covariance matrix each iteration. Instead, make ZZ an array of vectors and call ZZ ~ multi_normal(Zero, Tau). And then Cholesky factor the correlation matrix I and scale that one sided and use multi_normal_cholesky. There’s an example in the regression chapter of the User’s Guide.

You may be having problems because of the truncated Student-t. That is trying to autodiff through the Student-t cdf. The thing is, you don’t need the T[0, ] here because the arguments to student_t are all data and we don’t need the normalization constants for MCMC or optimization or VI.

You also want to vectorize, so that’d be just

beta ~ normal(0, 10);
y ~ normal(mu, sigmasq);

But keep in mind that Stan uses a location/scale parameterization of normal, so that should probably be sigma rather than sigmasq there. What you document as “model” is the likelihood (or what a Bayesian might call the “sampling distribution”). The model consists of both the prior and the likelihood.

1 Like

Thank you, @Bob_Carpenter, for all the advice. Apparently, vectorized forms of parameters or variables are important when working in STAN. Is this for the runtime of the model? Do you have any advice for me that would help to speed up the running time?