Hi, I am working on a clustering problem where one of the components in my model is a Euclidean distance matrix.
In its simplest form, the Stan code looks like this:
data {
  int<lower=0> N; // Dimension of the data object
  int d; // Dimension of latent space
  int Y[N,N]; // Input sociomatrix
  real<lower=0> z_prior_sd;
}
transformed data {
  vector[d] zeros;
  vector[d] ones;
  vector[N] zeros2;
  vector[N] ones2;
  real gam = -1;
  ones = rep_vector(1, d);
  zeros = rep_vector(0, d);
  ones2 = rep_vector(1, N);
  zeros2 = rep_vector(0, N);
}
parameters {
  real alpha;
  vector[d] z[N];
}
model {
  for (i in 1:N) {
    to_vector(z[i]) ~ normal(0, z_prior_sd);
  }
  for (i in 1:N) {
    for (j in 1:N) {
      Y[i,j] ~ bernoulli(Phi(alpha + gam * distance(to_vector(z[i]), to_vector(z[j]))));
    }
  }
  alpha ~ normal(0, 10);
}
When I run the sampler, it keeps throwing errors along the lines of "failed to initialize" and "gradient evaluated at initial value is not finite".
The code above is a very watered-down version of the full model, and I have been able to identify that the issue lies with the distance(z[i],z[j]) component. I am not sure exactly what the problem is. Is there something fundamentally wrong with how I am defining this component, or is it something else?
The idea here is that actors can be attributed as neighbors based on their positions in an unobserved latent space (represented by z_i). Here d=2 and the latent space is Euclidean.
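In equation form, the likelihood and priors in the code above can be written as follows. This is only a restatement of the code with gam fixed at -1; the symbol \sigma_z is my shorthand for z_prior_sd and does not appear in the code.

$$
P(Y_{ij} = 1 \mid \alpha, z) = \Phi\bigl(\alpha - \lVert z_i - z_j \rVert_2\bigr), \qquad
z_i \sim \mathcal{N}(0, \sigma_z^2 I_d), \qquad
\alpha \sim \mathcal{N}(0, 10^2)
$$

As written, the loops run over every pair i, j in 1..N, including i = j.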
Any insight is appreciated. Thanks in advance.
Just a quick note that when you have arrays of vectors (i.e., vector[d] z[N]), you don’t need to call to_vector when you index them, as they already return a vector. For example:
Phi(alpha + gam * distance(z[i], z[j]))
As for the initialisation/gradient errors, it’s a bit hard to debug without data, but there’s a good chance that the issue is coming from the call to Phi. The Phi function is numerically a bit unstable: it underflows to 0 for inputs smaller than roughly -38 and overflows to 1 for inputs greater than roughly 8. You can try using the Phi_approx function instead, which is an approximation that is a bit more robust.
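As a rough sketch of that swap, assuming the same data and parameters as in the question: only the likelihood line changes, and gam = -1 is folded in as a minus sign. The unused transformed data block is dropped here for brevity, and the array declarations keep the older syntax used in the question.

```stan
data {
  int<lower=0> N;            // number of actors
  int d;                     // dimension of the latent space
  int Y[N, N];               // input sociomatrix
  real<lower=0> z_prior_sd;  // prior sd of the latent positions
}
parameters {
  real alpha;
  vector[d] z[N];
}
model {
  alpha ~ normal(0, 10);
  for (i in 1:N) {
    z[i] ~ normal(0, z_prior_sd);
  }
  for (i in 1:N) {
    for (j in 1:N) {
      // Phi_approx replaces Phi; the linear predictor is unchanged
      Y[i, j] ~ bernoulli(Phi_approx(alpha - distance(z[i], z[j])));
    }
  }
}
```

Phi_approx is a logistic-based approximation to Phi, so estimates may differ slightly from the exact probit version.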
Additionally, a good way to debug initialisation issues is to print the input values:
for (i in 1:N) {
  for (j in 1:N) {
    real p = alpha + gam * distance(z[i], z[j]);
    print(p);
    real Phi_p = Phi(p);
    print(Phi_p);
    Y[i,j] ~ bernoulli(Phi_p);
  }
}
This will give you an idea of what values are being passed to the functions here, and whether they’re what you’d expect.
Thanks @andrjohns for the suggestions. I printed the values at each step to check that they are what I would expect, and they seem okay. I also tried the Phi_approx function call, but the same issue persists.
I have attached the pseudo data that I am working with.
Sociomatrix.csv (8.8 KB)
Here N=67, d=2, z_prior_sd=25.
This is the error message that I get:
Chain 1: Rejecting initial value:
Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
Chain 1: Stan can't start sampling from this initial value.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[1] "error occurred during calling the sampler; sampling not done"
I am just trying to understand whether this error is the result of some faulty lines of code or of something fundamentally wrong with the way this model has been defined.
Can you try with init = 0? Sometimes the initial values can fall outside the range that Phi/Phi_approx is defined to work in, which causes the failure.
Yeah, that is what I did, and it seems to have solved the issue. I also changed the model likelihood to bernoulli_logit and used init=0 in the sampler call statement.
Thanks for your help with this.
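For anyone landing here later, the change described above can be sketched roughly as follows. This is a minimal reconstruction under assumptions, not the full model: it reuses the data and parameter declarations from the first post (in the older array syntax used there) and only swaps the likelihood. The init=0 setting is passed in the sampler call, not in the Stan program.

```stan
data {
  int<lower=0> N;            // number of actors
  int d;                     // dimension of the latent space
  int Y[N, N];               // input sociomatrix
  real<lower=0> z_prior_sd;  // prior sd of the latent positions
}
parameters {
  real alpha;
  vector[d] z[N];
}
model {
  alpha ~ normal(0, 10);
  for (i in 1:N) {
    z[i] ~ normal(0, z_prior_sd);
  }
  for (i in 1:N) {
    for (j in 1:N) {
      // bernoulli_logit takes the linear predictor directly,
      // so there is no explicit call to Phi or Phi_approx
      Y[i, j] ~ bernoulli_logit(alpha - distance(z[i], z[j]));
    }
  }
}
```

Note that this replaces the probit link with a logit link, so alpha is on a different scale than in the original Phi-based model.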