Fitting a Poisson Latent Space Model for networks

Hi! I’m trying to fit the Latent Space Model (LSM) for an undirected weighted network.

This is the original paper, discussing the model for unweighted networks https://sites.stat.washington.edu/raftery/Research/PDF/hoff2002.pdf

The model, basically, is y_ij ~ Poisson(alpha - distance_ij) i.e. each edge weight is modelled as a Poisson of parameter “some intercept minus the distance between the two extremities of the edge”. I use the squared distance because it’s faster.

The model is not identifiable for the positions in the latent space i.e. unvariant under rotation, translation, reflection. What I do (not included here) is do a Procrustes transform after on the positions. That part works fine.

In the generated quantities, I calculate distances between all positions. Those distances, however, should be identifiable.

My problem is: when the data is not too noisy, the distances are identifiable and that’s good. But when the data is noisy, there are problems in identifying even the distances.

Below is the stan model. Below the stan model are two examples with fake data. In the first example, the data is not too noisy. The distances (generated quantities) converge. In the second example, when the data is noisy, the chains do not converge.

In a third and final example, at the end, I use my real data and nothing converges.

data {
  int<lower=0> N; // n edges 
  int<lower=0> M; // n nodes
  int<lower=0> D; // D=2
  array[N] int<lower=0> y; // edge weight
  array[N] int<lower=1> from; // node i
  array[N] int<lower=1> to; // node j
}
parameters {
  real alpha; // intercept
  matrix[M, D] K; // positions
}
transformed parameters {
  array[N] real mu;
    for (n in 1:N){
      mu[n] = exp(alpha-squared_distance(K[from[n],],K[to[n],]));
  }
  
}
model {
  // priors
  alpha ~ normal(1,3);
  to_vector(K) ~ normal(0,3);
  // likelihood
  for (n in 1:N){
    y[n] ~ poisson(mu[n]);
  }
  
}
generated quantities{

vector[N] pred;
vector[N] log_lik;
vector[N] distances;

  for (n in 1:N){

    pred[n] = poisson_rng(mu[n]);
    log_lik[n] = poisson_lpmf(y[n] | mu[n]);
    distances[n] = squared_distance(K[from[n],],K[to[n],]);
  
  }
}

Making fake data.

library(tidyverse)
library(rstan)

n <- 20
set.seed(222)
x <- rnorm(n,0,1)
y <- rnorm(n,0,1)
fake_df <- tibble(x=x,y=y,id=1:n)
mymat <- matrix(rep(0,n*n),ncol=n)
rownames(mymat) <- 1:n
colnames(mymat) <- 1:n

diag(mymat) <- NA
mymat[upper.tri(mymat)] <- NA

mymat <- as.data.frame(mymat) %>%
  mutate(from=row.names(.)) %>%
  gather(to,value,-from) %>%
  filter(from!=to) %>% filter(complete.cases(.)) %>%
  select(-value)
euclidean <- function(a, b) sqrt(sum((a - b)^2))
mymat$edist <- NA
for (i in 1:nrow(mymat)){
  edist <- euclidean(unlist(fake_df[as.numeric(mymat$from[i]),1:2]),
            unlist(fake_df[as.numeric(mymat$to[i]) ,1:2]))
  mymat$edist[i] <- edist
  mymat$valuenottoonoisy[i] <- rpois(1,exp(3-edist+rnorm(1,0,0.1)))
  mymat$valuenoisy[i] <- rpois(1,exp(3-edist+rnorm(1,0,1.25)))
}
words <- as.character(1:n)
data <- mymat 
data$from_n <- as.numeric(factor(data$from,words))
data$to_n <- as.numeric(factor(data$to,words))

plot(data$edist,data$valuenottoonoisy)
plot(data$edist,data$valuenoisy)

stan_data <- list(N=nrow(data),
                  M=length(unique(c(data$from,data$to))),
                  D=2,
                  y=data$valuenottoonoisy,
                  from=data$from_n,
                  to=data$to_n)

Fitting the data to the not too noisy data.

model <- stan_model(file = "model.stan")
fit <- sampling(model,data=stan_data, chains = 4, cores = 4, iter = 2000,
                control=list(max_treedepth=15))

as.data.frame(summary(fit)$summary) %>% 
  filter(grepl("distance",row.names(.))) %>% pull(Rhat) %>%
  summary()

Min. 1st Qu. Median Mean 3rd Qu. Max.
0.9991 0.9997 0.9999 1.0000 1.0003 1.0024

That’s good.

stan_data$y <- data$valuenoisy
fit <- sampling(model,data=stan_data, chains = 4, cores = 4, iter = 2000,
                control=list(max_treedepth=15))

as.data.frame(summary(fit)$summary) %>% 
  filter(grepl("distance",row.names(.))) %>% pull(Rhat) %>%
  summary()

Min. 1st Qu. Median Mean 3rd Qu. Max.
0.9998 1.0604 1.2471 1.5819 1.7158 8.9999

That’s very bad.

stan_data <- read_rds("https://github.com/justinsavoie/personal_website/raw/main/real_data.rds")
fit <- sampling(model,data=stan_data, chains = 4, cores = 4, iter = 2000,
                control=list(max_treedepth=15))

as.data.frame(summary(fit)$summary) %>% 
  filter(grepl("distance",row.names(.))) %>% pull(Rhat) %>%
  summary()

Min. 1st Qu. Median Mean 3rd Qu. Max.
0.9992 1.0334 1.2037 1.7067 1.8082 17.6371

That’s very bad, also.

I wondered if anyone has advice concerning such a latent space model where the problem isn’t invariance, but rather, the fact that some the data generating process is very noisy. Priors? Different specification of the latent space? What do people generally do to fit a latent space to not very collaborative data? Thanks in advance!

1 Like

I’m running into similar issues (although atm I’m using PyMC) when implementing network models for binary data. I think the problem is just very hard in general - you’re trying to identify MxD parameters from at most MxM observations; and although Poisson gives you a bit more information than Bernoulli (in my case), it’s still not a lot.

One thing that helps (at least in model development & debugging) is to make the model hierarchical and assume you observe P instances of the network; when simulating from the prior you can increase P and for higher values your model should converge more easily. Doesn’t help you when you have only one actual network though :/