Equivalent models and problem with mixing when some parameters are latent

I am working with the following generative model, which produces a graph G(V, E) for a network of N nodes:

  1. I generate i.i.d. rank values r_i for each node from a truncated exponential distribution with parameter \lambda and support [0, 1].
  2. For every pair of nodes (i, j) the probability of an undirected edge between i and j is f(i, j) = c^{-2+\max \{ r_i, r_j \}} for some 1 < c < e^\lambda.
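
For reference, here is a minimal Python sketch of this generative process (my own illustration, not part of the Stan models; the truncated exponential is drawn by inverse CDF):

```python
import math
import random

def simulate(N, lam, c, seed=0):
    """Simulate the generative model: truncated-exponential ranks on [0, 1],
    then independent edges with probability c**(-2 + max(r_i, r_j))."""
    rng = random.Random(seed)
    # inverse CDF of an exponential(lam) truncated to [0, 1]:
    # r = -log(1 - u * (1 - exp(-lam))) / lam for u ~ Uniform(0, 1)
    ranks = [-math.log(1 - rng.random() * (1 - math.exp(-lam))) / lam
             for _ in range(N)]
    edges = []
    for i in range(N):
        for j in range(i):
            if rng.random() < c ** (-2 + max(ranks[i], ranks[j])):
                edges.append((i, j))
    return ranks, edges
```

Since 1 < c < e^\lambda, every edge probability lies strictly between 0 and 1.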

Suppose I have a realized network with M edges, represented either as an N \times N adjacency matrix A[N, N] or as an M \times 2 edge matrix E[M, 2] in which each row is an edge.
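
The two representations carry the same information; for completeness, this is how I convert between them (a Python sketch with 0-indexed nodes; helper names are my own):

```python
def adjacency_to_edges(A):
    """Collect each undirected edge once as an (i, j) row with i > j."""
    N = len(A)
    return [(i, j) for i in range(N) for j in range(i) if A[i][j]]

def edges_to_adjacency(E, N):
    """Rebuild the symmetric 0/1 adjacency matrix from the edge rows."""
    A = [[0] * N for _ in range(N)]
    for i, j in E:
        A[i][j] = A[j][i] = 1
    return A
```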

I have created two mathematically equivalent models:

The first is a naive model in which each pass over the data costs O(N^2):

// naive.stan

data {
  int N;
  int A[N, N];
  real<lower=0, upper=1> ranks[N];
}

parameters {
  real<lower=1> c;
  real<lower=0> lambda;
}

model {
  // truncated exponential on [0, 1]; T[, 1] keeps the normalization correct
  for (n in 1:N)
    ranks[n] ~ exponential(lambda) T[, 1];

  for (i in 1:N) {
    for (j in 1:(i - 1)) {
      A[i, j] ~ bernoulli(pow(c, -2 + fmax(ranks[i], ranks[j])));
    }
  }
}

The second model is equivalent to the first but more sophisticated: after an O(M) preprocessing step, a pass runs in O(N), giving a per-step complexity of O(N + M).

// optimized.stan
data {
  int N;
  int M;
  int E[M, 2];
  real<lower=0, upper=1> ranks[N];
}

parameters {
  real<lower=1> c;
  real<lower=0> lambda;
}

model {
  int ordering[N];
  int argmax_rank;
  int sizes[N];
  int ordered_edges[M, 2];
  real ordered_ranks[N];

  // truncated exponential on [0, 1]; T[, 1] keeps the normalization correct
  for (n in 1:N)
    ranks[n] ~ exponential(lambda) T[, 1];

  ordering = sort_indices_desc(ranks);
  ordered_ranks = sort_desc(ranks);

  for (i in 1:N) {
    sizes[i] = 0;
  }

  for (m in 1:M) {
    for (k in 1:2) {
      ordered_edges[m, k] = ordering[E[m, k] + 1];  // E is 0-indexed
    }
  }

  for (m in 1:M) {
    argmax_rank = min(ordered_edges[m, 1:2]);  // sorted position of the higher-ranked endpoint
    sizes[argmax_rank] = sizes[argmax_rank] + 1;
  }

  for (i in 1:N) {
    // the node at sorted position i has N - i lower-ranked potential partners
    sizes[i] ~ binomial(N - i, pow(c, -2 + ordered_ranks[i]));
  }
}
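
To sanity-check the equivalence claim outside Stan, here is a standalone Python re-implementation of both likelihood computations (my own sketch; identifiers are illustrative, and the relabeling uses an explicit inverse permutation). For fixed ranks, the per-node binomial sums reproduce the naive pairwise Bernoulli log-likelihood exactly, up to the binomial coefficients, which do not involve c:

```python
import math

def naive_loglik(A, ranks, c):
    """O(N^2) pairwise Bernoulli log-likelihood, as in the naive model."""
    N = len(ranks)
    ll = 0.0
    for i in range(N):
        for j in range(i):
            p = c ** (-2 + max(ranks[i], ranks[j]))
            ll += math.log(p) if A[i][j] else math.log1p(-p)
    return ll

def fast_loglik(edges, ranks, c):
    """O(N + M) version: credit each edge to its higher-ranked endpoint,
    then sum one binomial-style term per sorted position."""
    N = len(ranks)
    order = sorted(range(N), key=lambda v: -ranks[v])  # nodes by descending rank
    pos = [0] * N
    for k, v in enumerate(order):
        pos[v] = k                                     # sorted position of node v
    sizes = [0] * N
    for u, v in edges:
        sizes[min(pos[u], pos[v])] += 1                # higher-ranked endpoint wins
    ll = 0.0
    for k in range(N):
        p = c ** (-2 + ranks[order[k]])
        trials = N - 1 - k                             # lower-ranked potential partners
        ll += sizes[k] * math.log(p) + (trials - sizes[k]) * math.log1p(-p)
    return ll
```

On simulated graphs the two functions agree to floating-point precision, which is how I convinced myself the models match when the ranks are supplied as data.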

I observe the following behavior (still with data simulated from the same model):

  1. When I supply both models with all the data, i.e. both the structure (adjacency or edge matrix) and the ranks, both models yield the same results (and the second model is faster, as expected).
  2. When I supply both models with only the structural data (adjacency or edge matrix) and leave the ranks as latent parameters, the naive model converges but the optimized model does not. More specifically, I get maximum treedepth warnings after sampling with the optimized model, and the results do not look good.

Thank you in advance