Nonidentifiable parameters in latent factor model for dyadic/network data

Dear Stan Community,

I am currently trying to fit a regression model introduced by Peter Hoff (discussed e.g. here or here) that includes variance components and latent multiplicative effects to deal with the second- and third-order dependencies typical of social network data.

The model I am going for here is:

\operatorname{logit} P(y_{i,j} = 1) = \alpha + \textbf{x}_{i,j}^T \pmb{\beta} + a_i + b_j + \epsilon_{i,j} + \textbf{u}_i^T \Lambda \textbf{v}_j

where a_i and b_j are sender- and receiver-specific random effects and the \epsilon_{i,j} are dyad-wise correlated standard normal random effects such that:

\{(a_1, b_1), \ldots, (a_n, b_n)\} \sim N(0, \Sigma_{ab})

\{(\epsilon_{i,j}, \epsilon_{j,i}) : i \neq j\} \sim N(0, \Sigma_{\epsilon})

and
\Sigma_{ab} = \left( \begin{array}{cc} \sigma_a^2 & \sigma_{ab} \\ \sigma_{ab} & \sigma_b^2 \end{array} \right), \quad \Sigma_\epsilon = \left( \begin{array}{cc} 1 & \rho \\ \rho & 1 \end{array} \right).

So far so good: this part of the model (which is also known as the Social Relations Model) samples fine when fit on its own (although it could surely be improved).

The problems are with the latent part, where \textbf{u}_i and \textbf{v}_j are R-dimensional vectors representing unobserved sender and receiver characteristics and \Lambda is an R \times R diagonal matrix of weights (the whole term is supposed to capture structural features of a network such as triadic closure or stochastic equivalence).

This does not really seem identifiable to me, as any combination of sign flips of the latent parameters leads to the same overall contribution to the linear predictor, which the chains also seem to pick up.
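Concretely, if I am not mistaken, for any sign-flip matrix S = \text{diag}(s_1, \ldots, s_R) with s_k \in \{-1, +1\} we have

(S \textbf{u}_i)^T \Lambda (S \textbf{v}_j) = \textbf{u}_i^T S \Lambda S \textbf{v}_j = \textbf{u}_i^T \Lambda \textbf{v}_j

since diagonal matrices commute and S^2 = I, so all 2^R sign configurations (and, since the \lambda_k get exchangeable priors, any relabelling of the R latent dimensions as well) yield exactly the same likelihood.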

The model is implemented as a probit variant with a custom Gibbs sampler in the ‘amen’ R package, but I am not really sure what they did to identify the latent terms.

Did I misunderstand something here or does anyone have a clue how to circumvent this issue? Any help would be greatly appreciated!

All the best,
Jakob

PS: Here is my Stan model:

 
data {
  
  int<lower=0> N_nodes;                             // number of nodes
  int<lower=0> N_dyads;                             // number of pairs of nodes N*(N-1)
  int<lower=0> K;                                   // number of predictors
  int<lower=0> R;                                   // dimension of latent factors
  int<lower=1,upper=N_nodes> sender[N_dyads];       // sender ID of each tie
  int<lower=1,upper=N_nodes> receiver[N_dyads];     // receiver ID of each tie
  int<lower=0,upper=1> y[N_dyads];                  // binary tie outcome
  vector[K] x[N_dyads];                             // dyad-level predictor vectors
  int<lower=1,upper=N_dyads/2> idx_eps[N_dyads, 2]; // idx_eps[d,1]: dyad pair, idx_eps[d,2]: position (1 or 2) in that pair

}

parameters {
  
  real mu;                            // global intercept
  vector[K] beta;                     // regression coefficients

  vector[2] ab[N_nodes];              // ab[n,1] = a_n (sender), ab[n,2] = b_n (receiver)
  corr_matrix[2] omega_ab;            // correlation between sender and receiver effects
  vector<lower=0>[2] tau_ab;          // scales of sender and receiver effects

  real<lower=0,upper=1> rho_epsilon;  // dyadic (reciprocity) correlation, restricted to non-negative values here
  vector[2] z_epsilon[N_dyads/2];     // raw errors for the non-centered dyad parameterization

  vector[R] u[N_nodes];               // latent sender factors
  vector[R] v[N_nodes];               // latent receiver factors
  vector[R] lambda;                   // diagonal entries of Lambda

  real<lower=0> tau_u;                // scale of the latent sender factors
  real<lower=0> tau_v;                // scale of the latent receiver factors

}

transformed parameters {
  
  vector[2] epsilon[N_dyads/2]; 
  cov_matrix[2] sigma_epsilon;
  matrix[2,2] sigma_epsilon_chol;
  matrix[R,R] Lambda;

  sigma_epsilon[1,1] = 1;
  sigma_epsilon[2,2] = 1;
  sigma_epsilon[1,2] = rho_epsilon;
  sigma_epsilon[2,1] = sigma_epsilon[1,2];
  sigma_epsilon_chol = cholesky_decompose(sigma_epsilon);

  for (i in 1:N_dyads/2) 
    epsilon[i] = sigma_epsilon_chol * z_epsilon[i];

  Lambda = diag_matrix(lambda);
  
}

model {

  // priors for intercept & regression coefficients
  
  mu ~ normal(0, 5);
  beta ~ normal(0, 2);
  
  // priors for sender & receiver varying effects
  
  omega_ab ~ lkj_corr(3);
  tau_ab ~ exponential(.5);
  ab ~ multi_normal(rep_vector(0, 2), quad_form_diag(omega_ab, tau_ab));

  // priors for dyad correlation

  rho_epsilon ~ beta(2, 2);
  for (i in 1:N_dyads/2) z_epsilon[i] ~ std_normal();

  // priors for latent factors


  tau_u ~ exponential(.5);
  tau_v ~ exponential(.5);

  for (n in 1:N_nodes) u[n] ~ normal(0, tau_u);
  for (n in 1:N_nodes) v[n] ~ normal(0, tau_v);
  lambda ~ normal(0,2);

  // likelihood
  {
    vector[N_dyads] eta;

    for (d in 1:N_dyads)
      eta[d] = mu + dot_product(x[d], beta) + ab[sender[d], 1] + 
      ab[receiver[d], 2] + epsilon[idx_eps[d,1], idx_eps[d,2]] + u[sender[d]]'*Lambda*v[receiver[d]];

    y ~ bernoulli_logit(eta);   
  }

}

generated quantities {

  vector[N_dyads] yrep;

  for (d in 1:N_dyads)
    yrep[d] = bernoulli_logit_rng(mu + dot_product(x[d], beta) + ab[sender[d], 1] + 
      ab[receiver[d], 2] + epsilon[idx_eps[d,1], idx_eps[d,2]] + u[sender[d]]'*Lambda*v[receiver[d]]);

}

Hi,
sorry for not picking this up sooner.

Great that you tried the simple model first; this is very good practice!

I didn’t try to verify the claim that sign flips leave the likelihood unchanged, but if that is the case, it is indeed a huge issue.

Just to be clear, if I expand the matrix notation (which I am not great at), I get:

\textbf{u}_i^T \Lambda \textbf{v}_j = \sum_k u_{i,k} \lambda_k v_{j,k}

where \lambda_k are the diagonal elements of \Lambda - is that correct? If so, I don’t think you have any option other than to fix some of the signs. I am currently out of time, so I can’t really work out how to preserve the full flexibility, but it feels like it should not be difficult (narrator: it may still be difficult).
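Off the top of my head (completely untested, and the names u_ref and u_free are just mine), one way to fix the signs would be to make \lambda positive and ordered and to pin down the sign of each latent dimension through a single reference node, i.e. replace the corresponding declarations in your model with something like:

parameters {
  // ... all other parameters as in your model ...
  positive_ordered[R] lambda;     // positivity removes the sign flips of lambda,
                                  // ordering removes permutations of the R dimensions
  vector<lower=0>[R] u_ref;       // u of one reference node, constrained positive to
                                  // remove the joint sign flip of column k of u and v
  vector[R] u_free[N_nodes - 1];  // u of the remaining nodes, unconstrained
  vector[R] v[N_nodes];
}
transformed parameters {
  vector[R] u[N_nodes];
  u[1] = u_ref;                   // node 1 is the (arbitrary) reference node
  for (n in 2:N_nodes) u[n] = u_free[n - 1];
}

Your priors can stay as they are: with u defined in transformed parameters, u[n] ~ normal(0, tau_u) still works, and on the constrained components it simply becomes a half-normal (similarly, lambda ~ normal(0, 2) becomes a half-normal on the ordered values). The catch is that this only pins things down well if the reference node's u is not close to zero, so please treat it as a sketch, not a verified solution.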

Note that it sometimes happens that published packages just ignore this kind of problem, as a Gibbs sampler doesn’t come with strong diagnostics that would reveal it.

Hope that helps! If not, I could try to think about it more thoroughly sometime later.