Dear Stan Community,
I am currently trying to fit a regression model introduced by Peter Hoff (discussed e.g. here or here) which includes variance components and latent multiplicative effects to deal with the second- and third-order dependence typical for social network data.
The model I am going for here is:
logit(y_{i,j} = 1) = \alpha + \textbf{x}_{i,j}^T \pmb{\beta } + a_i + b_j + \epsilon_{i,j} +\textbf{u}_i^T \Lambda \textbf{v}_j
where a_i and b_j are sender- and receiver-specific random effects and \epsilon_{ij} are dyadwise correlated standard normal random effects such that:
\{(a_1, b_1), ..., (a_n, b_n)\} \sim N(0, \Sigma_{ab})
\{(\epsilon_{ij}, \epsilon_{ji}: i \neq j)\} \sim N(0, \Sigma_{\epsilon})
and
\Sigma_{ab} =
\left( \begin{array}{rrr}
\sigma_a^2 & \sigma_{ab} \\
\sigma_{ab} & \sigma_b^2 \\
\end{array}\right),
\Sigma_\epsilon =
\left( \begin{array}{rrr}
1 & \rho \\
\rho & 1 \\
\end{array}\right).
So far so good, this part of the model (which is also known as the Social Relations Model) samples fine when fit alone (although it could surely be improved).
The problems are with the latent part where \textbf{u}_i and \textbf{v}_j are R-dimensional vectors representing unobserved sender and receiver characteristics and \pmb{\Lambda} is a R \times R diagonal matrix with weights (the whole term is supposed to capture structural features of a network such as triadic closure or stochastic equivalence).
This does not really seem identifiable to me as varying permutations of the signs of the parameters would lead to the same overall contribution, which seems also to be picked up by the chains:
The model is implemented as a probit variant and with a custom Gibbs sampler in the ‘amen’ R package but I am not really sure what they did to identify the latent terms.
Did I misunderstand something here or does anyone have a clue how to circumvent this issue? Any help would be greatly appreciated!
All the best,
Jakob
PS: Here is my stan model:
data {
int<lower=0> N_nodes; // number of nodes
int<lower=0> N_dyads; // number of pairs of nodes N*(N-1)
int<lower=0> K; // number of predictors
int<lower=0> R; // dimension of latent factors
int<lower=1,upper=N_dyads> sender[N_dyads]; // sender ID of each tie
int<lower=1,upper=N_dyads> receiver[N_dyads]; // receiver ID of each tie
int<lower=0,upper=1> y[N_dyads]; // binary tie outcome
vector[K] x[N_dyads]; // predictor matrix
int<lower=1,upper=N_dyads/2> idx_eps[N_dyads, 2]; // index for assigning dyadic errors
}
parameters {
real mu;
vector[K] beta;
vector[2] ab[N_nodes];
corr_matrix[2] omega_ab;
vector<lower=0>[2] tau_ab;
real<lower=0,upper=1> rho_epsilon;
vector[2] z_epsilon[N_dyads/2];
vector[R] u[N_nodes];
vector[R] v[N_nodes];
vector[R] lambda;
real<lower=0> tau_u;
real<lower=0> tau_v;
}
transformed parameters {
vector[2] epsilon[N_dyads/2];
cov_matrix[2] sigma_epsilon;
matrix[2,2] sigma_epsilon_chol;
matrix[R,R] Lambda;
sigma_epsilon[1,1] = 1;
sigma_epsilon[2,2] = 1;
sigma_epsilon[1,2] = rho_epsilon;
sigma_epsilon[2,1] = sigma_epsilon[1,2];
sigma_epsilon_chol = cholesky_decompose(sigma_epsilon);
for (i in 1:N_dyads/2)
epsilon[i] = sigma_epsilon_chol * z_epsilon[i];
Lambda = diag_matrix(lambda);
}
model {
// priors for intercept & regression coefficients
mu ~ normal(0, 5);
beta ~ normal(0, 2);
// priors for sender & receiver varying effects
omega_ab ~ lkj_corr(3);
tau_ab ~ exponential(.5);
ab ~ multi_normal(rep_vector(0, 2), quad_form_diag(omega_ab, tau_ab));
// priors for dyad correlation
rho_epsilon ~ beta(2, 2);
for (i in 1:N_dyads/2) z_epsilon[i] ~ std_normal();
// priors for latent factors
tau_u ~ exponential(.5);
tau_v ~ exponential(.5);
for (n in 1:N_nodes) u[n] ~ normal(0, tau_u);
for (n in 1:N_nodes) v[n] ~ normal(0, tau_v);
lambda ~ normal(0,2);
// likelihood
{
vector[N_dyads] eta;
for (d in 1:N_dyads)
eta[d] = mu + dot_product(x[d], beta) + ab[sender[d], 1] +
ab[receiver[d], 2] + epsilon[idx_eps[d,1], idx_eps[d,2]] + u[sender[d]]'*Lambda*v[receiver[d]];
y ~ bernoulli_logit(eta);
}
}
generated quantities {
vector[N_dyads] yrep;
for (d in 1:N_dyads)
yrep[d] = bernoulli_logit_rng(mu + dot_product(x[d], beta) + ab[sender[d], 1] +
ab[receiver[d], 2] + epsilon[idx_eps[d,1], idx_eps[d,2]] + + u[sender[d]]'*Lambda*v[receiver[d]]);
}