Hello,
I am working on a model for social-network (round-robin) data with a large number of correlated random effects. If anyone is familiar with the Social Relations Model, this is an extension that allows each component to vary across 3 external observers who each rate the dyadic behaviors within the network. My goal is simply to estimate the (co)variance components, and use them to calculate some intraclass correlation coefficients.
My program works (slowly), but one of my parameters (\sigma_p) mixes poorly (R-hat > 1.1 & trace plots show that chains do not mix, although their random walks all seem to stay in the same parameter space; see attached traceplot). My intuition is that the problem is due to estimating a very small (nearly zero) variance component (\sigma_p = 0.054). To be clear, I actually estimate standard deviations and correlations, then assemble covariance matrices in transformed parameters block, then take the cholesky transformation to use for sampling in the model block (syntax attached and pasted below).
I tried specifying a more informative half-normal prior for \sigma_p (only weakly informative priors for the rest) and increased the number of iterations to 4000 (N chains = 3), which takes 3 CPU hours, but it still won’t stabilize. I intend to fit this model to simulated data in a Monte Carlo study, so I would like to know how I can make my Stan syntax more efficient. For example, I read in another thread something about sampling on the unit scale and scaling by the estimated standard deviation. Would that help with sampling my random effects with the near-zero variance component? Would that mean drawing random effects using the (cholesky of the) correlation matrix, then multiplying the random effects by their SDs when I create yHat in the transformed parameters block?
Any other ideas would be helpful, thank you! My stan syntax and R syntax look as follows:
data {
// sample sizes at each level
int NDyads; // Level-1: possible dyads among observed nominations
int NPersons; // number of unique participants
int NRaters; // number of unique raters
// ID variables
int IDd[NDyads*NRaters]; // Dyad ID
int IDi[NDyads*NRaters]; // ID for first in dyad
int IDj[NDyads*NRaters]; // ID for second in dyad
int IDr[NDyads*NRaters]; // ID for rater
// observed data
matrix[NDyads*NRaters,2] Y; // observed bivariate outcome (DVs for each dyad, per rater)
// use range(Y) for priors
real rangeY;
}
parameters {
real<lower=1,upper=6> M; // marginal mean(Y) >> M
vector[NRaters] m; // Random intercept(of)Y >> m
matrix[NPersons, 2] AP; // correlated actor/partner effects >> A and P
matrix[NDyads, 2] E; // Correlated Dyadic effects >> E
matrix[NPersons, 2] ap[NRaters]; // correlated rater deviations from A and P >> a and p
// Hyperparameters of random effects
real<lower=0,upper=rangeY*2> sigmam; // sigmam: random intercept for raters
vector<lower=0,upper=rangeY*2>[2] sigmaAP; // sigmaAP: SDs of actor/partner effects
real<lower=-1,upper=1> rhoAP; // rhoAP: generalized reciprocity
real<lower=0,upper=rangeY*2> sigmaE; // sigmaE: relationship SD
real<lower=-1,upper=1> rhoE; // rhoE: dyadic reciprocity
vector<lower=0,upper=rangeY*2>[2] sigmaap; // sigmaap: SD Rater deviations from A and P
real<lower=-1,upper=1> rhoap; // rhoap: correlation rater ks rating of A and P
real<lower=0,upper=rangeY*2> sigmae; // sigmae: SD error /rater deviation from relship effect
real<lower=-1,upper=1> rhoe; // rhoe: correlation rater ks deviation from relationship effects Eij and Eji
}
transformed parameters {
matrix[NDyads*NRaters,2] yHat; // expected value of Y, given person dyad effects and rater deviations
matrix[2,2] covPerson; // person level covariance matrix
matrix[2,2] cholPerson; // person level Cholesky-covariances
vector[2] MPerson; // person-level means (actor/partner effects)
vector[2] SDPerson; // SDs of actor/partner effects
matrix[2,2] covDyad; // person level covariance matrix
matrix[2,2] cholDyad; // person level Cholesky-covariances
vector[2] MDyad; // person-level means (actor/partner effects)
vector[2] SDDyad; // SDs of actor/partner effects
matrix[2,2] covRaterPL; // Rater-Person level covariance matrix
matrix[2,2] cholRaterPL; // Rater-Person level Cholesky-covariances
vector[2] MRaterPL; // Rater-Person level means (actor/partner effects)
vector[2] SDRaterPL; // SDs of Rater-Person level
matrix[2,2] covRaterDL; // Rater-Dyad level covariance matrix
matrix[2,2] cholRaterDL; // Rater-Dyad level Cholesky-covariances
vector[2] MRaterDL; // Rater-Dyad level means (actor/partner effects)
vector[2] SDRaterDL; // SDs of Rater-Dyad level
for (i in 1:(NRaters*NDyads)) {
yHat[i, 1] = M + m[IDr[i]] + AP[IDi[i], 1] + ap[IDr[i], IDi[i], 1] + AP[IDj[i], 2] + ap[IDr[i], IDj[i], 2] + E[IDd[i], 1];
yHat[i, 2] = M + m[IDr[i]] + AP[IDj[i], 1] + ap[IDr[i], IDj[i], 1] + AP[IDi[i], 2] + ap[IDr[i], IDi[i], 2] + E[IDd[i], 2];
}
// put hyperparameters into matrices:
// Person level matrices
covPerson[1,1] = sigmaAP[1] * sigmaAP[1];
covPerson[2,2] = sigmaAP[2] * sigmaAP[2];
covPerson[1,2] = rhoAP * sigmaAP[1] * sigmaAP[2];
covPerson[2,1] = covPerson[1,2];
cholPerson = cholesky_decompose(covPerson);
// Dyad level matrices
covDyad[1,1] = sigmaE * sigmaE;
covDyad[2,2] = covDyad[1,1];
covDyad[1,2] = rhoE * sigmaE * sigmaE;
covDyad[2,1] = covDyad[1,2];
cholDyad = cholesky_decompose(covDyad);
// Rater*Person level matrices
covRaterPL[1,1] = sigmaap[1] * sigmaap[1];
covRaterPL[2,2] = sigmaap[2] * sigmaap[2];
covRaterPL[1,2] = rhoap * sigmaap[1] * sigmaap[2];
covRaterPL[2,1] = covRaterPL[1,2];
cholRaterPL = cholesky_decompose(covRaterPL);
// Rater*Dyad level matrices
covRaterDL[1,1] = sigmae * sigmae;
covRaterDL[2,2] = covRaterDL[1,1];
covRaterDL[1,2] = rhoe * sigmae * sigmae;
covRaterDL[2,1] = covRaterDL[1,2];
cholRaterDL = cholesky_decompose(covRaterDL);
}
model {
// priors
sigmaap[2] ~ normal(0, 0.05);
m ~ normal(0, sigmam);
for (i in 1:NPersons) AP[i] ~ multi_normal_cholesky(rep_vector(0, 2), cholPerson);
for (i in 1:NDyads) E[i] ~ multi_normal_cholesky(rep_vector(0, 2), cholDyad);
for (r in 1:NRaters) {
for (p in 1:NPersons) {
ap[r, p] ~ multi_normal_cholesky(rep_vector(0, 2), cholRaterPL);
}
}
for (i in 1:NRaters*NDyads) Y[i] ~ multi_normal_cholesky(yHat[i], cholRaterDL);
}
Running the model in rstan with the following code:
head(dyadData)
## load the package to fit models to data
library(rstan)
library(Rcpp)
## load the package to calculate model-fit information criteria
library(loo)
## Set standard rstan options
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores() - 1)
## The following syntax assumes the Stan syntax below ("RESRM_mod_simple.stan) is already stored in a
## separate file.
## There is a dyad-level data set in the R object "dyadData", which
## has 1 row per dyad and columns for:
## - ID1 and ID2: the person-level IDs of the first and second person in the dyad
## - Dyad: Indicating the dyad (ID1-ID2)
## - Rater: Indicating the rater rating the dyadic behaviour
## - AB: the outcome of actor ID1 mimicking partner ID2
## - BA: the outcome of actor ID2 mimicking partner ID1
## Rows do not have missing values
## list of knowns to pass as data
IDlevelsPersons <- unique(sort(c(dyadData$ID1, dyadData$ID2)))
IDlevelsDyads <- unique(sort(dyadData$Dyad))
IDlevelsRaters <- unique(sort(dyadData$Rater))
knowns <- list(NDyads = length(unique(dyadData$Dyad)),
NPersons = length(unique(c(dyadData$ID1, dyadData$ID2))),
NRaters = length(unique(c(dyadData$Rater))),
IDi = as.integer(factor(dyadData$ID1, levels = IDlevelsPersons)),
IDj = as.integer(factor(dyadData$ID2, levels = IDlevelsPersons)),
IDd = as.integer(factor(dyadData$Dyad, levels = IDlevelsDyads)),
IDr = as.integer(factor(dyadData$Rater, levels = IDlevelsRaters)),
Y = as.matrix(dyadData[c("mimicryAB","mimicryBA")]),
rangeY = diff(range(c(dyadData$mimicryAB, dyadData$mimicryBA), na.rm = TRUE)))
## list of unknowns to keep track of
fixedFX <- c("M")
ranScales <- c("sigmam", "sigmaAP", "rhoAP", "sigmaE", "rhoE", "sigmaap", "rhoap", "sigmae", "rhoe")
unknowns <- c(fixedFX, ranScales)
## compile model
SRM.IRR <- stan_model("RESRM_mod_simple.stan", model_name = "Rater extended SRM")
## fit model to data
out.IRR <- sampling(SRM.IRR, data = knowns, pars = unknowns, seed = 3141593,
chains = 3, iter = 4000)
## fixed effects, SDs and correlation of random effects
print(out.IRR, pars = c(fixedFX, ranScales), probs = c(0.025, 0.975),
digits_summary = 3)
Resulting in the following output:
> round(ranEfTables$summary, 3)
mean se_mean sd 2.5% 97.5% n_eff Rhat
M 2.996 0.015 0.306 2.590 3.687 405.615 1.005
sigmam 0.284 0.026 0.592 0.014 1.862 512.429 1.005
sigmaAP[1] 0.622 0.001 0.056 0.519 0.740 3042.239 1.001
sigmaAP[2] 0.303 0.002 0.054 0.195 0.407 615.776 1.005
rhoAP 0.717 0.003 0.101 0.496 0.891 934.604 1.004
sigmaE 0.682 0.001 0.038 0.612 0.758 1814.002 1.000
rhoE 0.745 0.001 0.050 0.639 0.835 1845.725 1.000
sigmaap[1] 0.340 0.001 0.025 0.292 0.391 1567.415 1.001
sigmaap[2] 0.054 0.010 0.026 0.016 0.109 6.643 1.346
rhoap -0.257 0.027 0.361 -0.864 0.550 177.619 1.022
sigmae 0.576 0.000 0.013 0.551 0.603 2835.810 1.004
rhoe 0.196 0.001 0.048 0.098 0.286 1120.716 1.001
Original data stems from the following Open Science Framework account: https://osf.io/b4nvf/
Thanks for thinking along!
DebbyRESRM_mod_simple.stan (5.1 KB)RESRM_rStan_Analysis_simple.R (2.7 KB)Pi_TracePlots.pdf (79.4 KB)