# New Paper Incorporating Map_Rect / Cluster Computing

Hi Stan fans -

I have a draft out of a new paper that uses Stan and MPI parallelization to scale Twitter feeds in the Middle East. We run a model with ~20k parameters and ~20m rows and finish well within finite time! It’s also a new twist on item response theory that pulls from ideas in multivariate time series analysis.

Link to the paper on OSF here: https://osf.io/preprints/socarxiv/wykmj
Github with source code/data here: https://github.com/saudiwin/arab_polarization

I don’t have our nitty gritty cmdstan/cluster scripts on there, but if you are interested I can pass those along.

12 Likes

I’m quite excited to learn from this, thank you! @saudiwin is there any chance you could share the non map_rect Stan model with me?

I don’t have the exact same model coded without `map_rect`, but here is a simpler 1-dimensional version of the model. I should mention that this version uses a Normally-distributed outcome as we standardized the retweet counts in an earlier version of the paper. In the current version we use Poisson as one reviewer had an allergic reaction to standardization in an earlier draft.

``````data {
int<lower=1> J;              // number of elites
int<lower=1> K;              // number of citizens
int<lower=1> N;              // number of observations
int<lower=1> T;  //number of time points
int<lower=1,upper=J> jj[N];  // elite ID for observation n
int<lower=1,upper=K> kk[N];  // citizen ID for observation n
int<lower=1> tt[N]; // t for observation N
real y[N];   // outcome for observation n
vector[T-1] time_gamma; //binary vector indicating when coup happens
int country_code[N]; //indicator for Tunisia
}
parameters {
vector[K] delta_1;                  // non-zero discriminations
vector[K] delta_0;                  // zero discriminations
matrix[T,J] alpha;               // time-varying ideal points for elites
vector[K] beta_0;                // zero difficulty of question k
vector[K] beta_1;                // zero difficulty of question k
vector[4] alpha_int; //drift
vector[4] betax; //effects of coup
real<lower=0> country; //dummy for country-level fixed effects
vector<lower=0>[4] sigma_time; //heteroskedastic variance by ideological group
real<lower=0> sigma_beta_0; //hierarchical variance for zero betas
real<lower=0> sigma_beta_1; //hierarchical variance for zero betas
real<lower=0> sigma_overall; //variance for top-level normal distribution
}

transformed parameters {
//constrain one intercept for identification
//vector[4] alpha_int_full;
//alpha_int_full = append_row(alpha_int,-sum(alpha_int));
}

model {
to_vector(alpha[1,]) ~ normal(0,3);
alpha_int ~ normal(0,3);
country ~ exponential(.1);
sigma_time ~ exponential(.1);
sigma_overall ~ exponential(.1);
sigma_beta_0 ~ exponential(.1);
sigma_beta_1 ~ exponential(.1);

//VAR priors with constraints on who influences who

alpha[2:T,1] ~ normal(alpha_int[1] + adj_in[1]*alpha[1:(T-1),1] +
betax[1]*time_gamma,
sigma_time[1]);

alpha[2:T,2] ~ normal(alpha_int[2] + adj_in[2]*alpha[1:(T-1),2] +
betax[2]*time_gamma,
sigma_time[2]);
alpha[2:T,3] ~ normal(alpha_int[3] + adj_in[3]*alpha[1:(T-1),3] +
betax[3]*time_gamma,
sigma_time[3]);

alpha[2:T,4] ~ normal(alpha_int[4] + adj_in[4]*alpha[1:(T-1),4] +
betax[4]*time_gamma,
sigma_time[4]);

//citizen priors
beta_0 ~ normal(0,sigma_beta_0);
beta_1 ~ normal(0,sigma_beta_1);
delta_1 ~ normal(0,3);
delta_0 ~ normal(0,3);
// loop over outcome
// use hurdle model for missing data (-9999)
// conditional on passing hurdle, use normal distribution IRT model with time-varying
// parameters for elites

for(n in 1:N) {
if(y[n]==-9999) {
1 ~ bernoulli_logit(delta_0[kk[n]]*(alpha[tt[n],jj[n]] + country*country_code[n])  - beta_0[kk[n]]);
} else {
0 ~ bernoulli_logit(delta_0[kk[n]]*(alpha[tt[n],jj[n]] + country*country_code[n]) - beta_0[kk[n]]);
y[n] ~ normal(delta_1[kk[n]]*(alpha[tt[n],jj[n]] + country*country_code[n]) -
beta_1[kk[n]],
1);
}
}
}

generated quantities {
matrix[T,J] alpha_country; //recalculate alpha with country intercepts included

alpha_country[,1] = alpha[,1];
alpha_country[,2] = alpha[,2] + country;
alpha_country[,3] = alpha[,3];
alpha_country[,4] = alpha[,4] + country;
}``````
2 Likes

Cool. From the paper:

Employing a cluster computer system with 700 cores, we were able to estimate a converged chain within 48 hours.9 Based on conversations with the Stan team in August 2019, we believe that this is the largest such model estimated to date.

The `map_rect` model will work without multicore. In fact, it’s sometimes faster using `map_rect` way because incremental partial evaluation of the mapped derivatives, which increases memory locality (checkpointing as it’s known in the autodiff literature).