I am estimating a multilevel model where for each group, a set of 2-element vectors \alpha, \beta, w is used, with the constraints
I chose impose a multivariate normal on \text{logit}(\alpha), \text{logit}(\beta), \log(w) (out of convenience).
I implement it in Stan using
// logit transformation and log Jacobian correction
real logit_lp(real x) {
real y;
y = log(x);
target += -y - log1m(x);
return y;
}
/* Cross sectional distribution of group-specific parameters.
Transformed parameters ~ multi_normal_cholesky(cs_mean, cs_L)
Tws > 0, 0 < alphas, betas < 1
*/
void cross_sectional_distribution_lp(vector Tws, vector alphas, vector betas,
vector cs_mean, matrix cs_L) {
vector[6] z;
z[1:2] = log(Tws);
target += -sum(z[1:2]); // log Jacobian adjustment for y = log(x)
z[3] = logit_lp(alphas[1]);
z[4] = logit_lp(alphas[2]);
z[5] = logit_lp(betas[1]);
z[6] = logit_lp(betas[2]);
z ~ multi_normal_cholesky(cs_mean, cs_L);
}
There are many groups. I found that it is significantly (about 4x) faster if I pre-allocate a buffer for z
outside the function call, and pass it to the functions.
I am wondering if I could run into any trouble though, eg if Stan implicitly parallelizes code and two such functions could access it concurrently, or something like that.