The final optimized model is:
...
parameters {
matrix[K, J] z;
cholesky_factor_corr[K] L_Omega;
vector<lower=0,upper=pi()/2>[K] tau_unif;
matrix[L, K] gamma; // group coeffs
real<lower=0> sigma; // prediction error scale
}
transformed parameters {
matrix[J, K] beta;
vector<lower=0>[K] tau; // prior scale
for (k in 1:K) tau[k] = 2.5 * tan(tau_unif[k]);
beta = u * gamma + (diag_pre_multiply(tau,L_Omega) * z)';
}
model {
to_vector(z) ~ std_normal();
L_Omega ~ lkj_corr_cholesky(2);
to_vector(gamma) ~ normal(0, 5);
y ~ normal(rows_dot_product(beta[jj] , x), sigma);
}
That transpose in the creation of beta
has always annoyed me. Partly because it’s easy to miss (solved if we update the docs to use transpose()
instead), but also partly because it makes me lose track of the shape of things; it’s weird to have z
as a K \times J matrix but beta
is J \times K.
Isn’t it more straight-forward to do:
model{
...
matrix[J, K] z;
...
}
transformed parameters {
...
beta = u * gamma + z*diag_post_multiply(L_Omega,tau) ;
...
}
...
Here’s some R code that (I think) proves it’s equivalent this way:
diag_pre_multiply = function(v,m){
diag(v)%*%m
}
diag_post_multiply = function(m,v){
m%*%diag(v)
}
K = 3
J = 5
tau = 1:K #K arbitrary values
L_Omega = matrix(1,ncol=K,nrow=K) #KxK matrix of arbitrary values
z = matrix(rnorm(K*J),nrow=K,ncol=J) #just some random data
tz = t(z) #transpose of z
#the current approach:
beta1 = t(diag_pre_multiply(v=tau,m=L_Omega) %*% z)
#mike's suggested alternate:
beta2 = tz %*% diag_post_multiply(m=L_Omega,v=tau)
#and they're equivalent:
all(beta1==beta2)