Cholesky factor optimized multilevel model - group parameter matrix dimensions

Hello fellow Stan users!

Very basic question here from a new Stan user. I’m attempting to implement my first multilevel model, and I’ve been referencing some code that appears in the Stan manual (p. 151 and screenshot below). As part of the Cholesky factorization optimization, the ‘group coeffs’ matrix is declared as matrix[L,K] gamma, where L = # of group predictors and K = # of individual predictors. Could someone explain why its dimensions are [L,K]?

It’s also declared with the same shape in the original model on p. 148.

I think it’s just that you use an individual’s group properties (of which there are L) to predict the mean of their individual properties (of which there are K).

row_vector[L] u[J] is the list of group predictors for each person, so u[j] * gamma gives a row_vector of length K (number of individual predictors) for each individual (which is then used to build a distribution over betas …).

Something like that? Not familiar with the model, just going by dimensionality analysis.
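
To make the dimension argument concrete, here is a minimal Stan sketch, not the manual's full model. It uses the array syntax quoted in this thread (Stan 2.33 and later write it as array[J] row_vector[L] u instead). The name mu and the normal(0, 5) prior are assumptions added for illustration.

```stan
data {
  int<lower=1> J;        // number of groups
  int<lower=1> L;        // number of group-level predictors
  int<lower=1> K;        // number of individual-level predictors
  row_vector[L] u[J];    // u[j]: the L group predictors for group j
}
parameters {
  matrix[L, K] gamma;    // group coefficients
}
transformed parameters {
  row_vector[K] mu[J];   // mu[j]: prior mean of the K coefficients in group j (hypothetical name)
  for (j in 1:J) {
    // (1 x L) * (L x K) = (1 x K), which is why gamma must be L x K
    mu[j] = u[j] * gamma;
  }
}
model {
  to_vector(gamma) ~ normal(0, 5);  // assumed weakly informative prior
}
```

If gamma were declared K x L instead, u[j] * gamma would fail the size check, since the inner dimensions (L and K) would not match.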


Ah yes, that makes total sense now. Thanks for the clarification!

One other related question: what exactly is contained in u[j]? You say it’s a list of group predictors for each person, but the dimensions are [J,L], not [N,L]. Does the input data frame need to have these group-level predictors pre-aggregated, or does this occur automatically within Stan? Again, apologies for the basic line of questioning, I’m still wrapping my head around this.

You say it’s a list of group predictors for each person, but the dimensions are [J,L], not [N,L].

I guess I meant group features for each group (of which there are J groups, and L features). My mistake ^^. I’m not actually familiar with this model.
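
On the pre-aggregation question: Stan does not aggregate anything on its own, it only checks that the data you pass match the declared sizes. So u would be built outside Stan with one row per group, and an index array maps each of the N observations to its group. A sketch of a data block consistent with the declarations quoted in this thread (the comments are my reading of each variable, and the older array syntax is assumed):

```stan
data {
  int<lower=0> N;               // number of individuals (observations)
  int<lower=1> K;               // number of individual-level predictors
  int<lower=1> J;               // number of groups
  int<lower=1> L;               // number of group-level predictors
  int<lower=1, upper=J> jj[N];  // jj[n]: group that individual n belongs to
  matrix[N, K] x;               // individual-level predictors, one row per individual
  row_vector[L] u[J];           // group-level predictors, one entry per group (not per individual)
  vector[N] y;                  // outcomes
}
```

In practice that means collapsing the data frame to one row per group for u, and passing the group id of each observation separately as jj.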

Thanks again for your responses @bbbales2! I have an additional dimension question on this code block regarding the rows_dot_product function. It’s my understanding that the matrices supplied to rows_dot_product (beta and x) need to have the same number of rows; however, in this case from the manual, beta is a matrix[J,K] while x is a matrix[N,K]. What am I missing?

Even though beta is J x K, beta[jj] is N x K because jj is size N; it evaluates so that

beta[jj][i] == beta[jj[i]]

edit: I’m wrong, see Bob’s stuff

The code isn’t actually multiplying two matrices:

y ~ normal(rows_dot_product(beta[jj] , x), sigma);

beta[jj] is a row_vector, and x is a matrix, it just dots the row beta[jj] by each row of x.

Should be the same answer as beta[jj] * x' or x * beta[jj]' (the normal here isn’t gonna care if you feed it a vector or a row_vector)

No, beta[jj] is an N x K matrix. The types are here:

matrix[N, K] x;
matrix[J, K] beta;
int jj[N];

You need to use rows_dot_product as indicated.
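
For anyone who wants to check this themselves, here is a small self-contained sketch built from the types above. It is not the manual's full model: the priors on beta and sigma and the local name eta are assumptions for illustration, and the declarations use the older array syntax quoted in this thread.

```stan
data {
  int<lower=0> N;
  int<lower=1> K;
  int<lower=1> J;
  int<lower=1, upper=J> jj[N];  // group index for each individual
  matrix[N, K] x;
  vector[N] y;
}
parameters {
  matrix[J, K] beta;
  real<lower=0> sigma;
}
model {
  // Multi-indexing a J x K matrix with an int array of size N gives an
  // N x K matrix whose i-th row is beta[jj[i]]. rows_dot_product then
  // dots row i of that matrix with row i of x, giving a vector of size N.
  vector[N] eta = rows_dot_product(beta[jj], x);

  to_vector(beta) ~ normal(0, 5);  // assumed prior
  sigma ~ exponential(1);          // assumed prior
  y ~ normal(eta, sigma);
}
```

The vectorized statement should be equivalent to looping over individuals with y[n] ~ normal(x[n] * beta[jj[n]]', sigma), just without the loop.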
