The below Stan script, does work - it just confuses me why a nested for loop is even necessary. In the line pred = C * ipt + B * (day_diag * ipt); : ipt is a “one hot” vector, where all elements are set to 0, save for 1. The goal was simply to multiply the matrix, B, by the vector ipt, and scale all vector elements by the current value of days[n]. This seemed impossible so I created a diagonal matrix where every element along the diagonal equals the same thing. This works but it’s inefficient.
How can vector scalar multiplication be executed more efficiently? Has it purposely been omitted from the language?
data {
int K; //outcome classes, 3
int N; //num rows
int D; //input dimensions, 5
int Y[N];
matrix[N,D] X;
int days[N];
}
parameters {
matrix[D, K] C; //[5,3]
matrix[D, K] B; //[5,3]
}
model {
for (n in 1:N){
vector[K] pred;
vector[D] ipt;
matrix[K,K] day_diag;
for (i in 1:K){
for (j in 1:K){
if (i == j)
day_diag[i,j] = days[n];
else
day_diag[i,j] = 0;
}
}
ipt = X[n]';
pred = C * ipt + B * (day_diag * ipt);
Y[n]~categorical_logit(pred);
}
}
@nhuurre ,How does Stan process matrix multiplication?
In C * ipt I have a matrix vector multiplication of dimensions. [5,3] x [1,5]
I’m familiar with [3,5] x [5,1] => [3,1]. So the order in which Stan expects arguments for matrix-vector multiplication confuses me.
Indeed, [3,5] x [5,1] => [3,1] holds in Stan as well. The code compiles but that doesn’t mean it will run. If D and K are not equal the model will stop with something like “initialization failed: dimension mismatch in matrix multiplication.”
Also note that a vector[5] is a [5,1] matrix while a row_vector[5] is a [1,5] matrix.
@nhuurre Right, so why does this work? [5,3] x [1,5] for C * ipt shouldn’t work, but it does! That’s why I’m curious about expected argument order. Do you just always specify the matrix first and Stan will sort it out?
Uhoh, I’m supposed to have K output classes (3), so it looks that I made an error in my model design. Also, N is the number of data observations, ~1000.
This did the trick tho!
data {
int K; //outcome classes, 3
int N; //num rows
int D; //input dimensions, 5
int Y[N];
matrix[N,D] X;
int days[N];
}
parameters {
matrix[D, K] C; //[5,3]
matrix[D, K] B; //[5,3]
}
model {
for (n in 1:N){
row_vector[K] pred;
row_vector[D] ipt;
ipt = X[n];
// [1,5]*[5,3] + [1,5]*[5,3]
pred = ipt * C + (ipt * days[n]) * B;
Y[n]~categorical_logit(pred');
}
}