Hello everyone.
I’m trying to compute the log-likelihood in the generated quantities block for a space-time model.
Y[t] and mu[t] are both vectors of length N at each time t.
Any suggestions would be greatly appreciated.
data {
  int<lower=1> N;      // number of observations per time point
  int<lower=1> K;      // number of covariates
  int<lower=1> T;      // number of time points
  matrix[N, K] X[T];   // design matrix at each time point
  matrix[N, N] I;      // identity matrix
  …
}
transformed parameters {
  vector[N] Y[T];         // the response at time t is a vector of length N
  vector[N] mu[T];        // mean at time t
  real<lower=0> var_nug;  // nugget variance
  for (t in 1:T) { mu[t] = X[t] * beta; }
  …
}
model {
  …
  target += multi_normal_cholesky_lpdf(Y[t] | mu[t],
              cholesky_decompose(var_nug * I)); // usually a more complex covariance structure
  …
}
generated quantities {
}
I might be misunderstanding (definitely let me know if I am), but I think you can just do the same thing in generated quantities as in the model block but store the values. Something like this:
vector[T] log_lik;
for (t in 1:T) {
  log_lik[t] = multi_normal_cholesky_lpdf(Y[t] | mu[t],
                 cholesky_decompose(var_nug * I));
}
Does that work?
And sorry it took almost a week for someone to respond!
I had thought that log_lik (vector[T] log_lik) would have the same dimensions as the response variable vector[N] Y[T], which is an array of T vectors of length N (e.g. N = 50 observations at each of T = 10 time points), hence my confusion?!
Thank you
Oh I see, good question. Hopefully I can explain this correctly and coherently. It’s really about the number of observations, not the number of values, so it depends on what you consider an observation to be. I was assuming you were treating the whole vector at each time point as a single observation, which is why I used vector[T]. If you instead model each of the N*T values individually, then in generated quantities you could collapse them into a long vector[N*T], which is the one-value-per-observation shape the loo package expects (see the sketch below).
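Just to make that concrete, here’s a rough sketch. My assumption here is that each element gets its own univariate normal with the nugget variance, which matches the diagonal covariance var_nug * I in your snippet:

generated quantities {
  // one log-likelihood value per individual observation, flattened into a
  // single long vector so the loo package can use it directly
  vector[N * T] log_lik;
  for (t in 1:T) {
    for (n in 1:N) {
      // element n at time t goes to position (t - 1) * N + n;
      // normal_lpdf takes a standard deviation, hence sqrt(var_nug)
      log_lik[(t - 1) * N + n] = normal_lpdf(Y[t][n] | mu[t][n], sqrt(var_nug));
    }
  }
}

Keep in mind this is only equivalent to your multi_normal model if the N values at each time point really are independent given mu (i.e. the covariance really is diagonal); with a more complex covariance structure the two are not the same.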
Thank you @jonah for your prompt response.
Sorry for my confusion.
Yes, I’m trying to model the N*T values individually, and I had defined the log_lik as vector[N] log_lik[T];
Any idea how to compute the log_lik in this case? Do I have to double-loop through each of the T time points and N observations?
I assume that the code below is not right, since all N elements of log_lik at a given time point come out identical (e.g. at t = 1: log_lik[1,1] = -80.59897, log_lik[1,2] = -80.59897, ..., log_lik[1,50] = -80.59897).
generated quantities {
  vector[N] log_lik[T];
  for (t in 1:T) {
    for (n in 1:N) {
      log_lik[t][n] = multi_normal_cholesky_lpdf(Y[t] | mu[t],
                        cholesky_decompose(var_nug * I));
    }
  }
}
I’m attaching the code and data.
Thank you for your time. model.txt (1.2 KB) toy_data_list.R (37.3 KB)
No problem at all. And this stuff is definitely confusing for me too!
If you were using a univariate distribution for each of the T*N points, then you could do that. But because you’re using a multivariate distribution, you’re only going to get one log-lik value for each of the T vectors of length N; that is, you’ll get T unique values, not T*N unique values. Which is related to this:
Yeah, that’s because
multi_normal_cholesky_lpdf(Y[t] | ...)
calculates a single log-pdf value for the entire vector Y[t] of length N. So you get different values looping over T but the same value repeated in the loop over N, because you’re using a multivariate distribution. To make this more concrete, here’s an example with the multivariate normal distribution in R:
library(mvtnorm)
y <- c(0, 0.5, 1)                    # one "observation": a vector of length 3
mu <- c(0, 0, 0)                     # mean vector
Sigma <- diag(1, nrow = 3, ncol = 3) # identity covariance matrix
dmvnorm(y, mu, Sigma)                # returns 1 pdf value, not 3 values
dmvnorm(y, mu, Sigma, log = TRUE)    # same on the log scale: still 1 value
To calculate the log-likelihood for each individual element, I think you’d need to use the properties of the multivariate normal distribution to derive the marginal (or conditional) distributions of the individual points. I might be mistaken, but I don’t really think that’s what you want to do here, because in your model you’re modeling the entire vector Y[t] (of length N) jointly.
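That said, in the special case shown in the snippet above, where the covariance is just var_nug * I, the joint density factorizes and each element’s marginal is a univariate normal. Here’s a rough check you could run in generated quantities (a sketch under that diagonal-covariance assumption only; t = 1 is just for illustration):

// hypothetical check, assuming the covariance really is var_nug * I
real joint_lp = multi_normal_cholesky_lpdf(Y[1] | mu[1],
                  cholesky_decompose(var_nug * I));
real sum_lp = 0;
for (n in 1:N) {
  sum_lp += normal_lpdf(Y[1][n] | mu[1][n], sqrt(var_nug));
}
// joint_lp and sum_lp agree (up to floating point) because the N
// components are independent when the covariance is diagonal

With a genuinely non-diagonal covariance that equality breaks down, which is when you’d need the marginal/conditional properties mentioned above.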
But in case I’m wrong here, let’s see if the real expert @avehtari has a second to check my reasoning. Aki, if you have a minute at some point can you double-check that I’m not giving @Hemingway the wrong advice here? Thanks!