Relative efficiency for 3 observations is NaN

data {
    //number of observations
    int<lower=0> N_obs;
    //number of groups
    int<lower=0> Ngroups;
    //number of columns in the design matrix
    int K;
    //the column containing the groups
    int<lower=1> group[N_obs];
    //number of observations per plate
    vector[N_obs] n_t;
    // response
    vector[N_obs] y_obs;
    //vector[N_obs] timevar;
    //design matrix
    matrix [N_obs, K] X_obs;
    //vector of S2
    vector[N_obs] S2;
}

//transformed data
transformed data {
  //observed data transformation
  matrix[N_obs, K] Q_ast;
  matrix[K, K] R_ast;
  matrix[K, K] R_ast_inverse;
  // thin and scale the QR decomposition
  Q_ast = qr_thin_Q(X_obs) * sqrt(N_obs - 1);
  R_ast = qr_thin_R(X_obs) / sqrt(N_obs - 1);
  R_ast_inverse = inverse(R_ast);
}

// The parameters accepted by the model.
parameters {

// regression coefficient for mean structure
  real alpha; //intercept 
  vector[K] theta;
  real<lower=0> sigma;
  real<lower=0> nu;
  real<lower=0> sigmaint; //variance for random intercept
  vector[Ngroups] etaint; //reparameterization for the random effects
}

transformed parameters {

  vector[Ngroups] ranint; //random intercepts
  vector[N_obs] yhat;
  //computing the random intercept
  ranint  = sigmaint * etaint; 
  //mean function
  for (i in 1:N_obs)
    yhat[i] = Q_ast[i] * theta + ranint[group[i]] + alpha;
}

// The model to be estimated.
model {
  vector[N_obs] b2;
  // priors
  alpha ~ cauchy(0, 10); //prior for the intercept
  sigma ~ scaled_inv_chi_square(nu, 0.4);
  nu ~ gamma(2, 0.1);
  etaint ~ normal(0, 1);
  //constrain such that (n - 1)S^2 / sigma^2 is chi^2(n - 1)
  for(i in 1:N_obs)
    b2[i] = ((n_t[i] - 1) * S2[i]) / pow(sigma, 2);
  //jacobian such that we sample from the target distribution for b2
  for(i in 1:N_obs)
    b2[i] ~ chi_square(n_t[i] - 1); //distribution for b2
  target += sum(log(n_t - 1) + log(S2) - 2 * log(square(sigma)));

  // likelihood
  for (n in 1:N_obs)
    y_obs[n] ~ normal(yhat[n], sigma / sqrt(n_t[n]));
}

generated quantities {
  vector[K] beta;
  vector[N_obs] log_lik;
  beta = R_ast_inverse * theta; // coefficients on x
  for (n in 1:N_obs)
    log_lik[n] = normal_lpdf(y_obs[n] | yhat[n], sigma / sqrt(n_t[n]));
}
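As an aside, the QR reparameterization used in the transformed data and generated quantities blocks can be checked numerically. This is a sketch with NumPy standing in for Stan's qr_thin_Q/qr_thin_R, and with made-up values for N, K, and beta_true (NumPy's sign convention for Q and R may differ from Stan's, but the identities below hold either way):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 50, 3
X = rng.normal(size=(N, K))
beta_true = np.array([1.0, -2.0, 0.5])

# Thin, scaled QR decomposition as in the transformed data block:
# Q_ast = Q * sqrt(N - 1), R_ast = R / sqrt(N - 1), so Q_ast @ R_ast == X.
Q, R = np.linalg.qr(X)          # "reduced" mode: Q is N x K, R is K x K
Q_ast = Q * np.sqrt(N - 1)
R_ast = R / np.sqrt(N - 1)
assert np.allclose(Q_ast @ R_ast, X)

# If theta are the coefficients on Q_ast, then beta = R_ast^{-1} theta are
# the coefficients on X, exactly as in generated quantities.
theta = R_ast @ beta_true       # X @ beta_true == Q_ast @ theta
beta = np.linalg.solve(R_ast, theta)
assert np.allclose(beta, beta_true)
assert np.allclose(X @ beta_true, Q_ast @ theta)
```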

I am building a model for \bar{y} \sim N(\mu, \frac{\sigma}{\sqrt{n}}), such that \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n - 1}. I did this because \sigma^2 was of interest and the data contained S^2 and n - 1 for each row. The model converged without any problems, but whenever I apply the loo function, I get NaN for the relative efficiency of 3 observations, so I am unable to compute loo. Note that I am trying to compare this model with a model where I don't estimate \sigma^2 but instead use the standard error directly, i.e. \bar{y} \sim N(\mu, \sigma_{\bar{y}}).
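The two sampling-distribution facts the model relies on, \bar{y} \sim N(\mu, \sigma/\sqrt{n}) and (n - 1)S^2/\sigma^2 \sim \chi^2_{n - 1}, can be checked by simulation. This is a sketch with arbitrary values for mu, sigma, n, and the number of replications:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 2.0, 0.4, 8
reps = 200_000

y = rng.normal(mu, sigma, size=(reps, n))
ybar = y.mean(axis=1)
S2 = y.var(axis=1, ddof=1)   # sample variance with n - 1 in the denominator

# ybar ~ N(mu, sigma / sqrt(n)): check the mean and standard deviation
assert abs(ybar.mean() - mu) < 0.01
assert abs(ybar.std() - sigma / np.sqrt(n)) < 0.01

# (n - 1) S^2 / sigma^2 ~ chi^2(n - 1): mean n - 1, variance 2(n - 1)
b2 = (n - 1) * S2 / sigma**2
assert abs(b2.mean() - (n - 1)) < 0.05
assert abs(b2.var() - 2 * (n - 1)) < 0.2
```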


What does the monitor() function tell you about log_lik for these observations?

Almost all of the log_lik values are large and negative. The SD values are also large.

I should have been more specific. What are the values for Rhat, Bulk-ESS and Tail-ESS (or n_eff in the old versions) for log_lik?

All Rhat values = 1, min(Bulk_ESS) = 1822, min(Tail_ESS) = 2340

Can you provide the log_lik draws for those three problematic cases?

                   Q5       Q50      Q95      Mean      SD  Rhat  Bulk_ESS  Tail_ESS
log_lik[8]   -37007.5  -34797.9 -32695.3 -34803.2  1310.8     1      2243      2734
log_lik[13]   -1257.7   -1117.3   -975.5  -1116.0    85.9     1      3249      3055
log_lik[71]  -68937.2  -66907.2 -64785.4 -66890.2  1266.5     1      3246      2914

I am actually starting to think that something is wrong with my implementation of the model, or am I mistaken there?
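One plausible mechanism for the NaN, given log_lik draws of this magnitude: the loo package's recommended workflow computes r_eff = relative_eff(exp(log_lik)), and exponentiating numbers near -35000 underflows to exactly zero in double precision, leaving a constant chain whose effective-sample-size calculation produces NaN. A minimal sketch (the r_eff line only mimics, and does not reproduce, the actual loo internals):

```python
import math
import numpy as np

# Posterior draws of log_lik on the scale reported above (around -3.5e4).
log_lik = np.array([-37007.5, -34797.9, -34803.2, -32695.3])

# In double precision, exp() underflows to exactly zero below roughly -745.
assert math.exp(-700) > 0.0
assert math.exp(-800) == 0.0

# So exp(log_lik) is a vector of exact zeros ...
lik = np.exp(log_lik)
assert np.all(lik == 0.0)

# ... and an ESS-style calculation on a constant chain effectively divides
# zero variance by zero, yielding NaN. (Illustrative, not the loo code.)
var = lik.var()
r_eff = var / var if var > 0 else float("nan")
assert math.isnan(r_eff)
```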

If you send me the posterior draws for log_lik[8], log_lik[13], and log_lik[71] (not just the summary), I can test myself what happens in loo.

I will do that by tomorrow morning at the latest.

It is not easy for me to save the posterior draws in a text file that I can easily upload here. I did save them in a text file, but it looks confusing to me. Would it be possible to send you an RData file containing only those draws?


Hello @avehtari, I figured out the problem and fixed it. The likelihood as computed in the code isn't the right likelihood. The estimate for \sigma^2 was blown up, which I think resulted in the NaN values computed by the loo package. I will share the full corrected code later. It took quite a while to figure out.
