Multivariate prior for hierarchical model; missing something?

This is probably a straightforward question, but it has frustrated me for the last two days, and maybe I am just missing an obvious error. I want to estimate polling errors for past US elections from polling data, with errors correlated across states. Elections are indexed by I, states by S, and xi is the correlated component. I followed the instructions in the manual, and I have done this before, but I get the "Log probability evaluates to log(0), i.e. negative infinity." error. Any thoughts on where the issue is?

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
  vector<lower = 0>[S] sigma;
  matrix[S, I] raw_xi;
  cholesky_factor_corr[S] L_Omega;
}
transformed parameters {
  matrix[S, I] xi;
  xi = (diag_pre_multiply(sigma, L_Omega) * raw_xi);
}
model {
  vector[N] mu;
  L_Omega ~ lkj_corr_cholesky(1);
  sigma ~ student_t(nu, 0, 0.2);
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  to_vector(raw_xi) ~ std_normal();
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]] + xi[s[jj], i[jj]];
  }
  y ~ binomial_logit(n, mu);
}
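As a side check (not related to the error itself): the non-centered construction in the transformed parameters block should give each column of xi the covariance diag(sigma) * Omega * diag(sigma). Here's a quick numpy sketch of that identity with made-up sigma values and a random correlation matrix, just to confirm the `diag_pre_multiply` pattern does what's intended:

```python
import numpy as np

rng = np.random.default_rng(0)
S, I = 4, 20000
sigma = np.array([0.1, 0.2, 0.3, 0.4])

# build an arbitrary valid correlation matrix and its Cholesky factor
A = rng.normal(size=(S, S))
cov = A @ A.T
d = np.sqrt(np.diag(cov))
Omega = cov / np.outer(d, d)
L = np.linalg.cholesky(Omega)

# the non-centered construction: xi = diag_pre_multiply(sigma, L_Omega) * raw_xi
raw_xi = rng.normal(size=(S, I))
xi = (np.diag(sigma) @ L) @ raw_xi

emp = np.cov(xi)                          # empirical covariance across columns
target = np.outer(sigma, sigma) * Omega   # diag(sigma) Omega diag(sigma)
print(np.allclose(emp, target, atol=0.02))  # True
```

So the parameterization itself is the standard one; the error has to be coming from somewhere else.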

Ooof, not obvious. I’d start with the commenting-out technique.

For instance see if this runs:

for (jj in 1:N) {
  mu[jj] = 0.0;
}

If it does, track back from there.

Maybe also check the input constraints. For example:

int<lower=1> n[N];
int<lower=1, upper=n> y[N]; // I'm actually not sure this will work, but try
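Along the same lines, you can sanity-check the inputs before they ever reach Stan. A hedged sketch with toy stand-in arrays (your real n, y, outcome go here); the point is that y > n, or outcome values of exactly 0 or 1, will poison the log probability, since logit(outcome) hits ±infinity:

```python
import numpy as np

# toy stand-ins for the data block; swap in your actual arrays
n = np.array([500, 800, 650])
y = np.array([260, 410, 700])          # last entry deliberately exceeds n
outcome = np.array([0.52, 0.49, 1.0])  # logit(1.0) = +inf

print(np.all(n >= 1))                           # True
print(np.all((0 <= y) & (y <= n)))              # False: y[2] > n[2]
print(np.all((0.0 < outcome) & (outcome < 1)))  # False: logit would be infinite
```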

I should have added: the model below, without the multivariate part, works just fine.

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
  vector<lower = 0>[S] sigma;
  matrix[S, I] xi;
  cholesky_factor_corr[S] L_Omega;
}
model {
  vector[N] mu;
  L_Omega ~ lkj_corr_cholesky(1);
  sigma ~ student_t(nu, 0, 0.2);
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  to_vector(xi) ~ std_normal();
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]] + xi[s[jj], i[jj]];
  }
  y ~ binomial_logit(n, mu);
}
sigma ~ student_t(nu, 0, 0.2);

Are those nus positive?
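For what it's worth, a Student-t with non-positive degrees of freedom isn't a distribution, and Stan would reject it on the first evaluation. A quick illustration using scipy's t density as a stand-in (assuming scipy is available):

```python
import math
from scipy.stats import t

print(t.logpdf(0.1, df=3))   # finite: valid degrees of freedom
print(t.logpdf(0.1, df=-1))  # nan: nu <= 0 is not a valid Student-t
```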

Otherwise, keep going with the commenting-out technique.

See if this works:

for (jj in 1:N){
  mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]];
}

All the nus are positive integers.

The model below does not work:

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
  vector<lower = 0>[S] sigma;
  matrix[S, I] raw_xi;
  cholesky_factor_corr[S] L_Omega;
}
transformed parameters {
  matrix[S, I] xi;
  xi = (diag_pre_multiply(sigma, L_Omega) * raw_xi);
}
model {
  vector[N] mu;
  L_Omega ~ lkj_corr_cholesky(1);
  sigma ~ student_t(nu, 0, 0.2);
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  to_vector(raw_xi) ~ std_normal();
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]];
  }
  y ~ binomial_logit(n, mu);
}

This works just fine:

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
}

model {
  vector[N] mu;
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]];
  }
  y ~ binomial_logit(n, mu);
}

Hmm, I guess comment out the ~ statements one at a time until you find the culprit.

I mean it has to be the xis, right?

Since you deleted the xis from the mu loop, I kinda don’t think it is.

I’m suspicious about L_Omega ~ lkj_corr_cholesky(1);, but you’ll just need to comment out each of those one at a time.

Leave all the other calcs in there. For the purposes of getting the model to evaluate, it won’t matter if the posterior is improper or anything.

Commenting out L_Omega makes it work. But isn’t this the right way of specifying the correlation matrix plus prior?

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
  vector<lower = 0>[S] sigma;
  matrix[S, I] raw_xi;
  //cholesky_factor_corr[S] L_Omega;
}
transformed parameters {
  matrix[S, I] xi;
  //xi = (diag_pre_multiply(sigma, L_Omega) * raw_xi);
}
model {
  vector[N] mu;
  //L_Omega ~ lkj_corr_cholesky(1);
  sigma ~ student_t(nu, 0, 0.2);
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  to_vector(raw_xi) ~ std_normal();
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]];
  }
  y ~ binomial_logit(n, mu);
}

Try L_Omega ~ lkj_corr_cholesky(1.0);

Looks right to me. Sometimes there can be integer/double problems.

Also does it work if the only thing you comment out is:

//L_Omega ~ lkj_corr_cholesky(1);

The rest of the stuff should stay.

Just commenting out the prior doesn’t fix it. Running the other thing.

1.0 in the prior also fails.


Ooo, okay, then I am worried about:

xi = (diag_pre_multiply(sigma, L_Omega) * raw_xi);

How about just:

matrix[S, I] xi;
xi = raw_xi;

So init = 0.1 works.

Random unconstrained inits in (-2, 2) don’t work, and setting init = 0 also doesn’t (underflow).
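Here's one plausible mechanism, consistent with init = 0.1 fixing it. Stan builds the Cholesky factor from tanh-transformed unconstrained values; with the default inits drawn uniformly from (-2, 2) and a large S, the diagonal entries (each a product of sqrt(1 - z^2) terms) shrink so fast that the determinant, and with it terms in the log density, underflow to zero. A numpy sketch of that transform (this mirrors Stan's cholesky_factor_corr transform as I understand it; treat it as an illustration, not a reference implementation):

```python
import numpy as np

def corr_cholesky(x, S):
    """Map unconstrained values x to an S x S Cholesky factor of a
    correlation matrix via tanh'd canonical partial correlations."""
    z = np.tanh(x)
    L = np.zeros((S, S))
    L[0, 0] = 1.0
    k = 0
    for i in range(1, S):
        rem = 1.0  # remaining squared mass: product of (1 - z^2) so far
        for j in range(i):
            L[i, j] = z[k] * np.sqrt(rem)
            rem *= 1.0 - z[k] ** 2
            k += 1
        L[i, i] = np.sqrt(rem)  # rows have unit norm by construction
    return L

rng = np.random.default_rng(1)
S = 100                      # a large correlation matrix, e.g. many states
m = S * (S - 1) // 2

L_wide = corr_cholesky(rng.uniform(-2, 2, size=m), S)      # Stan's default init range
L_small = corr_cholesky(rng.uniform(-0.1, 0.1, size=m), S) # like init = 0.1

logdet_wide = np.sum(np.log(np.diag(L_wide)))
logdet_small = np.sum(np.log(np.diag(L_small)))
print(logdet_wide, np.exp(logdet_wide))  # hugely negative; exp underflows to 0.0
print(logdet_small)                      # modest; no underflow
```

If that's the mechanism, an explicit small init (or a lower-dimensional correlation structure) is the standard fix, which matches what you're seeing.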

Oooo, dang good find

Thanks for having a look at it!

Aye aye, happy to give misleading advice :P