Multivariate prior for hierarchical model; missing something?

This is probably a straightforward question, but it has frustrated me for the last two days, and maybe I am just missing an obvious error. I want to estimate polling errors for past US elections from polling data, with errors correlated across states. Elections are indexed by I, states by S, and xi is the correlated component. I followed the instructions in the manual, and I have done this before, but I get the "Log probability evaluates to log(0), i.e. negative infinity." error. Any thoughts on where the issue is?

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
  vector<lower = 0>[S] sigma;
  matrix[S, I] raw_xi;
  cholesky_factor_corr[S] L_Omega;
}
transformed parameters {
  matrix[S, I] xi;
  xi = (diag_pre_multiply(sigma, L_Omega) * raw_xi);
}
model {
  vector[N] mu;
  L_Omega ~ lkj_corr_cholesky(1);
  sigma ~ student_t(nu, 0, 0.2);
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  to_vector(raw_xi) ~ std_normal();
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]] + xi[s[jj], i[jj]];
  }
  y ~ binomial_logit(n, mu);
}
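As a side check (not related to the error itself): the non-centered construction in the transformed parameters block should give each column of xi the covariance diag(sigma) * Omega * diag(sigma). Here's a quick numpy sketch of that identity with made-up sigma values and a random correlation matrix, just to confirm the `diag_pre_multiply` pattern does what's intended:

```python
import numpy as np

rng = np.random.default_rng(0)
S, I = 4, 20000
sigma = np.array([0.1, 0.2, 0.3, 0.4])

# build an arbitrary valid correlation matrix and its Cholesky factor
A = rng.normal(size=(S, S))
cov = A @ A.T
d = np.sqrt(np.diag(cov))
Omega = cov / np.outer(d, d)
L = np.linalg.cholesky(Omega)

# the non-centered construction: xi = diag_pre_multiply(sigma, L_Omega) * raw_xi
raw_xi = rng.normal(size=(S, I))
xi = (np.diag(sigma) @ L) @ raw_xi

emp = np.cov(xi)                          # empirical covariance across columns
target = np.outer(sigma, sigma) * Omega   # diag(sigma) Omega diag(sigma)
print(np.allclose(emp, target, atol=0.02))  # True
```

So the parameterization itself is the standard one; the error has to be coming from somewhere else.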

Ooof, not obvious. I’d start with the commenting-out technique.

For instance see if this runs:

for (jj in 1:N) {
  mu[jj] = 0.0;
}

If it does, track back from there.

Maybe also check the input constraints. For example:

int<lower=1> n[N];
int<lower=1, upper=n> y[N]; // I'm actually not sure this will work, but try
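Along the same lines, you can sanity-check the inputs before they ever reach Stan. A hedged sketch with toy stand-in arrays (your real n, y, outcome go here); the point is that y > n, or outcome values of exactly 0 or 1, will poison the log probability, since logit(outcome) hits ±infinity:

```python
import numpy as np

# toy stand-ins for the data block; swap in your actual arrays
n = np.array([500, 800, 650])
y = np.array([260, 410, 700])          # last entry deliberately exceeds n
outcome = np.array([0.52, 0.49, 1.0])  # logit(1.0) = +inf

print(np.all(n >= 1))                           # True
print(np.all((0 <= y) & (y <= n)))              # False: y[2] > n[2]
print(np.all((0.0 < outcome) & (outcome < 1)))  # False: logit would be infinite
```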

I should have added: the model below, without the multivariate part, works just fine.

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
  vector<lower = 0>[S] sigma;
  matrix[S, I] xi;
  cholesky_factor_corr[S] L_Omega;
}
model {
  vector[N] mu;
  L_Omega ~ lkj_corr_cholesky(1);
  sigma ~ student_t(nu, 0, 0.2);
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  to_vector(xi) ~ std_normal();
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]] + xi[s[jj], i[jj]];
  }
  y ~ binomial_logit(n, mu);
}
sigma ~ student_t(nu, 0, 0.2);

Are those nus positive?
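For what it's worth, a Student-t with non-positive degrees of freedom isn't a distribution, and Stan would reject it on the first evaluation. A quick illustration using scipy's t density as a stand-in (assuming scipy is available):

```python
import math
from scipy.stats import t

print(t.logpdf(0.1, df=3))   # finite: valid degrees of freedom
print(t.logpdf(0.1, df=-1))  # nan: nu <= 0 is not a valid Student-t
```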

Otherwise, keep going with the commenting-out technique.

See if this works:

for (jj in 1:N){
  mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]];
}

All the nus are positive integers.

The model below does not work:

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
  vector<lower = 0>[S] sigma;
  matrix[S, I] raw_xi;
  cholesky_factor_corr[S] L_Omega;
}
transformed parameters {
  matrix[S, I] xi;
  xi = (diag_pre_multiply(sigma, L_Omega) * raw_xi);
}
model {
  vector[N] mu;
  L_Omega ~ lkj_corr_cholesky(1);
  sigma ~ student_t(nu, 0, 0.2);
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  to_vector(raw_xi) ~ std_normal();
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]];
  }
  y ~ binomial_logit(n, mu);
}

This works just fine:

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
}

model {
  vector[N] mu;
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]];
  }
  y ~ binomial_logit(n, mu);
}

Hmm, I guess comment out the ~ statements one at a time until you find the culprit.

I mean it has to be the xis, right?

Since you deleted the xis from the mu loop, I kinda don’t think it is.

I’m suspicious about L_Omega ~ lkj_corr_cholesky(1);, but you’ll just need to comment out each of those one at a time.

Leave all the other calcs in there. For the purposes of getting the model to evaluate, it won’t matter if the posterior is improper or anything.

Commenting out L_Omega makes it work. But isn’t this the right way of specifying the correlation matrix plus prior?

data {
  int N;
  int I;
  int S;
  vector[S] nu;
  int i[N];
  int s[N];
  int y[N];
  int n[N];
  matrix[S, I] outcome;
}
transformed data {
  matrix[S, I] logit_outcome;
  logit_outcome = logit(outcome);
}
parameters {
  real alpha;
  vector[S] beta;
  vector<lower = 0>[S] sigma;
  matrix[S, I] raw_xi;
  //cholesky_factor_corr[S] L_Omega;
}
transformed parameters {
  matrix[S, I] xi;
  //xi = (diag_pre_multiply(sigma, L_Omega) * raw_xi);
}
model {
  vector[N] mu;
  //L_Omega ~ lkj_corr_cholesky(1);
  sigma ~ student_t(nu, 0, 0.2);
  alpha ~ normal(0, 0.2);
  beta ~ normal(0, 0.2);
  to_vector(raw_xi) ~ std_normal();
  for (jj in 1:N){
    mu[jj] = logit_outcome[s[jj], i[jj]] + alpha + beta[s[jj]];
  }
  y ~ binomial_logit(n, mu);
}

Try L_Omega ~ lkj_corr_cholesky(1.0);

Looks right to me. Sometimes there can be integer/double problems.

Also does it work if the only thing you comment out is:

//L_Omega ~ lkj_corr_cholesky(1);

The rest of the stuff should stay.

Just commenting out the prior doesn’t fix it. Running the other thing.

1.0 in the prior also fails.


Ooo, okay, then I am worried about:

xi = (diag_pre_multiply(sigma, L_Omega) * raw_xi);

How about just:

matrix[S, I] xi;
xi = raw_xi;

So init = 0.1 works.

Random unconstrained inits in (-2, 2) don’t work, and setting init = 0 also doesn’t (underflow).
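Here's one plausible mechanism, consistent with init = 0.1 fixing it. Stan builds the Cholesky factor from tanh-transformed unconstrained values; with the default inits drawn uniformly from (-2, 2) and a large S, the diagonal entries (each a product of sqrt(1 - z^2) terms) shrink so fast that the determinant, and with it terms in the log density, underflow to zero. A numpy sketch of that transform (this mirrors Stan's cholesky_factor_corr transform as I understand it; treat it as an illustration, not a reference implementation):

```python
import numpy as np

def corr_cholesky(x, S):
    """Map unconstrained values x to an S x S Cholesky factor of a
    correlation matrix via tanh'd canonical partial correlations."""
    z = np.tanh(x)
    L = np.zeros((S, S))
    L[0, 0] = 1.0
    k = 0
    for i in range(1, S):
        rem = 1.0  # remaining squared mass: product of (1 - z^2) so far
        for j in range(i):
            L[i, j] = z[k] * np.sqrt(rem)
            rem *= 1.0 - z[k] ** 2
            k += 1
        L[i, i] = np.sqrt(rem)  # rows have unit norm by construction
    return L

rng = np.random.default_rng(1)
S = 100                      # a large correlation matrix, e.g. many states
m = S * (S - 1) // 2

L_wide = corr_cholesky(rng.uniform(-2, 2, size=m), S)      # Stan's default init range
L_small = corr_cholesky(rng.uniform(-0.1, 0.1, size=m), S) # like init = 0.1

logdet_wide = np.sum(np.log(np.diag(L_wide)))
logdet_small = np.sum(np.log(np.diag(L_small)))
print(logdet_wide, np.exp(logdet_wide))  # hugely negative; exp underflows to 0.0
print(logdet_small)                      # modest; no underflow
```

If that's the mechanism, an explicit small init (or a lower-dimensional correlation structure) is the standard fix, which matches what you're seeing.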

Oooo, dang good find

Thanks for having a look at it!

Aye aye, happy to give misleading advice :P