Exception: bernoulli_logit_lpmf: Logit transformed probability parameter is nan, but must not be nan!

data {
  int<lower=2> K; // num topics
  int<lower=2> V; // num words
  int<lower=1> M; // num docs or proposals
  int<lower=1> N; // total word instances
  int<lower=1, upper=V> w[N]; // word n
  int<lower=1, upper=M> doc[N]; // doc ID for word n
  vector<lower=0>[K] alpha;
  vector<lower=0>[V] beta;
  
  int<lower=1> J; //num of Senators
  int<lower=1> N_obs;
  int<lower=1, upper=J> j[N_obs]; // Senator for observation n_obs
  int<lower=1, upper=M> m[N_obs]; // proposal for observation n_obs
  int<lower=0, upper=1> y[N_obs]; // vote of observation n_obs
}

parameters {
  simplex[K] theta[M]; // topic dist for doc m
  simplex[V] phi[K]; // word dist for topic k
  real alpha_param[M];
  real beta_param[M];
  matrix[J, K] theta_comp;
  real<lower=0, upper=1> rho;
}

transformed parameters {
  matrix[K, K] Sigma;
  
  for (k1 in 1:K) {
    for (k2 in 1:K) {
      if (k1 == k2) {
        Sigma[k1, k2] = 1;
      } else {
        Sigma[k1, k2] = rho;
      }
    }
  }
  
}

model {
  for (m_i in 1:M) {
    theta[m_i] ~ dirichlet(alpha);
  }
  for (k in 1:K) {
    phi[k] ~ dirichlet(beta);
  }
  for (n in 1:N) {
    real gamma[K];
    for (k in 1:K) {
      gamma[k] = log(theta[doc[n], k]) + log(phi[k, w[n]]);
    }

    target += log_sum_exp(gamma); // likelihood;
  }
  
  alpha_param ~ normal(0, 4);
  beta_param ~ normal(0, 4);
  for (j_i in 1:J) {
        rho ~ uniform(0, 1);
        theta_comp[j_i] ~ multi_normal(rep_vector(0, K), Sigma);
  }
  
  for (n_obs in 1:N_obs) {
    real theta_param;
    for (k in 1:K) {
      theta_param += theta_comp[j[n_obs], k]*theta[m[n_obs], k];
    }
    y[n_obs] ~ bernoulli_logit(theta_param * beta_param[m[n_obs]] - alpha_param[m[n_obs]]);
  }
}

Hi. I am trying to implement Lauderdale and Clark (2014)'s (link here) multidimensional ideal point estimation, which combines latent Dirichlet allocation (LDA) with item response theory (IRT). To do that, I started from existing LDA (link here) and IRT (link here) rstan code and modified the relevant parts to follow Lauderdale and Clark's original article.

But when I try to run this rstan code with my data, it keeps producing the following output:

Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: bernoulli_logit_lpmf: Logit transformed probability parameter is nan, but must not be nan! (in ‘model810834e6348_4945c94bbacc08c2e5c812d9865b3c3c’ at line 72)

Do you have any idea how to fix this issue? Thank you in advance, and please let me know if there is anything I should clarify.

theta_param is declared without a value, so it is initialized to not_a_number(), and then doing theta_param += ... just keeps it at not_a_number(). My guess is that you meant to initialize it to zero, as in

real theta_param = 0;
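
For context, here is a minimal sketch of how the last loop of the model block could look with that change. It uses the same variable names as the posted model. The dot_product form mentioned in the comment is only a suggestion for an equivalent way to write the sum, not something from the original model.

```stan
  for (n_obs in 1:N_obs) {
    // Local variables start out as NaN in Stan, so the running sum
    // has to be set to zero before anything is added to it.
    real theta_param = 0;
    for (k in 1:K) {
      theta_param += theta_comp[j[n_obs], k] * theta[m[n_obs], k];
    }
    // Assumed equivalent without the inner loop:
    //   real theta_param = dot_product(theta_comp[j[n_obs]], theta[m[n_obs]]);
    y[n_obs] ~ bernoulli_logit(theta_param * beta_param[m[n_obs]] - alpha_param[m[n_obs]]);
  }
```

Because the declaration stays inside the outer loop, the sum is reset to zero for every observation.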

Thanks, Ben. Your solution works! But another problem occurs:

SAMPLING FOR MODEL ‘6c5aae3e61f1b35f037d068e566816b9’ NOW (CHAIN 1).
Chain 1: Exception: std::bad_alloc (in ‘modelb6707bfb29bd_6c5aae3e61f1b35f037d068e566816b9’ at line 53)
[origin: bad_alloc]
[1] “Error in sampler$call_sampler(args_list[[i]]) : "
[2] " Exception: std::bad_alloc (in ‘modelb6707bfb29bd_6c5aae3e61f1b35f037d068e566816b9’ at line 53)”
[3] " [origin: bad_alloc]"
[1] “error occurred during calling the sampler; sampling not done”

Based on “bad_alloc”, it seems to be a memory issue, but I am not sure how to deal with it. Any ideas?

Either you ran out of RAM or something seriously messed up happened.
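
If it is RAM, one thing that might help is to take the logs of theta and phi once per iteration instead of once per word and topic. This is an untested sketch, not a confirmed fix for this model. It replaces roughly 2*N*K calls to log() with M*K + K*V of them, which should shrink what Stan has to store for autodiff when N is much larger than M and V. The names log_theta and log_phi are made up for illustration. Everything else matches the posted model and uses the same array syntax.

```stan
  // Possible replacement for the word-likelihood loop in the model block.
  {
    vector[K] log_theta[M]; // log topic proportions, one vector per doc
    vector[V] log_phi[K];   // log word probabilities, one vector per topic
    for (m_i in 1:M) {
      log_theta[m_i] = log(theta[m_i]);
    }
    for (k in 1:K) {
      log_phi[k] = log(phi[k]);
    }
    for (n in 1:N) {
      vector[K] gamma;
      for (k in 1:K) {
        // Reuse the precomputed logs instead of calling log() again.
        gamma[k] = log_theta[doc[n], k] + log_phi[k, w[n]];
      }
      target += log_sum_exp(gamma); // same likelihood as before
    }
  }
```

Running a single chain first, or fitting a subset of the documents, would also help tell whether it is really a memory limit or something else.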
