Computing values on the fly vs a vectorized form gives different results

Unless I’m introducing a bug, I’m getting different results when I use a vectorized form vs. computing the values on the fly (which is much slower).

The following is the on-the-fly version:

model {
  real sum_log_prob_tilde;
  vector[N_margin + 1] log_prob_margin;
 

  mu ~ normal(0, 5);
  tau ~ normal(0, 2);
  //sigma ~ normal(0, 2);
  Lambda ~ gamma(10, 0.06);

  // Measurement model for observed objects
  x_latent ~ lognormal(mu, tau);
  x_obs ~ lognormal(log(x_latent), sigma);

  //for (n in 1:N)
  target += log_p_det(x_obs, boundary);

  // (Distinguishable) Poisson process model
  target += N * log(Lambda);
  target += - Lambda;

  // Measurement model for auxiliary objects

  x_tilde_latent ~ lognormal(mu, tau);
  target += lognormal_lpdf(x_tilde | 0, 1);

  log_prob_margin[1] = 0;
  sum_log_prob_tilde = 0;

  // Here is where I compute the value of sum_log_prob_tilde on the fly
  for (n in 1:N_margin) {
    sum_log_prob_tilde += log1m_p_det(x_tilde[n], boundary)
      + lognormal_lpdf(x_tilde[n] | log(x_tilde_latent[n]), sigma)
      - lognormal_lpdf(x_tilde[n] | 0, 1);

    log_prob_margin[n + 1] = n * log(Lambda) - lgamma(n + 1)
      + sum_log_prob_tilde;
  }

  target += log_sum_exp(log_prob_margin);
}
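To spell out what the loop computes: writing $f$ for the lognormal density and $p_\mathrm{det}$ for the detection probability, entry $n + 1$ of log_prob_margin is

$$n \log \Lambda - \log n! + \sum_{k=1}^{n} \Big[ \log\big(1 - p_\mathrm{det}(\tilde{x}_k)\big) + \log f\big(\tilde{x}_k \mid \log \tilde{x}^{\mathrm{latent}}_k, \sigma\big) - \log f\big(\tilde{x}_k \mid 0, 1\big) \Big],$$

with entry 1 covering the $n = 0$ case, and the final log_sum_exp marginalizes over $n = 0, \dots, N_{\mathrm{margin}}$.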

This model produces the “right” answer from my simulations.

Now, to speed this up, I can precompute the arguments for sum_log_prob_tilde with a vectorized statement and then add them on in the loop. The run goes much faster, but the result is wrong, so something must not be working out correctly.

The changes are above and inside the loop:

model {
  real sum_log_prob_tilde;
  vector[N_margin + 1] log_prob_margin;
  vector[N_margin] sum_arguments;

  mu ~ normal(0, 5);
  tau ~ normal(0, 2);
  //sigma ~ normal(0, 2);
  Lambda ~ gamma(10, 0.06);

  // Measurement model for observed objects
  x_latent ~ lognormal(mu, tau);
  x_obs ~ lognormal(log(x_latent), sigma);

  //for (n in 1:N)
  target += log_p_det(x_obs, boundary);

  // (Distinguishable) Poisson process model
  target += N * log(Lambda);
  target += - Lambda;

  // Measurement model for auxiliary objects

  x_tilde_latent ~ lognormal(mu, tau);
  target += lognormal_lpdf(x_tilde | 0, 1);

  log_prob_margin[1] = 0;
  sum_log_prob_tilde = 0;

  // precompute the sum arguments
  sum_arguments = log1m_p_det_vec(x_tilde, boundary)
    + lognormal_lpdf(x_tilde | log(x_tilde_latent), sigma)
    - lognormal_lpdf(x_tilde | 0, 1);

  for (n in 1:N_margin) {
    // add on the sum arguments at each iteration
    sum_log_prob_tilde += sum_arguments[n];

    log_prob_margin[n + 1] = n * log(Lambda) - lgamma(n + 1)
      + sum_log_prob_tilde;
  }

  target += log_sum_exp(log_prob_margin);
}

Am I just missing something really dumb? The two codes look equivalent.

Cheers.

Ahh, yeah, it looks the same, but the _lpdf functions don’t vectorize like that. For a good explanation of what’s going on, check out “13.5. Vectorizing Mixtures” in the 2.17 manual.

This statement (from the vectorized code):

lognormal_lpdf(x_tilde | log(x_tilde_latent), sigma)

will just return a scalar, $\log\big(\prod_i \mathrm{lognormal}(\tilde{x}_i \mid \log \hat{x}_i, \sigma)\big)$, and then that scalar gets broadcast to a vector via the addition with log1m_p_det_vec (which presumably returns a vector). You can use print statements in your code to verify that this is what’s happening.
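For example, something like this (a quick sketch, using the names from your code and placed right after sum_arguments is computed) should print a single number for the lpdf term but a length-N_margin vector for sum_arguments:

print("lpdf term: ", lognormal_lpdf(x_tilde | log(x_tilde_latent), sigma));
print("sum_arguments: ", sum_arguments);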

Nuts… I was one version behind on the manual. Thanks! It was indeed not computing what I thought.

I will just be stuck with the slow version for the moment.
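
Though, thinking about it some more, the element-wise terms can probably be recovered by expanding the two lognormal_lpdf calls by hand: the $-\log \tilde{x}_n$ and $-\tfrac{1}{2}\log 2\pi$ pieces cancel between the two densities. A sketch I haven’t verified (it assumes x_tilde and x_tilde_latent are vectors and that log1m_p_det_vec returns a vector, so check it against the slow loop first):

// element-wise version of
//   lognormal_lpdf(x_tilde[n] | log(x_tilde_latent[n]), sigma)
//     - lognormal_lpdf(x_tilde[n] | 0, 1)
// the -log(x_tilde) and -0.5 * log(2 * pi()) terms cancel between the two
sum_arguments = log1m_p_det_vec(x_tilde, boundary)
  - log(sigma)
  - square(log(x_tilde) - log(x_tilde_latent)) / (2 * square(sigma))
  + square(log(x_tilde)) / 2;

// the running sum can then come from cumulative_sum
{
  vector[N_margin] cum_sum;
  cum_sum = cumulative_sum(sum_arguments);
  for (n in 1:N_margin)
    log_prob_margin[n + 1] = n * log(Lambda) - lgamma(n + 1) + cum_sum[n];
}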