Vectorization in custom families

I have coding question related to writing custom distribution families.

I’m working on a custom family for the Negative Binomial Distribution Type P. I’m using the lognormal_natural.R (written by @StaffanBetner ) as a guide since I’m in personally uncharted territory . Within the code for log_lik_lognormal_natural()

log_lik_lognormal_natural <- function(i, prep) {
  mu <- prep$dpars$mu[, i]
  if(NCOL(prep$dpars$sigma)==1){
    sigma <- prep$dpars$sigma
  } else {
    sigma <- prep$dpars$sigma[, i]   ## [, i] if sigma is modelled, without otherwise
  }
  y <- prep$data$Y[i]
  common_term = log(1+sigma^2/mu^2)
  Vectorize(dlnorm)(y, log(mu)-common_term/2, sqrt(common_term), log = TRUE)
}

we see the Vectorize() function is used at the very end. However, I’m confused as to what this is actually doing since

  1. it seems like only scalars are being passed to the function dlnorm().
  2. Why vectorize an child function each time you call it in the parent function. Wouldn’t make more sense to vectorize log_lik_lognormal

If someone could clarify my confusion, I’d greatly appreciate it.

Thanks,

Mike

  • Operating System: Ubuntu 22.04
  • brms Version: 2.19
  • R Version: 4.3

edited by @jsocolar for syntax highlighting.

Probably a mistake from my side. It should be rewritten. I am a bit unsure how.

Hopefully someone more knowledgeable will pipe in.

R’s dlnorm is already vectorized, so there is no need for a call to Vectorize here. If dlnorm weren’t vectorized, then what you’d want to do is define some new function like v_dlnorm <- Vectorize(dlnorm) and use that in place of dlnorm in the final line of the log_lik function.

To follow up,

  1. would vectorizing the log_lik_lognormal_natural() function (i.e. use log_lik_lognormal_natural <- function(...){...} %>% Vectorize()) help or actually do anything?
  2. Same as above, but what about if the density function consisted of basic numerical funcdtions like +, *, exp(), and the sort?

This would do something, but it wouldn’t do anything functional because brms is never going to call that function vectorized over its inputs. This is something you can confirm pretty easily for yourself: add an assertion that the length of i is one inside your function, and then use it for post-processing with brms. The returning function inside the log_lik function needs to be vectorized because it needs to vectorize over the rows of the data. But the outer function is not assumed to be vectorized over the posterior iterations i.