User-defined probability functions with flexible dimensionality of input data

frank_hezemans · January 29, 2022, 10:20pm

I’m writing my own probability function, and I would like this function to be flexible with regard to the dimensionality of the data input argument. Basically, I want the function to mimic the behavior of Stan’s built-in probability functions, where “the pseudotype reals is used to indicate that an argument position may be vectorized”.

In the model block, it would be convenient if my function returns the summed log probability when given a 1-dimensional array of reals as input. However, in the generated quantities block I would like to loop over the array to obtain the log probability of each individual input value.

As a concrete example, consider the following implementation of the inverse Gaussian (a.k.a. Wald) distribution:

functions {
  /**
   * Log density function of the inverse Gaussian (a.k.a. Wald) distribution.
   * 
   * @param y         The input value(s) whose probability needs to be evaluated.
   * @param mu        Mean of the inverse Gaussian distribution.
   * @param lambda    Shape of the inverse Gaussian distribution.
   *
   * @return The log of the inverse Gaussian density of y, given mean mu and shape lambda.
   *
   * @throws If the input value or any of the model parameters are not positive.
   *
   */
  real wald_lpdf(data real[] y, real mu, real lambda) {
    
    // initialise local variables
    vector[num_elements(y)] y_vector;
    real term1; // constant term of log density function (does not depend on input y)
    vector[num_elements(y)] term2; // term of log density function that depends on input y
    vector[num_elements(y)] prob; // density of input y, given model parameters
    
    // check that the model parameters have legal values
    if (is_nan(mu) || is_inf(mu) || mu <= 0) {
      reject("wald_lpdf: mu must be finite and positive; ",
             "found mu = ", mu);
    }
    if (is_nan(lambda) || is_inf(lambda) || lambda <= 0) {
      reject("wald_lpdf: lambda must be finite and positive; ",
             "found lambda = ", lambda);
    }
    
    // convert the one-dimensional array y to a vector, to enable vectorised operations
    y_vector = to_vector(y);
    
    // calculate the probability of input y, given model parameters
    term1 = 0.5 * log(lambda / (2 * pi()));
    term2 = square(y_vector - mu) ./ (y_vector * square(mu));
      
    prob = exp(term1 - 1.5 * log(y_vector) - 0.5 * lambda * term2);
    
    return sum(log(prob));
    
  }
}

data {
  int N;
  real<lower=0> y[N];
}

parameters {
  real<lower=0> mu;
  real<lower=0> lambda;
}

model {
  y ~ wald_lpdf(mu, lambda);
}

generated quantities {
  vector[N] log_lik;
  for (i in 1:N) {
    log_lik[i] = wald_lpdf(y[i] | mu, lambda); // this won't work
  }
}

Is there a way to write the _lpdf function so that it can flexibly handle both real[] y as input (as above in the model block) and real y as input (as above in the generated quantities block)?

Many thanks in advance for your suggestions!

andrjohns · January 30, 2022, 8:30am

Unfortunately with Stan’s strong typing, it’s not possible to define a function as being agnostic between real and real[].

What you could do instead is to pass a one-element array instead of a real for the generated quantities block:

  for (i in 1:N) {
    log_lik[i] = wald_lpdf({y[i]} | mu, lambda);
  }

Topic		Replies	Views
Require real return type for probability functions error Modeling rstan , techniques , specification	4	910	May 29, 2020
Inverse Lognormal CDF Developers	2	344	September 30, 2023
Generating random values from a user-defined distribution Modeling	6	547	July 29, 2019
Outputting lpdf functions as vectors instead of reals Developers	6	934	January 3, 2022
Testing user defined probability function implementations in Stan generated quantities Modeling	3	682	August 23, 2023

User-defined probability functions with flexible dimensionality of input data

Related topics