User defined functions: data qualifier

I was wondering whether adding or omitting the data qualifier in UDFs changes the generated C++ code, and in particular its performance?

The documentation says:

Declaring an argument data only allows type inference to proceed in the body of the function so that, for example, the variable may be used as a data-only argument to a built-in function.

But, once the Stan model is written and compiles, the type of all parameters is known, isn’t it? Does omitting the data qualifier then (potentially) cost performance?

Yep, the data qualifier determines whether the parameter is transpiled to a double or a var<double>, which then determines whether gradients need to be calculated during the function call. So a data real will be more performant than a real if you don’t need the gradients.

I don’t think the data qualifier makes a difference to performance.

real fn(real v, data real d, real v2) {
    real x = v + d;
    return x / v2;
}

generates C++

template <typename T0__, typename T2__,
          stan::require_all_t<stan::is_stan_scalar<T0__>,
                              stan::is_stan_scalar<T2__>>* = nullptr>
  stan::promote_args_t<T0__, T2__>
  fn(const T0__& v, const double& d, const T2__& v2, std::ostream* pstream__) {
    using local_scalar_t__ = stan::promote_args_t<T0__, T2__>;
    int current_statement__ = 0; 
    static constexpr bool propto__ = true;
    (void) propto__;
    local_scalar_t__ DUMMY_VAR__(std::numeric_limits<double>::quiet_NaN());
    (void) DUMMY_VAR__;  // suppress unused var warning
    try {
      local_scalar_t__ x = DUMMY_VAR__;
      current_statement__ = 3;
      x = (v + d);
      current_statement__ = 4;
      return (x / v2);
    } catch (const std::exception& e) {
      stan::lang::rethrow_located(e, locations_array__[current_statement__]);
    }
  }

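It doesn’t use var<double> explicitly. Instead, every argument without the data qualifier (v and v2 here) gets a template parameter that is deduced as either double or var<double> at the call site. Only the data argument d is fixed to const double&.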
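To make the deduction concrete, here is a minimal, self-contained sketch of the same signature shape. It does not use Stan Math: ToyVar is a hypothetical stand-in for the autodiff scalar, and std::common_type_t stands in for stan::promote_args_t, so treat it as an illustration of the template mechanics only, not of what stanc actually emits.

```cpp
#include <type_traits>

// Hypothetical stand-in for an autodiff scalar (NOT Stan Math's real type).
struct ToyVar {
  double val;
  ToyVar(double v = 0.0) : val(v) {}  // implicit promotion from double
};

inline ToyVar operator+(const ToyVar& a, const ToyVar& b) {
  return ToyVar(a.val + b.val);
}
inline ToyVar operator/(const ToyVar& a, const ToyVar& b) {
  return ToyVar(a.val / b.val);
}

// Same shape as the generated fn: v and v2 are templated, d is always double.
// std::common_type_t is used here as an assumed analogue of promote_args_t.
template <typename T0, typename T2>
std::common_type_t<T0, T2> fn(const T0& v, const double& d, const T2& v2) {
  using local_scalar_t = std::common_type_t<T0, T2>;
  local_scalar_t x = v + d;  // plain double arithmetic if T0 and T2 are double
  return x / v2;
}
```

Calling fn(1.0, 2.0, 3.0) instantiates the all-double version, while passing a ToyVar for v or v2 promotes the local and the return type to ToyVar. In this sketch the scalar types are chosen by what the caller passes, whether or not the middle argument is declared as data.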

Ah I see, so performance will be the same; it just enables stricter type-checking.

There will be a performance difference in the case where you generate local variables from expressions that use only data inputs AND you use the experimental optimization. In this case the AD optimization is triggered (this used to be part of O1 but was later moved to experimental).

real f(data real a, data real b) {
    real c = a + b;
    ...
}

Without any optimization, c is regarded as a “parameter”: the gradient will be computed with respect to the intermediate variable. This case can probably always be avoided by precomputing the data-only expressions, though, which is probably more performant anyway.

If I could mark two posts as the joint solution, I would. Thanks to all of you!
