Declaring an argument data only allows type inference to proceed in the body of the function so that, for example, the variable may be used as a data-only argument to a built-in function.
But, once the Stan model is written and compiles, the type of all parameters is known, isn’t it? Does omitting the data qualifier then (potentially) cost performance?
Yep, the data qualifier determines whether the argument is transpiled to a double or an autodiff var in the generated C++, which then determines whether gradients need to be calculated during the function call. So a data real will be more performant than a real if you don’t need the gradients.
There will be a performance difference in the case where you generate local variables from expressions that use only data inputs AND you use the experimental optimization. In this case the AD optimization is triggered (this used to be part of O1 but was later moved to experimental).
real f(data real a, data real b) {
  real c = a + b;
  ...
}
Without any optimization, c is treated as a “parameter”: the gradient will be computed with respect to the intermediate variable. This case can probably always be avoided by precomputing the data-only expressions, which is probably more performant anyway.
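For what it's worth, here is a minimal sketch of the precomputation idea. The function name shifted_normal_lpdf, the model, and the data names are all made up for illustration. The sum a + b is computed once in transformed data, so it should never become an autodiff variable, whatever the optimization level:

```stan
functions {
  // Hypothetical function: c arrives already computed. The data
  // qualifier requires the caller to pass a data-only expression.
  real shifted_normal_lpdf(real y, data real c, real sigma) {
    return normal_lpdf(y | c, sigma);
  }
}
data {
  int<lower=0> N;
  vector[N] y;
  real a;
  real b;
}
transformed data {
  // Data-only expression, evaluated once when the data are loaded.
  real c = a + b;
}
parameters {
  real<lower=0> sigma;
}
model {
  sigma ~ exponential(1);
  for (n in 1:N) {
    target += shifted_normal_lpdf(y[n] | c, sigma);
  }
}
```

Passing anything that depends on a parameter as c should be rejected by the compiler, which is the check the data qualifier adds.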