Defining functions with inputs and outputs whose sizes vary with data

statsfan · September 1, 2022, 9:26pm

Is there now a way to declare, in the functions block, functions whose input and output dimensions are determined by the data? I saw the 2017 post "Global variable in functions block" but don’t know whether it is still current.

As an example of why I’m trying to do this, here I declare new_array (a size x array of integers) and fill it with zeroes with a loop in transformed data.

data{
  int x; 
}

transformed data{
  array[x] int new_array;

  for (i in 1:x) {
    new_array[i] = 0;
  }
}

I’m trying to turn the loop into a function fill_0s(). The function would take an array of integers in_array of arbitrary size and return an object the same size as in_array. Here’s my attempt. The problem is that I’m using x when declaring fill_0s() but x is not known until after the data are read in.

I’m guessing I could move the function to the transformed data block but am wondering if it’s possible in the functions block, perhaps without declaring the size of the array that will be input/output.

functions{
  array[x] int fill_0s(array[x] int in_array){
    int z = size(in_array);
    for (i in 1:z){
      in_array[i] = 0;
    }
    return(in_array);
  }
}

data{
  int x; 
}

transformed data{
  array[x] int new_array;
  new_array = fill_0s(new_array);
}

Bob_Carpenter · September 20, 2022, 9:57pm

Yes, and sorry we didn’t answer this sooner. The functions block you provided won’t parse because function arguments do not get sizes. And you can’t assign to function arguments in Stan—they’re not passed in by reference, so it doesn’t do what you are expecting here judging by your code.
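For reference, a minimal sketch of how fill_0s() from the question could be written so that it parses. The argument and return types are left unsized, and a new local array is built rather than assigning to the argument. The local name result is just illustrative.

```stan
functions {
  // Argument and return types are unsized: array[] int, not array[x] int.
  array[] int fill_0s(array[] int in_array) {
    // The argument cannot be assigned to, so build a new array of the same size.
    int z = size(in_array);
    array[z] int result = rep_array(0, z);
    return result;
  }
}
```

The size of the returned array is whatever size(in_array) turns out to be when the function is called, so nothing about x needs to be known in the functions block.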

Here’s a function that returns a size N array of zeros, because you really only need the size.

functions {
  array[] int zero_array(int N) {
    array[N] int y = rep_array(0, N);
    return y;
  }
}

But as you can see, you can just use rep_array(0, N) to create a size N array filled with 0. As another example, here’s a function that calculates a linear predictor of variable size,

vector linear_predictor(vector x, real alpha, real beta) {
  return alpha + beta * x;
}

The size of the return value depends on the size of x. You can also do this with a loop in the body, though less efficiently,

vector linear_predictor(vector x, real alpha, real beta) {
  vector[rows(x)] y_hat;
  for (n in 1:rows(x)) {
    y_hat[n] = alpha + beta * x[n];
  }
  return y_hat;
}

Also, you can do compound declare defines, so rather than

int x;
x = ...;

we allow

int x = ...;

I would also suggest declaring with appropriate constraints for error checking,

data {
  int<lower=0> x;
}

and probably not using x for an integer (i through n are much more commonly used for integers).
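Putting these suggestions together, the program from the question might look something like the sketch below. It assumes the array syntax of recent Stan versions, and renames the size from x to N.

```stan
functions {
  // Unsized return type; the length comes from the argument at run time.
  array[] int zero_array(int N) {
    return rep_array(0, N);
  }
}

data {
  int<lower=0> N;  // constraint is checked when the data are read in
}

transformed data {
  // Compound declare-define: the size N is known here because data has been read.
  array[N] int new_array = zero_array(N);
}
```

Since zero_array() only wraps rep_array(), the transformed data line could equally be written as array[N] int new_array = rep_array(0, N); with no functions block at all.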
