Argument problem for map_rect()

I find some difficulty in reshaping a 1-dimensional array (or vector, matrix) into a 1-dimensional array of vectors. Could you give some suggestion to improve below Rstan code?

Here, I’m sampling 4 variables (e.g. S1, S2, S3, S4) for a number of years (total no. of years = nyrs). Given each year, I first sample the mean (and hence the sum) of S1 to S4, which will be randomly split to assign the value of each variable S1 to S4. Please see below code for example.

To speed up the sampling, I used “vectorization” in the sampling statement and then transform via “map_rect”. My problem is that:

The 3rd argument of “mac_rect” requires the input as a 1-dimensional array of vectors. Very much appreciate if you could give some advice for solving this issue.

Thanks very much!

Data input is as follows:
stan_data <- list(
“num_K2” = 3,
“nyrs” = 43,
“maxTAR” = 4,
“LB_bin_2D” = cbind( seq(1, 129, 3), rep(3, 43) )
)

The RStan code is as follows:

functions { 
  vector split_sumTAR (vector raw_vec, vector meanTAR_vec, real[] nSi, int[] LB_bin) {
    int n = LB_bin[2];
    vector[n] seed_vec = segment(raw_vec, LB_bin[1], n);
    return (nSi[1] * rep_vector(meanTAR_vec[1], n+1) .* ( append_row(seed_vec, 1.0) - append_row(0.0, seed_vec)) );
  }  
}

data {
  int <lower=0> num_K2;   
  int <lower=0> nyrs;
  int <lower=0> LB_bin_2D [nyrs, 2];
  real <lower=0.0> maxTAR;
}

transformed data {
  int <lower=0> n_relaTAR_raw = sum( rep_array(nyrs, num_K2) ); 
  real nSi_2D [nyrs, 2] = rep_array(maxTAR, nyrs, 2);

  real logis_mu_meanTAR_1D    [nyrs] = rep_array(0.0, nyrs);  //    mu for meanTAR
  real logis_sigma_meanTAR_1D [nyrs] = rep_array(1.0, nyrs);  // sigma for meanTAR
  
  vector[n_relaTAR_raw] logis_relaTAR_mu_vec    = rep_vector(0.0, n_relaTAR_raw);  // mu for relaTAR
  vector[n_relaTAR_raw] logis_relaTAR_sigma_vec = rep_vector(1.0, n_relaTAR_raw);  // sigma for relaTAR
}

parameters {  
  real meanTAR_1D [nyrs];                 // logistic sampling: mean(annual TAR)
  vector[n_relaTAR_raw] relaTAR_raw_vec;  // logistic sampling: sum(rela_TAR) over 3 heterologous serotypes over each year  
}

transformed parameters {  
  real<lower=0.0, upper=maxTAR> meanTAR_infer_1D [nyrs] = inv_logit( meanTAR_1D );
 
  vector<lower=0.0, upper=maxTAR>[n_relaTAR_raw] TAR_Si_T_1Dvec = map_rect(
    split_sumTAR, 
    relaTAR_raw_vec, 
    meanTAR_infer_1D,  // error here
    nSi_2D, 
    LB_bin_2D
    );  
}

model {
  meanTAR_1D ~ logistic(logis_mu_meanTAR_1D, logis_sigma_meanTAR_1D);  
  relaTAR_raw_vec ~ logistic(logis_relaTAR_mu_vec, logis_relaTAR_sigma_vec);   
}

After running this code, I got the following Error message:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
third argument to map_rect must be of type vector[] (array of vectors)
  error in 'model1a002fb34210_test' at line 57, column 4
  -------------------------------------------------
    55:     nSi_2D, 
    56:     LB_bin_2D
    57:     );
           ^
    58:   
  -------------------------------------------------

PARSER EXPECTED: ")"
Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model 'test' due to the above error.

I think instead of reals you need to use vectors of length 1 here:

vector<lower=0.0, upper=maxTAR>[1] meanTAR_infer_1D [nyrs];
vector<lower=0.0, upper=maxTAR>[n_relaTAR_raw] TAR_Si_T_1Dvec;
real<lower=0.0, upper=maxTAR> meanTAR_infer_1D_real [nyrs] = inv_logit( meanTAR_1D );

for(i in 1:nyrs) {
  meanTAR_infer_1D[i, 1] = meanTAR_infer_1D_real[i];
}

TAR_Si_T_1Dvec = map_rect(
    split_sumTAR, 
    relaTAR_raw_vec, 
    meanTAR_infer_1D,  // error here
    nSi_2D, 
    LB_bin_2D
    );

Can you try that?

Thanks for your suggestion. Yes, for-loop over all “nyrs” can give a 1-dimensional array of vectors.

I’m a litter worried about the speed of using for-loop here, becoz my likelihood function is quite computationally intensive.

I find another way to implement this “mac_rect” (pls see below code). The basic idea is using the 2nd argument of “mac_rect” to pass all vectorized parameters, and the last argument of “mac_rect” to pass indexing parameters. But I’m not sure if it’s faster than for-loop.

Any comment or suggestion?

vector<lower=0.0, upper=1.0>[nyrs] meanTAR_infer_1D = to_vector( inv_logit( meanTAR_1D ) );
vector<lower=0.0, upper=maxTAR>[n_relaTAR_raw] TAR_Si_T_1Dvec;

TAR_Si_T_1Dvec = map_rect(
    split_sumTAR, 
    append_row( relaTAR_raw_vec, meanTAR_infer_1D), 
    phi,  // just a placeholder
    nSi_2D, 
    LB_bin_2D // input data of indexing pars
    );

I’m also wondering if Stan’s array is implemented as a vector of vectors in C++?

Stan for loops translate directly to C++, so the only times you should worry about speed of for loops is if you have shared calculations that could somehow be avoided by using a specialized function.

The most common case of this is specifically in the evaluation of multi_normal_lpdf statements. It probably happens in other lpdf things as well.l

Otherwise vectorized code is mostly for convenience.

Yeah, multidimensional arrays are std::vectors of std::vectors, etc.

Matrices vectors, and row_vectors in Stan are done with Eigen types.