Issues with tuples in cmdstanr

I’ve noticed a few issues with using tuple types in cmdstanr; specifically, there are parts of cmdstanr that don’t seem to understand tuples and can trigger cryptic errors.

Passing tuple data to cmdstan

The cmdstanr:::process_data() function (which takes a named list of data, does some pre-processing and checks, and then runs write_stan_json()) errors when the model contains tuple data. As a result, calling the $sample() method with a list of data fails, though it still works if you give it a properly formatted JSON file. I’ve put together a function that creates JSON data files with tuples, but it’s kind of hacky and I can’t guarantee it will work for everyone. It requires R 4.1 or later for the native pipe and \(x) lambda syntax, though it could be rewritten to remove that limitation.

#' Write out the JSON for the model data, with tuple support
#' cmdstanr doesn't currently support tuples, so this is a hack around it
#' @param data named list of data
#' @param compiled_model a compiled `CmdStanModel` object
#' @param json_file name of output json file to create
#' @returns json_file, invisibly.
process_data_tuple = \(data, compiled_model, json_file ) {
  model_variables = compiled_model$variables()
  data_variables = model_variables$data
  # Identify which variables are tuples
  data_type_length = purrr::map(data_variables, 'type') |> lengths()
  tuple_var_idx = data_type_length > 1
  # base variables (non-tuples)
  model_variables_base = local({
    # browser()
    mv = model_variables
    mv$data = mv$data[!tuple_var_idx]
    mv
  })
  data_base = data[names(model_variables_base$data)]
  tmp_json = cmdstanr:::process_data(data_base, model_variables_base)
  json_txt = local({
    # Read in the json, then remove the last brace and add a comma
    txt = readr::read_lines(tmp_json)
    len = length(txt)
    txt = txt[-len]
    txt[len-1] = paste0(txt[len-1], ',')
    txt 
  })
  
  # Now make a stan json for the tuple variables
  tuple_vars = data_variables[tuple_var_idx]
  # Rename all internal tuple nodes to follow 1:n() spec 
  rename_all = \(x) {
    numbered = purrr::set_names(x, seq_along(x))
    # Recurse across sub-tuples
    lsts = purrr::map_lgl(numbered, is.list)
    numbered[lsts] = purrr::map(numbered[lsts], rename_all)
    numbered
  }
  # browser()
  tuple_data = data[names(tuple_vars)] |> purrr::map(rename_all)
  
  # tuple_json = tempfile('tmpjson', fileext = '.json')
  tuple_txt = tuple_data |> 
    purrr::lmap(\(x) jsonlite::toJSON(x, auto_unbox = TRUE, 
                               factor = "integer", always_decimal = FALSE, 
                               digits = NA, pretty = TRUE) |> list()) |> unlist() |> 
    stringr::str_sub(3L, -3L) |> stringr::str_replace_all(stringr::fixed('\n'), ' ') |>  # Trim off opening/closing brackets
    paste0(c(rep(',', length(tuple_data) - 1), '')) # add comma to the end, except for last one
  # 
  # jsonlite::write_json(tuple_data, auto_unbox = TRUE, 
  #                      factor = "integer", always_decimal = FALSE, 
  #                      digits = NA, pretty = TRUE) 
  # tuple_txt = local({
  #   txt = read_lines(tuple_json)[-1]
  # })
  # Create the output directory (including any missing parents) if needed
  if(!dir.exists(dirname(json_file))) dir.create(dirname(json_file), recursive = TRUE)
  readr::write_lines(c(json_txt, tuple_txt, "}"),
              json_file)
  invisible(json_file)
}
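
For anyone adapting this: as far as I understand the CmdStan JSON format, a tuple is written as a JSON object whose keys are the 1-based positions "1", "2", and so on, which is what rename_all() above sets up. Here is a minimal sketch with jsonlite, assuming a hypothetical data block that declares int N; and tuple(real, array[2] int) td; (the names N and td are made up for illustration and do not come from my model):

```r
library(jsonlite)

# Hypothetical data for a model declaring:
#   int N;
#   tuple(real, array[2] int) td;
# Tuple elements are keyed by position ("1", "2", ...), not by name.
data_list <- list(
  N  = 3L,
  td = list(`1` = 1.5, `2` = c(4L, 5L))
)

# Same toJSON() options as in the function above. Note that auto_unbox = TRUE
# also turns length-1 vectors into scalars, which is wrong for size-1 arrays.
json_txt <- toJSON(data_list, auto_unbox = TRUE, factor = "integer",
                   always_decimal = FALSE, digits = NA, pretty = TRUE)

# Write the JSON to a temporary file and print it for inspection
json_file <- tempfile(fileext = ".json")
writeLines(as.character(json_txt), json_file)
cat(json_txt)
```

The resulting path can then be passed as the data argument, e.g. mod$sample(data = json_file) for a compiled model mod (not run here).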

variable_skeleton() returns NA for tuple parameters

A model that contains the following in the parameters block:

  tuple(real<multiplier=hp_zi_sd>, real<multiplier=hp_zi_sd>) intercept_zi_p;
  real<multiplier=hp_lm_sd> intercept_lm;
  tuple(real<offset=hp_init_mu[1], multiplier=hp_init_sd[1]>,
        real<offset=hp_init_mu[2], multiplier=hp_init_sd[2]>) intercept_init_p;

Produces the following variable skeleton:

pthf$variable_skeleton() |> str()
# List of 3
# $ NA          : num [1(1d)] 0   #should be intercept_zi_p
# $ intercept_lm: num [1(1d)] 0  
# $ NA          : num [1(1d)] 0   #should be intercept_init_p

Unfortunately, I don’t have a workaround for this yet.

This is with cmdstanr 0.7.1 using cmdstan 2.34.1 and R 4.3.1.

Pinging @stevebronder who has also been looking at this recently, as well as the usual suspects @jonah @rok_cesnovar @andrjohns

This may be a good thread to bring up that I’m also having trouble using tuples in partial sum functions. Specifically, if I declare an argument of the partial sum function like tuple(matrix, matrix) var, I get some errors when calling the function with reduce_sum().

Could you share some more code or the errors you’re receiving?

Sure.

This one works, where lambda_mu and lambda_hp are two vectors used in the function:

real partial_sum_lpmf(
    data array[] int seq_region, data int start, data int end, data int n_date_max, data int n_call_max,
    data array[] int n_site, data array[,] int n_date, data array[,,] int n_call, data array[,,] int dates, data array[,,] int tau, data array[,,] vector y, 
    vector eta, vector gamma, vector phi, vector lambda_mu, vector lambda_hp) {

This one doesn’t work. Here lambda_mu is a tuple, created in the transformed parameters block, that holds the two vectors previously passed separately as lambda_mu and lambda_hp:

real partial_sum_lpmf(
    data array[] int seq_region, data int start, data int end, data int n_date_max, data int n_call_max,
    data array[] int n_site, data array[,] int n_date, data array[,,] int n_call, data array[,,] int dates, data array[,,] int tau, data array[,,] vector y, 
    vector eta, vector gamma, vector phi, tuple(vector, vector) lambda_mu) {

The errors I get in the latter formulation during compilation are (note I had to omit the first part of the messages because of character limits):

                      ^
/Users/s447341/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/prim/functor/reduce_sum.hpp:207:10: note: in instantiation of member function 'stan::math::internal::reduce_sum_impl<occu_ps_ms_trm_tuple_model_namespace::partial_sum_lpmf_rsfunctor__<true>, void, stan::math::var_value<double>, const std::vector<int> &, const int &, const int &, const std::vector<int> &, const std::vector<std::vector<int>> &, const std::vector<std::vector<std::vector<int>>> &, const std::vector<std::vector<std::vector<int>>> &, const std::vector<std::vector<std::vector<int>>> &, const std::vector<std::vector<std::vector<Eigen::Matrix<double, -1, 1>>>> &, Eigen::Matrix<stan::math::var_value<double>, -1, 1> &, Eigen::Matrix<stan::math::var_value<double>, -1, 1> &, Eigen::Matrix<stan::math::var_value<double>, -1, 1> &, std::tuple<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, Eigen::Matrix<stan::math::var_value<double>, -1, 1>> &>::operator()' requested here
  return internal::reduce_sum_impl<ReduceFunction, void, return_type, Vec,
         ^
/var/folders/gr/hv6k6041275bm9w8vvn4ms3sz23ct8/T/Rtmp0Nghm8/model-12c15458f3352.hpp:2235:12: note: in instantiation of function template specialization 'occu_ps_ms_trm_tuple_model_namespace::occu_ps_ms_trm_tuple_model::log_prob_impl<true, false, Eigen::Matrix<stan::math::var_value<double>, -1, 1>, Eigen::Matrix<int, -1, 1>, nullptr, nullptr, nullptr>' requested here
    return log_prob_impl<propto__, jacobian__>(params_r, params_i, pstream);
           ^
/Users/s447341/.cmdstan/cmdstan-2.33.1/stan/src/stan/model/model_base_crtp.hpp:120:50: note: in instantiation of function template specialization 'occu_ps_ms_trm_tuple_model_namespace::occu_ps_ms_trm_tuple_model::log_prob<true, false, stan::math::var_value<double>>' requested here
    return static_cast<const M*>(this)->template log_prob<true, false>(theta,
                                                 ^
/var/folders/gr/hv6k6041275bm9w8vvn4ms3sz23ct8/T/Rtmp0Nghm8/model-12c15458f3352.hpp:983:3: note: in instantiation of member function 'stan::model::model_base_crtp<occu_ps_ms_trm_tuple_model_namespace::occu_ps_ms_trm_tuple_model>::log_prob_propto' requested here
  ~occu_ps_ms_trm_tuple_model() {}
  ^
/Users/s447341/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/rev/core/deep_copy_vars.hpp:33:13: note: candidate function not viable: no known conversion from 'std::tuple<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, Eigen::Matrix<stan::math::var_value<double>, -1, 1>>' to 'const var' (aka 'const var_value<double>') for 1st argument
inline auto deep_copy_vars(const var& arg) {
            ^
/Users/s447341/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/rev/core/deep_copy_vars.hpp:23:14: note: candidate template ignored: requirement 'is_arithmetic<std::tuple<Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1>, Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1>>>::value' was not satisfied [with Arith = std::tuple<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, Eigen::Matrix<stan::math::var_value<double>, -1, 1>> &]
inline Arith deep_copy_vars(Arith&& arg) {
             ^
/Users/s447341/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/rev/core/deep_copy_vars.hpp:45:13: note: candidate template ignored: requirement 'integral_constant<bool, false>::value' was not satisfied [with VarVec = std::tuple<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, Eigen::Matrix<stan::math::var_value<double>, -1, 1>> &]
inline auto deep_copy_vars(VarVec&& arg) {
            ^
/Users/s447341/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/rev/core/deep_copy_vars.hpp:63:13: note: candidate template ignored: requirement 'integral_constant<bool, false>::value' was not satisfied [with VecContainer = std::tuple<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, Eigen::Matrix<stan::math::var_value<double>, -1, 1>> &]
inline auto deep_copy_vars(VecContainer&& arg) {
            ^
/Users/s447341/.cmdstan/cmdstan-2.33.1/stan/lib/stan_math/stan/math/rev/core/deep_copy_vars.hpp:79:13: note: candidate template ignored: requirement 'integral_constant<bool, false>::value' was not satisfied [with EigT = std::tuple<Eigen::Matrix<stan::math::var_value<double>, -1, 1>, Eigen::Matrix<stan::math::var_value<double>, -1, 1>> &]
inline auto deep_copy_vars(EigT&& arg) {
            ^

4 errors generated.

make: *** [/var/folders/gr/hv6k6041275bm9w8vvn4ms3sz23ct8/T/Rtmp0Nghm8/model-12c15458f3352] Error 1

Error: An error occured during compilation! See the message above for more information.

Thanks, I’ve opened an issue for this: “reduce_sum fails if tuples are passed” (Issue #3041 in stan-dev/math on GitHub). If you have more examples or anything else to add, you can post them there. It seems like we’re just missing a specific overload of one of the functions that reduce_sum uses internally.

There was also this issue, but I wasn’t able to generate a reproducible example.