Var_context, what do we want from it?

sakrejda · March 25, 2018, 2:44am

I was going to write some var_contexts for binary data, I don’t mind working in some other re-writing of var_context stuff with it. So far:

I’d like easy construction from c++
I’d like easy construction from binary files
I’m assuming the var_context API isn’t changing (?)

If I missed some discussion about this I’d appreciate any pointers.

betanalpha · March 25, 2018, 4:37am

The main problem right now is that var_context concept is extraordinarily monolithic and wraps all of the variable transformations. To clean that up we’d need to pull all of the transforms exposed in https://github.com/stan-dev/stan/blob/develop/src/stan/io/reader.hpp and https://github.com/stan-dev/stan/blob/develop/src/stan/io/writer.hpp into their own separate classes that handle the constraining/unconstraining for each transform. For example,

template <typename T, typename T_LB>
class vector_lower_bound: public vector_transform<T> {
private:
  T_LB lower_bound_;

public:
  vector_lower_bound(T_LB lower_bound, size_t N):
    lower_bound_(lower_bound), vector_transform<T>(N) {}

  size_t unconstrained_dim() { return N_; }

  void validate(vector_t constr_val) {
    stan::math::check_greater_or_equal("stan::io::vector_lower_bound",
                                       "Lower bound constraint", constr_val,
                                       lower_bound_);
  }

  template <typename Jacobian>
  vector_t constrain(const vector_t& unconstr_val, lp_accumulator<T_LB>& acc) {
    vector_t constr_val;
    for (idx_t n = 0; n < N_; ++n) {
      if (Jacobian) {
        T_LB lp;
        output(n) = stan::prob::lb_constrain(unconstr_val[n], lower_bound_, lp);
        acc.push_back(lp);
      } else {
        output(n) = stan::prob::lb_constrain(unconstr_val[n], lower_bound);
      }
    }
  }

  vector_t constrain(const vector_t& unconstr_val,
                     std::vector<double> constr_val) {
    for (idx_t n = 0; n < N_; ++n)
      constr_val.push_back(stan::prob::lb_constrain(unconstr_val[n],
                                                    lower_bound_));
  }

  void unconstrain(const vector_t& constr_val, vector_t::InnerIterator& it) {
    for (idx_t n = 0; n < N_; ++n)
      *(it++) = stan::prob::lb_free(constr_val[n], lower_bound_);
  }
};

I already did a bunch of these in the old branch, https://github.com/stan-dev/stan/tree/feature/refactor_io_reader_writer/src/stan/io/transforms.

Once we have all the transformed defined in their own classes we can then clean up the var_context itself. First, a rename – what has been called a “variable context” is more property a “data access layer”. The new DAL would then have a single constrain/unconstrain method that takes in the transform classes, either as a base function or a template. For example,

// base_dal
vector_t get_vector(const std::string& name, size_t N) {
  if (!contains_r(name)) {
    std::stringstream msg;
    msg << "Variable " << name
        << " not found in variable context" << std::endl;
    throw std::runtime_error(msg.str());
  }

  std::vector<size_t> dims = dims_r(name);
  if (dims.size() != 1) {
    std::stringstream msg;
    msg << "Requested the vector " << name
        << ", but " << name
        << " is not defined as a vector in variable context"
        << std::endl;
    throw std::runtime_error(msg.str());
  }

  if (dims[0] != N) {
    std::stringstream msg;
    msg << "Requested the vector " << name
        << " with size " << N << ", but " << name
        << " is defined with size " << dims[0] << " in variable context"
        << std::endl;
    throw std::runtime_error(msg.str());
  }

  std::vector<double> vals = vals_r(name);
  vector_t output(N);
  for (vector_t_idx n = 0; n < N; ++n)
    output(n) = vals[n];
  return output;
}

// constr_dal
template <typename T>
void get_constrained_vector(const string& name,
                            const stan::io::transform& transform,
                            vector_t param) {
  param.resize(transform.size());
  param = get_vector(name, transform.size());
  transform.validate(param);
}

// unconstr_dal
template <typename T>
void get_unconstrained_vector(const string& name,
                              const stan::io::transform& transform,
                              vector_t::InnerIterator& it) {
  vector_t constr_val = get_vector(name, transform.size());
  transform.unconstrain(constr_val, it);
}

Then we can drastically clean up (and simplify) the generated model code to look something like

// constructor
model_name {
  ...
  vector_lb_transform<input_type, lb_type>
    param_name1_transform__(lb_value);
  data_var_context.get_constrained_vector("param_name1",
                                          param_name1_transform,
                                          param_name1__);
  ...
}

// init
void init(vector_t& unconstr_state) {
  vector_t::InnerIterator it(unconstr_state);

  ...
  vector_lb_transform<input_type, lb_type>
    param_name1_transform__(lb_value);
  get_unconstrained_vector("name1", param_name1_transform, it);
  ...
}

// what write_array was
void constr_state(vector_t unconstr_state,
                  std::vector<double>& constr_state) {
  ...
  vector_lb_transform<input_type, lb_type>
    param_name1_transform__(lb_value);
  param_name1_transform__.constrain(unconstr_state, constr_state);
  ...
}

// log_prob
template <typename T>
T log_prob(vector_t& unconstr_state) {
  lp_accumulator<T> lp_acc__;

  ...
  vector_lb_transform<input_type, lb_type>
    param_name1_transform__(lb_value);
  vector_t param_name1__ =
    param_name1_transform__.constrain<Jacobian>(unconstr_state, lp_acc__);
  ...
}

At this point DAL implementations are solely responsible for providing the actual data access. For example, a C++ DAL would be extremely easy to write at this point.

betanalpha · March 25, 2018, 4:38am

Let me also note that none of this redesign is controversial – everyone agreed upon it a few years ago and the spec has been sitting my desktop buried under other issues.

sakrejda · March 25, 2018, 11:40am

Great, I never understood why that all was tied together. Question about the current transforms: there are all sorts of no-op’s sprinkled around, why do we need them?

For example in the reader there’s this one :

      /**
       * Return the next scalar in the sequence, incrementing
       * the specified reference with the log absolute Jacobian determinant.
       *
       * <p>With no transformation, the Jacobian increment is a no-op.
       *
       * <p>See <code>scalar_constrain()</code>.
       *
       * log_prob Reference to log probability variable to increment.
       * @return Next scalar.
       */
      T scalar_constrain(T& /*log_prob*/) {
        return scalar();
      }

sakrejda · March 25, 2018, 11:48am

Nope that’s not a no-op methods, I get it now.

mitzimorris · March 25, 2018, 5:24pm

could you share that spec here? seems like it’s time to re-review this.

Bob_Carpenter · March 25, 2018, 5:55pm

Agreed.

Mine, too.

I put what @betanalpha gave me up on a stan-dev/stan Wiki for the var_context API spec.

syclik · March 27, 2018, 7:24pm

There are two things: design and implementation. I’ll stay out of the current implementation since I don’t really want to dig into the code right now.

I think the redesign isn’t controversial – it just takes work and some coordination.

There’s a use case that pops up now and again that makes the design a little cumbersome. That said, it ties directly into Stan’s type system, so there may not be a better design out there. That use case is streaming data.

The reason this is hard, in my mind, is because the language is really flexible. It’s not like N always means number of observations and we can subsample anything indexed with N. I can’t really imagine a design that would be amenable to Stan’s full expressibility. That said, I bet there’s a way to design something that rstanarm could use to subsample that data in a sensible manner.

Anyway, if I wanted to build a streaming algorithm, I wouldn’t have any problem using the current interface and providing new var_context instances that have chunks of data.

sakrejda · March 27, 2018, 7:46pm

Can you be clearer about this?

syclik · March 27, 2018, 7:55pm

Sure. I’m thinking about Matt’s online LDA paper.

I can imagine there being a really large data set that possibly can’t be stored in memory. I can imagine we can ask for a subset of the data essentially infinitely many times and it’d return a subset of the data with the right sizes. Maybe something more concrete, think about a linear model.

data {
  int N;
  int M;
  real y[N];
  matrix X[N, M];
}
parameters {
  vector[M] beta;
  real<lower = 0> sigma;
}
model {
  y ~ normal(X * beta, sigma);
}

This is way overkill, but suppose M was something like 10 and the real data had N at 1 billion. I could imagine M always being 10, but every time you asked for new data, it’d pull something like N = 1000. If this was some streaming variational approximation, then you use the first data set for a certain number of iterations, draw a new data set and then go on for another few iterations, rinse and repeat until the solution converges (converge meaning stays stable, not meaning gets to the right solution).

That make it any clearer? I could write up something more substantial later if you want.

sakrejda · March 27, 2018, 8:07pm

That’s plenty, thanks. What I don’t see is how the new design interferes with that.

syclik · March 27, 2018, 8:13pm

It doesn’t interfere; it doesn’t make it better or worse.

sakrejda · March 27, 2018, 9:49pm

Just re-read your original post and saw this bit:

I imagine most users who want to use mini-batches would be willing to supply mini-batches but I see where you were coming from now.

syclik · March 27, 2018, 10:46pm

Yup. I had hours of discussions with Alp about this when he was implementing advi.

We can always write a subsampling var context that’s specific to the form of the data and what the unit of observation is. But can’t assume that structure in general. Using it properly within Stan would be a matter of convention. Implementing it wouldn’t be too difficult, but requires config from the interfaces (and a lot of coordination).

sakrejda · March 27, 2018, 11:14pm

It’s total abuse of the context concept to write that, to me it makes more sense for it to take a vector of files in its constructor and move to the next file on request…

Bob_Carpenter · March 28, 2018, 4:19am

Even that seems like something that should be wrapping the var contexts, not be built into them.

sakrejda · March 28, 2018, 4:25am

re: something non-vegan about cats:

github.com

stan-dev/stan/blob/1c2075bf8d0fe563b6907e40220aca016864e816/src/stan/callbacks/tee_writer.hpp

#ifndef STAN_CALLBACKS_TEE_WRITER_HPP
#define STAN_CALLBACKS_TEE_WRITER_HPP

#include <stan/callbacks/writer.hpp>
#include <ostream>
#include <vector>
#include <string>

namespace stan {
  namespace callbacks {

    /**
     * <code>tee_writer</code> is an implementation that writes to
     * two writers.
     *
     * For any call to this writer, it will tee the call to both writers
     * provided in the constructor.
     */
    class tee_writer : public writer {
    public:

This file has been truncated. show original

Bob_Carpenter · March 28, 2018, 4:36am

The tee implements the basic interface given two other implementors. It doesn’t add an option to the basic interface.

sakrejda · March 28, 2018, 4:44am

Yeah, I can see how wrapping would be cleaner. No matter what under the hood you’d be reading from disk/socket to pre-cache some batches and providing them in RAM as needed.

Topic		Replies	Views
Sampling within another C++ program, how to construct `var_context`? General c++api	4	435	November 6, 2022
Metadata methods on the model class for output var type and size information Developers	5	405	July 17, 2019
The var keyword: should Stan support local variable type inference Developers stanc , language	1	526	August 12, 2021
Correct way to add matrix to array_var_context Developers	4	435	September 20, 2018
Composable transforms in Stan Developers	12	818	July 17, 2021

Var_context, what do we want from it?

Related topics