Prelim proposal for model base class


#1

Here’s a preliminary spec that is complete up to, but not including getting the constrained type and bounds from a parameter. I’ll work on adding that next.

#include <stan/model/header.hpp>

namespace stan {
namespace model_base {

/**
 * Construct a base model.  This is a no-op.
 */
model_base();

/**
 * Destroy the base model.  This is virtual to allow
 */
virtual ~model_base();

/**
 * Set the data to that described by the var_context.
 * @param[in] data the data reader
 * @throws <exception type> if the data is not sufficient to initialize the model.
 */
virtual void set_data(const var_context& data);

/**
 * Return the log density for the specified transformed parameters.  The
 * transformed parameters are unconstrained by convention in that they have
 * support on all of R^N.
 *
 * For the versions with suffixes _normalize, the normalizing 
 * constants are included.  
 * 
 * For the versions with suffixes _nojacobian, the log Jacobians
 * of the trnasforms are not included in the result.
 * 
 * The parameter T is meant to be a placeholder instantiated by all of
 * double, var, fvar<var>, fvar<fvar<var>>.  This produces a total of 16 functions.
 *
 * @param[in] theta_tr vector of transformed parameters
 * @return log density of the parameters including Jacobian term unless _nojacobian
 * and including normalizing constants in sampling statements if _normalize is 
 * specified.
 * @throw <errortype> if there is an exception thrown by an expression in the log 
 * density
 */
// T : {  };
// P : "_normalize", "";     J: "", "_nojacobian"
virtual T log_density(const Matrix<T, -1, 1>& theta_tr) const;
virtual T log_density_normalize(const Matrix<T, -1, 1>& theta_tr) const;
virtual T log_density_nojacobian(const Matrix<T, -1, 1>& theta_tr) const;
virtual T log_density_normalize_nojacobian(const Matrix<T, -1, 1>& theta_tr) const;

/**
 * Transform the variables defined in the specified var_context by writing them
 * into the transformed parameter vector.  The transofrmed parameter vector will
 * be resized if necessary.
 * 
 * @param[in] theta variable context defining the paramters
 * @param[out] theta_tr vector into which transformed parameters are written
 */
virtual void transform(const var_context& theta, VectorXd& theta_tr) const;

/**
 * Inverse transfrom the specified transformed parameters by writing into the
 * parameter vector, including transformed parameters and generated quantities
 * if specified, and using the specified RNG for generated quantities.
 * 
 * @param[in] rng RNG to use for generated quantities
 * @param[in] theta_tr transformed parameters to inverse transform
 * @param[in] tps true if variables defined in the transformed parameters block 
 * should be included
 * @param[in] gqs true if the generated quantities should be included
 * @param[out] theta parameter vector into which the inverse transformed parameters 
 * are written and the variables in the transformed parameter and generated quantities
 * blocks are written if so specified.
 * @throws <exception type> if there is an error in a transform or in generating variables
 * in the transformed parameters block or generated quantities block.
 */
virtual void inv_transform_generate(stan::model::rng& rng,
                                    const VectorXd& theta_tr,
                                    bool tps, bool gqs, VectorXd& theta) const;

/**
 * Return the name of the model.
 *
 * @return name of the model.
 */
virtual string name() const;

/**
 * Return the number of parameters in the model.
 *
 * @return number of parameters
 */
virtual size_t num_params() const;

/**
 * Return the number of transformed parameters in the model.
 * 
 * @return number of transformed parameters.
 */
virtual size_t num_transformed_params() const;

/**
 * Return a vector of the base sizes of each parameter (not the dimensions).
 * 
 * @return sizes of the parameters
 */
virtual vector<size_t> param_sizes() const;

/**
 * Return a vector of the base sizes of each transformed parameter (not the 
 * dimensions).
 * 
 * @return sizes of the transformed parameters
 */
virtual vector<size_t> transformed_param_sizes() const;

/**
 * Return the names of the parameters in an order matching the sizes.  There is one
 * entry per named parameter.
 *
 * @return names of the parameters
 */
virtual vector<string> param_names() const;

/**
 * Return the names of the transformed parameters in an order matching the sizes.
 *  There is one entry per named transformed parameter.
 *
 * @return names of the transformed parameters
 */
virtual vector<string> transformed_param_names() const;

/**
 * Return a collection of vectors of parameter indexes.  The indexes are first-index
 * major.  For example, a 2 x 3 matrix would return the sequence of indexes
 *  { {1,1}, {1,2}, {1,3}, {2,1}, {2,2}, {2,3} }.
 *
 * @return collection of parameter indexes
 */
virtual vector<vector<size_t>> param_indexes() const;


/**
 * Return a collection of vectors of transformed parameter indexes.  The indexes are
 * first-index major.  For example, a 2 x 3 matrix would return
 *  { {1,1}, {1,2}, {1,3}, {2,1}, {2,2}, {2,3} }.
 *
 * @return collection of transformed parameter indexes
 */
virtual vector<size_t> transformed_param_indexes() const;

}  // namespace stan
}  // namespace model

Notes and Questions

  • This assumes that evaluating the log density is a const function—this will have to change if we need to store state for algebraic solvers and Laplace approximations.

  • Returning indexes will allow us to deal with sparse structures and tuples.

  • To get current output, e.g., a.1.1, a.1.2, a.2.1, a.2.2 for a 2 x 2 matrix, the parameter names and sizes methods must be used.

  • The parameter names and indexes could be more efficient if they use iterators with move semantics.

  • A single templated definition can be generated to which all 16 of the log_density functions delegate. It just won’t be exposed as part of the base model class.

  • Data can be reset—models are no longer immutable.

  • The model header is the only include; we want to extend that so there’s a way for users to get their material in the include. Maybe this would work like in Eigen.

  • The RNG type is fixed, and this assumes a new class, stan::model::rng, which can just be a typedef for our current RNG. This will allow us to remove RNG templating everywhere, not just in model class. This will help with compilation speed.

  • This design assumes we have a static logger; otherwise, we need to pass a logger as an argument to the log_density functions.

  • This design does not include a way to get the types of the variables as I’m not sure how to do that. We could return strings like “cov_matrix”, etc. Those types are all known statically. Sizes are only known after data is provided.

  • This design does not provide a way to get the upper and lower bound constraints on constrained parameters. If this is necessary, the method to produce them will have to take the parameters in, because bounds may depend on parameters.


Is there a proper base class for stan_model?
#2

This is related to two previous topics:


#3

I think we also need

virtual vector<string> data_names() const;
virtual vector<string> data_types() const;
virtual vector<string> data_sizes() const;

and also for transformed data unless we just want the data ones to cover both blocks.


#4

What does the implementation of

virtual vector<vector<size_t>> param_indexes() const;

do?


#5

We should probably do all of:

  • data
  • transformed data
  • parameters
  • transformed parameters
  • generated quantities

My original plan was to have five flags in the calls, like we do now for the last three in write_array(), but I forgot about everything other than parameters writing it out.

I should’ve included an example! It’s intended to represent the indexes of each element. So if you had

matrix[3, 2] a;
vector[2] b;

then it would return

{1,1}, {1,2}, {1,3}, {2,1}, {2,2}, {2,3}, {1}, {2}

corresponding to

a[1,1], a[1,2], a[1,3], a[2,1], a[2,2], a[2,3], b[1], b[2]

This scheme will work for tuples and ragged arrays. It corresponds to what gets printed out by CmdStan as headers,

a.1.1, a.1.2, a.1.3, a.2.1, a.2.2, a.2.3, b.1, b.2