Prelim proposal for model base class

Bob_Carpenter · October 29, 2018, 10:02pm

Here’s a preliminary spec that is complete up to, but not including getting the constrained type and bounds from a parameter. I’ll work on adding that next.

#include <stan/model/header.hpp>

namespace stan {
namespace model_base {

/**
 * Construct a base model.  This is a no-op.
 */
model_base();

/**
 * Destroy the base model.  This is virtual to allow
 */
virtual ~model_base();

/**
 * Set the data to that described by the var_context.
 * @param[in] data the data reader
 * @throws <exception type> if the data is not sufficient to initialize the model.
 */
virtual void set_data(const var_context& data);

/**
 * Return the log density for the specified transformed parameters.  The
 * transformed parameters are unconstrained by convention in that they have
 * support on all of R^N.
 *
 * For the versions with suffixes _normalize, the normalizing 
 * constants are included.  
 * 
 * For the versions with suffixes _nojacobian, the log Jacobians
 * of the trnasforms are not included in the result.
 * 
 * The parameter T is meant to be a placeholder instantiated by all of
 * double, var, fvar<var>, fvar<fvar<var>>.  This produces a total of 16 functions.
 *
 * @param[in] theta_tr vector of transformed parameters
 * @return log density of the parameters including Jacobian term unless _nojacobian
 * and including normalizing constants in sampling statements if _normalize is 
 * specified.
 * @throw <errortype> if there is an exception thrown by an expression in the log 
 * density
 */
// T : {  };
// P : "_normalize", "";     J: "", "_nojacobian"
virtual T log_density(const Matrix<T, -1, 1>& theta_tr) const;
virtual T log_density_normalize(const Matrix<T, -1, 1>& theta_tr) const;
virtual T log_density_nojacobian(const Matrix<T, -1, 1>& theta_tr) const;
virtual T log_density_normalize_nojacobian(const Matrix<T, -1, 1>& theta_tr) const;

/**
 * Transform the variables defined in the specified var_context by writing them
 * into the transformed parameter vector.  The transofrmed parameter vector will
 * be resized if necessary.
 * 
 * @param[in] theta variable context defining the paramters
 * @param[out] theta_tr vector into which transformed parameters are written
 */
virtual void transform(const var_context& theta, VectorXd& theta_tr) const;

/**
 * Inverse transfrom the specified transformed parameters by writing into the
 * parameter vector, including transformed parameters and generated quantities
 * if specified, and using the specified RNG for generated quantities.
 * 
 * @param[in] rng RNG to use for generated quantities
 * @param[in] theta_tr transformed parameters to inverse transform
 * @param[in] tps true if variables defined in the transformed parameters block 
 * should be included
 * @param[in] gqs true if the generated quantities should be included
 * @param[out] theta parameter vector into which the inverse transformed parameters 
 * are written and the variables in the transformed parameter and generated quantities
 * blocks are written if so specified.
 * @throws <exception type> if there is an error in a transform or in generating variables
 * in the transformed parameters block or generated quantities block.
 */
virtual void inv_transform_generate(stan::model::rng& rng,
                                    const VectorXd& theta_tr,
                                    bool tps, bool gqs, VectorXd& theta) const;

/**
 * Return the name of the model.
 *
 * @return name of the model.
 */
virtual string name() const;

/**
 * Return the number of parameters in the model.
 *
 * @return number of parameters
 */
virtual size_t num_params() const;

/**
 * Return the number of transformed parameters in the model.
 * 
 * @return number of transformed parameters.
 */
virtual size_t num_transformed_params() const;

/**
 * Return a vector of the base sizes of each parameter (not the dimensions).
 * 
 * @return sizes of the parameters
 */
virtual vector<size_t> param_sizes() const;

/**
 * Return a vector of the base sizes of each transformed parameter (not the 
 * dimensions).
 * 
 * @return sizes of the transformed parameters
 */
virtual vector<size_t> transformed_param_sizes() const;

/**
 * Return the names of the parameters in an order matching the sizes.  There is one
 * entry per named parameter.
 *
 * @return names of the parameters
 */
virtual vector<string> param_names() const;

/**
 * Return the names of the transformed parameters in an order matching the sizes.
 *  There is one entry per named transformed parameter.
 *
 * @return names of the transformed parameters
 */
virtual vector<string> transformed_param_names() const;

/**
 * Return a collection of vectors of parameter indexes.  The indexes are first-index
 * major.  For example, a 2 x 3 matrix would return the sequence of indexes
 *  { {1,1}, {1,2}, {1,3}, {2,1}, {2,2}, {2,3} }.
 *
 * @return collection of parameter indexes
 */
virtual vector<vector<size_t>> param_indexes() const;


/**
 * Return a collection of vectors of transformed parameter indexes.  The indexes are
 * first-index major.  For example, a 2 x 3 matrix would return
 *  { {1,1}, {1,2}, {1,3}, {2,1}, {2,2}, {2,3} }.
 *
 * @return collection of transformed parameter indexes
 */
virtual vector<size_t> transformed_param_indexes() const;

}  // namespace stan
}  // namespace model

Notes and Questions

This assumes that evaluating the log density is a const function—this will have to change if we need to store state for algebraic solvers and Laplace approximations.
Returning indexes will allow us to deal with sparse structures and tuples.
To get current output, e.g., a.1.1, a.1.2, a.2.1, a.2.2 for a 2 x 2 matrix, the parameter names and sizes methods must be used.
The parameter names and indexes could be more efficient if they use iterators with move semantics.
A single templated definition can be generated to which all 16 of the log_density functions delegate. It just won’t be exposed as part of the base model class.
Data can be reset—models are no longer immutable.
The model header is the only include; we want to extend that so there’s a way for users to get their material in the include. Maybe this would work like in Eigen.
The RNG type is fixed, and this assumes a new class, stan::model::rng, which can just be a typedef for our current RNG. This will allow us to remove RNG templating everywhere, not just in model class. This will help with compilation speed.
This design assumes we have a static logger; otherwise, we need to pass a logger as an argument to the log_density functions.
This design does not include a way to get the types of the variables as I’m not sure how to do that. We could return strings like “cov_matrix”, etc. Those types are all known statically. Sizes are only known after data is provided.
This design does not provide a way to get the upper and lower bound constraints on constrained parameters. If this is necessary, the method to produce them will have to take the parameters in, because bounds may depend on parameters.

Bob_Carpenter · November 7, 2018, 5:35pm

This is related to two previous topics:

Is there a proper base class for Stan models?
Stan 3 model concept, pointing to an outdated Wiki page

bgoodri · November 7, 2018, 5:47pm

I think we also need

virtual vector<string> data_names() const;
virtual vector<string> data_types() const;
virtual vector<string> data_sizes() const;

and also for transformed data unless we just want the data ones to cover both blocks.

bgoodri · November 7, 2018, 5:49pm

What does the implementation of

virtual vector<vector<size_t>> param_indexes() const;

do?

Bob_Carpenter · November 8, 2018, 4:21am

bgoodri:

I think we also need
virtual vector<string> data_names() const;
virtual vector<string> data_types() const;
virtual vector<string> data_sizes() const;
and also for transformed data unless we just want the data ones to cover both blocks.

We should probably do all of:

data
transformed data
parameters
transformed parameters
generated quantities

My original plan was to have five flags in the calls, like we do now for the last three in write_array(), but I forgot about everything other than parameters writing it out.

I should’ve included an example! It’s intended to represent the indexes of each element. So if you had

matrix[3, 2] a;
vector[2] b;

then it would return

{1,1}, {1,2}, {1,3}, {2,1}, {2,2}, {2,3}, {1}, {2}

corresponding to

a[1,1], a[1,2], a[1,3], a[2,1], a[2,2], a[2,3], b[1], b[2]

This scheme will work for tuples and ragged arrays. It corresponds to what gets printed out by CmdStan as headers,

a.1.1, a.1.2, a.1.3, a.2.1, a.2.2, a.2.3, b.1, b.2

Topic		Replies	Views
How to apply arbitrary constraints on functions of parameters as bounds of uniform priors in a stan model Modeling techniques , fitting-issues	2	561	January 6, 2019
Feature request: Jacobians of unconstraining transform Modeling	41	3493	December 25, 2016
Constraints on parameters/data General	3	739	December 1, 2020
Vector bounds and sampling error General rstan	5	650	May 6, 2020
Linearly scaled, unconstrained parameters Developers	18	1412	October 22, 2018

Prelim proposal for model base class

Notes and Questions

Related topics