Stanc3 type system overhaul

Okay, we ended up having that meeting. To attempt to summarize some of our discussion:

  1. I personally want to see stanc3’s internal representation of distributions and bijectors become a little more formalized and structured; that has its own benefits and might shed light on what working with it would be like in Stan.
  2. People had no immediate objections to thinking about a distribution as a function that returns a record containing the relevant functions (rng, lpdf, cdf, ccdf). Example: normal(mu, sigma).lpdf or lpdf(normal(mu, sigma)). The first version assumes we end up with some record access syntax like that in Stan; the second could be done with tuples.
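
To make point 2 a bit more concrete, here is a rough Python sketch of "a distribution is a function that returns a record". Python is only standing in for whatever Stan syntax we end up with, and the `Distribution` type and the free-standing `lpdf` accessor are names I made up for illustration. A NamedTuple happens to support both the record-access style and the tuple style.

```python
import math
import random
from typing import Callable, NamedTuple


class Distribution(NamedTuple):
    """Hypothetical record of the functions that make up a distribution."""
    lpdf: Callable[[float], float]
    cdf: Callable[[float], float]
    ccdf: Callable[[float], float]
    rng: Callable[[], float]


def normal(mu: float, sigma: float) -> Distribution:
    """Return a record of closures that capture mu and sigma."""
    def lpdf(y: float) -> float:
        z = (y - mu) / sigma
        return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

    def cdf(y: float) -> float:
        return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

    def ccdf(y: float) -> float:
        return 1.0 - cdf(y)

    def rng() -> float:
        return random.gauss(mu, sigma)

    return Distribution(lpdf=lpdf, cdf=cdf, ccdf=ccdf, rng=rng)


def lpdf(dist: Distribution) -> Callable[[float], float]:
    """Accessor for the lpdf(normal(mu, sigma)) spelling."""
    return dist.lpdf


# Both spellings from the discussion pick out the same function.
a = normal(0.0, 1.0).lpdf(0.5)
b = lpdf(normal(0.0, 1.0))(0.5)
```

The accessor version needs no record syntax at all, since `normal(0.0, 1.0)[0]` is also the lpdf here.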

I want to take this a step further and think about what user-defined gradients and user-defined transformations/bijectors would look like as well, since we’d like to re-use patterns as much as possible. Transformations can follow the distribution path pretty seamlessly: to define a transformation, define a function that returns a record with forward, inverse, and log_det_jacobian (or some similar name) for the Jacobian adjustment. This requires closures, which are hopefully coming soon.
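
A minimal sketch of that transformation pattern, again in Python as a stand-in. The `Transform` type and the `lower_bound` example are hypothetical, and I am assuming the convention that forward maps unconstrained to constrained and that log_det_jacobian is evaluated at the unconstrained value; we could just as well pick the opposite direction.

```python
import math
from typing import Callable, NamedTuple


class Transform(NamedTuple):
    """Hypothetical record of the functions that make up a bijector."""
    forward: Callable[[float], float]
    inverse: Callable[[float], float]
    log_det_jacobian: Callable[[float], float]


def lower_bound(lb: float) -> Transform:
    """Bijector from the reals onto (lb, inf); every closure captures lb."""
    return Transform(
        # unconstrained x -> constrained y = lb + exp(x)
        forward=lambda x: lb + math.exp(x),
        # constrained y -> unconstrained x = log(y - lb)
        inverse=lambda y: math.log(y - lb),
        # log |d forward / dx| = log(exp(x)) = x
        log_det_jacobian=lambda x: x,
    )


t = lower_bound(2.0)
y = t.forward(0.3)          # constrained value, strictly greater than 2.0
x = t.inverse(y)            # round trip back to the unconstrained scale
adj = t.log_det_jacobian(x)  # Jacobian adjustment added to the log density
```

The closure requirement shows up in `lower_bound`: all three functions need to see `lb` without it being passed again at each call.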

User defined gradients COULD follow a similar path, giving a function that returns a record containing a value and derivative functions. @bbbales2 what are your thoughts about this? Then, to construct a user-defined distribution that has custom gradients, you do something like this:

def normal(real mu, real sigma) {
  auto value = (real y) { 1/sqrt(2*pi... };
  auto derivative = (real y) { { (y - mu) / sigma^2, (mu - y) / sigma^2, ... } };
  return {lpdf = {value = value, derivative = derivative}};
}
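
For comparison, here is the same record-of-records shape written out in full as a Python sketch. The names `WithGradient` and `normal_with_grad` are made up, and the ordering of the partials as (d/dy, d/dmu, d/dsigma) is my assumption, not something we settled on.

```python
import math
from typing import Callable, Dict, NamedTuple, Tuple


class WithGradient(NamedTuple):
    """Hypothetical record pairing a function with its hand-written partials."""
    value: Callable[[float], float]
    derivative: Callable[[float], Tuple[float, float, float]]


def normal_with_grad(mu: float, sigma: float) -> Dict[str, WithGradient]:
    def value(y: float) -> float:
        z = (y - mu) / sigma
        return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

    def derivative(y: float) -> Tuple[float, float, float]:
        z = (y - mu) / sigma
        d_y = (mu - y) / sigma ** 2      # partial with respect to y
        d_mu = (y - mu) / sigma ** 2     # partial with respect to mu
        d_sigma = (z * z - 1.0) / sigma  # partial with respect to sigma
        return (d_y, d_mu, d_sigma)

    # Nested record: the distribution record holds an lpdf entry,
    # which is itself a value/derivative record.
    return {"lpdf": WithGradient(value=value, derivative=derivative)}


dist = normal_with_grad(1.0, 2.0)
val = dist["lpdf"].value(0.5)
grads = dist["lpdf"].derivative(0.5)
```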

I think this is kinda bloated syntactically. We’re in a DSL, after all. And we’ve already made a decision to use string suffixes to group things before, namely _lpdf, _rng, and _cdf. We could pull that back in Stan 3, or we could lean in, adding something like _grad for user-defined gradients and _forward, _inverse, and _log_det_jacobian for transformations. And then we can represent this stuff internally in the MIR as nice nested records or something.
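
As a toy illustration of the "lean in" option, here is a Python sketch of how suffixed function names could be folded into nested records. This is not how stanc3 works today; the suffix list, the `group_by_suffix` helper, and the example function names are all assumptions for the sake of the example.

```python
from typing import Dict, Iterable

# Suffixes mentioned above: the existing ones plus the proposed additions.
SUFFIXES = ("lpdf", "rng", "cdf", "grad",
            "forward", "inverse", "log_det_jacobian")


def group_by_suffix(names: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Fold names like foo_lpdf and foo_rng into {"foo": {"lpdf": ..., "rng": ...}}."""
    records: Dict[str, Dict[str, str]] = {}
    for name in names:
        for suffix in SUFFIXES:
            tail = "_" + suffix
            # Require a non-empty base name in front of the suffix.
            if name.endswith(tail) and len(name) > len(tail):
                base = name[: -len(tail)]
                records.setdefault(base, {})[suffix] = name
                break
    return records


grouped = group_by_suffix([
    "foo_lpdf", "foo_rng",
    "softplus_forward", "softplus_inverse", "softplus_log_det_jacobian",
    "helper",  # no recognized suffix, so it is not grouped
])
```

Users keep writing flat, suffixed functions, and the nesting only exists in the internal representation.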

I’m going to post a separate thread about syntax for user defined gradients and user defined transformations…