Posterior predictive sampling from a user-defined probability distribution

I have been working on modeling some financial data with Stan and have settled on fitting a skewed generalized t-distribution as the best way to capture the features of the data. I lifted the log-probability function from this post from 2017 and was able to fit the model successfully.

My question is this: with my fitted model, how can I go about sampling from the posterior predictive distribution? Thus far with Stan, I have only fit models with corresponding built-in _rng functions, which make this simple. As far as I'm aware, there's no analytic form for the CDF of this distribution, so inverse-transform sampling seems like a no-go. Are there suggestions for efficiently sampling from custom distributions like this within Stan?

Thanks in advance for any responses. I'm fairly new to Stan, so any help would be appreciated.


The code used is below. I am still iterating on it, but it should be enough to convey the question I'm asking. The model itself is also quite slow, so any performance suggestions based on the code below would be appreciated as well!

functions {
    real sgt_lpdf(real x, real mu, real s, real l, real p, real q) {
        // Skewed generalised t
        real lz1 = lbeta(1.0/p, q);
        real lz2 = lbeta(2.0/p, q - 1.0/p);
        real v = q^(-1.0/p)
                 * ((3*l^2 + 1)*exp(lbeta(3.0/p, q - 2.0/p) - lz1)
                    - 4*l^2*exp(lz2 - lz1)^2)^(-0.5);
        real m = 2*v*s*l*q^(1.0/p)*exp(lz2 - lz1);
        real r = x - mu + m;
        // the sign of r selects the left/right branch of the skewed density
        real sgn = r < 0 ? -1 : 1;
        return log(p)
               - log(2*v*s*q^(1.0/p)*exp(lz1)
                     * (fabs(r)^p / (q*(v*s)^p*(l*sgn + 1)^p) + 1)^(1.0/p + q));
    }
}
data {
    int<lower=0> J;
    vector[J] X;
}
parameters {
    real alpha;
    real<lower=0> sigma;
    real<lower=-1, upper=0> lambda;
    real<lower=1> p;
    real<lower=3.0/p,upper=50*p> q;
    real<lower=-1, upper=1> beta;
}
transformed parameters {
    vector[J-1] mu = alpha + beta*X[1:(J-1)];
}
model {
    alpha ~ normal(0, 1);
    sigma ~ normal(0, 1);
    p ~ normal(2, 1);
    q ~ normal(2, 1);
    lambda ~ normal(0, 1);
    beta ~ normal(0, 1);
    for (j in 2:J) {
        X[j] ~ sgt(mu[j-1], sigma, lambda, p, q);
    }
}

This is not easy to do in Stan. A quick search yielded

https://scholar.google.com/scholar?cluster=7282994274836169847&hl=en&as_sdt=0,33&sciodt=0,33

which provides a mixture representation in its equation (4), but you would need a way to draw from a skewed exponential power distribution and a generalized gamma distribution (neither of which is built into Stan).


Thanks for the response, @bgoodri.

Probably stepping beyond the scope of this forum, but I imagine my best bet from here would be to use my Stan model to generate posterior draws of the parameters, and then, either by rolling my own sampler or using an existing sampling library, generate draws from an SGT distribution at those posterior parameter values to get my posterior predictive draws. Does that seem like a reasonable approach?
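For what it's worth, here is a rough Python sketch of that two-step idea. The names (`sgt_logpdf`, `sgt_rng`) are hypothetical; the log-density is transcribed from the Stan function above, and the sampler is a deliberately crude grid-based inverse-CDF draw rather than anything optimized:

```python
import numpy as np
from scipy.special import betaln

def sgt_logpdf(x, mu, s, l, p, q):
    # Log-density of the skewed generalized t, transcribed from the
    # Stan sgt_lpdf in the model above.
    lz1 = betaln(1.0 / p, q)
    lz2 = betaln(2.0 / p, q - 1.0 / p)
    v = q ** (-1.0 / p) * ((3 * l**2 + 1) * np.exp(betaln(3.0 / p, q - 2.0 / p) - lz1)
                           - 4 * l**2 * np.exp(lz2 - lz1) ** 2) ** -0.5
    m = 2 * v * s * l * q ** (1.0 / p) * np.exp(lz2 - lz1)
    r = np.asarray(x) - mu + m
    sgn = np.where(r < 0, -1.0, 1.0)
    return (np.log(p)
            - np.log(2 * v * s * q ** (1.0 / p) * np.exp(lz1)
                     * (np.abs(r) ** p / (q * (v * s) ** p * (l * sgn + 1) ** p) + 1)
                     ** (1.0 / p + q)))

def sgt_rng(mu, s, l, p, q, rng, half_width=30.0, n_grid=4001):
    # Crude inverse-CDF draw: tabulate the density on a grid around mu,
    # accumulate into an approximate CDF, and invert by interpolation.
    xs = np.linspace(mu - half_width * s, mu + half_width * s, n_grid)
    cdf = np.cumsum(np.exp(sgt_logpdf(xs, mu, s, l, p, q)))
    cdf /= cdf[-1]
    return np.interp(rng.uniform(), cdf, xs)

# Posterior predictive: one SGT draw per posterior draw of the parameters.
# `posterior` is a placeholder; in practice these tuples come from the fit
# (for this model, mu for observation j is alpha + beta*X[j-1] per draw).
rng = np.random.default_rng(0)
posterior = [(0.1, 1.0, -0.2, 2.0, 10.0)] * 100  # placeholder draws
x_rep = np.array([sgt_rng(mu, s, l, p, q, rng) for mu, s, l, p, q in posterior])
```

For very heavy tails (small q) the fixed grid width and resolution would need widening or adaptive bounds, but for a quick posterior predictive check this kind of thing is usually adequate.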

Yeah, there is an R package with an rsgt function you could use, which inverts the CDF numerically via root-finding.
