I’m working with a complex stochastic model and would like some guidelines on how to tackle it.
1. Model description
The model consists of a set of pdfs with unknown parameters. The pdfs are sampled in a controlled way, producing a collection of molecules. These molecules are then used to calculate some mixture properties that can be compared to data.
The goal is to sample the pdfs’ parameters using Stan.
2. Likelihood derivatives
Can HMC handle stochastic likelihoods?
3. External C++ code to calculate likelihood mean
I know there are some guidelines in the documentation, but I couldn’t get my head around them. (I’m not really a programming expert.)
Is there a step-by-step procedure for doing this? Or is there anyone who could help me?
I’ve written the model in C++ and it is VERY BIG; it would be very difficult to translate into the Stan language.
Well, my model would calculate the mean of a known distribution, the normal for example, and I would let Stan handle the gradients. My major concern was the numerical stability, since the mean itself might be random.
Sounds like you want to sample A using your big C++ model and then use Stan to generate B using P(B|A). That’s a procedure you could use, and if P(A|B) and P(B|A) are two steps in a valid Gibbs sampler then you might get somewhere. You don’t describe the whole procedure so I can’t tell, but you can certainly use Stan as part of a Gibbs sampler (when you would want to, I’m not sure, but that’s OK).
data {
  // here goes some data
}
parameters {
  // some parameters
}
transformed parameters {
  real mu = f(theta);  // external C++ function of the parameters
}
model {
  y ~ normal(mu, sigma);
}
If f were written in the Stan language it would calculate both f and df/d(parameters). If your C++ version of f does that, then you can proceed in Stan; if it doesn’t, you can’t use HMC regardless of the programming environment.
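To make that concrete, here is a rough sketch of what the C++ side could look like, loosely following the “using external C++ code” docs. The function name f, the placeholder computations, and the exact signature are hypothetical (the overloads Stan expects depend on the interface and version), but the key point is that the reverse-mode overload has to supply the analytic df/d(theta) itself, e.g. with stan::math::precomputed_gradients:

// my_f.hpp -- hypothetical header supplied alongside the Stan model, matching a
// declaration like  real f(vector theta);  in the functions block and compiled
// with the allow-undefined option.
#include <stan/math.hpp>
#include <ostream>
#include <vector>

// double overload: used when only the value of f is needed
double f(const Eigen::VectorXd& theta, std::ostream* pstream__) {
  // ... run the big deterministic C++ model here ...
  return theta.sum();  // placeholder
}

// reverse-mode overload: must also supply the analytic df/d(theta)
stan::math::var f(const Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1>& theta,
                  std::ostream* pstream__) {
  Eigen::VectorXd theta_val = stan::math::value_of(theta);
  double val = theta_val.sum();                   // value of f (placeholder)
  std::vector<double> grad(theta.size(), 1.0);    // analytic df/d(theta_i) (placeholder)
  std::vector<stan::math::var> operands(theta.data(), theta.data() + theta.size());
  return stan::math::precomputed_gradients(val, operands, grad);
}

How you point the compiler at the header differs between interfaces (I believe rstan has allow_undefined/includes arguments and CmdStan uses a USER_HEADER plus --allow-undefined), so check the external C++ documentation for your interface.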
@bgoodri wrote some doc for it but I can’t find it now. Maybe search Discourse or wait for him to reply.
Yeah, I think people miss the fact that the joy of the Stan language is that you write the posterior function and you get the posterior gradient w.r.t. the parameters for free. It removes a huge source of bugs in the use of HMC.
My major concern was the numerical stability, since the mean itself might be random.
This sounds like you’re in intractable likelihood territory.
f here should be deterministic. If that’s true, you should be okay, but if this is estimated with some Monte Carlo or something, that’s dubious. Even if the expectation exists, I doubt it’d be easy to get enough precision to keep the HMC happy.
transformed parameters {
  real mu = f(theta);  // external C++ function of the parameters
}
But ignore this if your function is actually deterministic :D. Good luck!
@bbbales2, you’re actually right! f is calculated through a Monte Carlo process. For my MLE estimation I need to use stochastic optimization algorithms. So, no HMC (or Stan) for me.
I was wondering if a Gibbs sampler would do the trick… what do you guys think?
Well, I can evaluate the likelihood. What may happen is that for a specific point in the parameter space I get multiple likelihood values due to its stochastic nature.
Clearly this is very bad for derivatives, but I thought it might work in a derivative-free algorithm. (I’m basically doing that when using MLE with particle swarm optimization.) What do you think?
Thanks for the tip on ABC; it sounds pretty interesting, I’ll definitely check it out.
While we don’t have a way for you to do that built into Stan (there’s no way to do randomization during the likelihood calculation in Stan), you can do it in an outside function. We use techniques like this for ADVI and in the max marginal mode optimization we’re doing, too. But you’d have to build it on the outside. If you use a mean of a Monte Carlo process, then you can calculate the gradients piecewise.
It is a lot of work, though!
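For example (a toy sketch; the function g below is made up): if the mean is a Monte Carlo average of g(theta, u_i) with the draws u_i held fixed, its gradient is just the average of the per-draw gradients dg/d(theta):

// Toy illustration: the gradient of a Monte Carlo average is the average of the
// per-draw gradients, provided the draws u_i are held fixed.
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

// made-up per-draw term g(theta, u) and its derivative in theta
double g(double theta, double u)         { return std::exp(theta * u); }
double dg_dtheta(double theta, double u) { return u * std::exp(theta * u); }

int main() {
  std::mt19937 rng(1234);                        // fixed seed: the same u_i every call
  std::normal_distribution<double> unit(0.0, 1.0);
  std::vector<double> u(1000);
  for (double& ui : u) ui = unit(rng);

  double theta = 0.3, mu = 0.0, dmu = 0.0;
  for (double ui : u) {                          // "piecewise": value and gradient per draw
    mu  += g(theta, ui) / u.size();
    dmu += dg_dtheta(theta, ui) / u.size();
  }
  std::printf("mu(theta) ~ %.4f   dmu/dtheta ~ %.4f\n", mu, dmu);
  return 0;
}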
No, but they can require randomization to evaluate. For example, they may be stated in terms of an integral that requires some randomization in the form of (MC)MC methods to evaluate in practice. This is what happens when we marginalize.
p(a) = ∫_B p(a, b) db
To evaluate p(a), we have to compute an integral. To evaluate the integral, we have to use Monte Carlo methods. That makes our evaluation of p(a) itself stochastic, even though the density itself isn’t random.
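A minimal sketch of that situation, with made-up densities (b ~ normal(0, 1), a | b ~ normal(b, 1)): the marginal p(a) is estimated by averaging p(a | b_i) over draws b_i ~ p(b), so evaluating it twice at the same a gives two slightly different numbers:

// Monte Carlo evaluation of a marginal: p(a) = INT p(a | b) p(b) db is estimated
// by (1/N) * sum_i p(a | b_i) with b_i ~ p(b). Densities are for illustration only.
#include <cmath>
#include <cstdio>
#include <random>

double normal_pdf(double x, double m, double s) {
  const double two_pi = 6.283185307179586;
  double z = (x - m) / s;
  return std::exp(-0.5 * z * z) / (s * std::sqrt(two_pi));
}

double p_a_hat(double a, int n, std::mt19937& rng) {
  std::normal_distribution<double> p_b(0.0, 1.0);                    // p(b)
  double sum = 0.0;
  for (int i = 0; i < n; ++i) sum += normal_pdf(a, p_b(rng), 1.0);   // p(a | b_i)
  return sum / n;
}

int main() {
  std::mt19937 rng(std::random_device{}());
  // Two evaluations at the same a give slightly different numbers: the estimate
  // of p(a) is random even though p(a) itself is a fixed function of a.
  std::printf("p_hat(1.0) = %.5f\n", p_a_hat(1.0, 1000, rng));
  std::printf("p_hat(1.0) = %.5f\n", p_a_hat(1.0, 1000, rng));
  return 0;
}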
Right, but in that case the joint density that you’re specifying still isn’t random; it’s just the implementation of the marginalization that’s random, due to the use of a stochastic algorithm like MCMC. I think that @bbbales2’s point is the one that we should be reinforcing.