Decomposing models: fitting a simpler one and using the fitted simple model to fit the more complex one


#1

Hi,

I was wondering whether the following kind of “modus operandi” is OK/common in the Stan / Bayesian community:

Let’s say I want to fit an HMM, but first I simply fit a GMM and extract its parameters.

Then, using those extracted parameters, I fit the corresponding HMM but keep the already determined GMM parameters “fixed” => this way I can reduce the computational cost.
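
Roughly something like the sketch below (just a toy example I put together, not code from a real project): a Gaussian-emission HMM where the emission means and sds come from the separate GMM fit and are passed in as data, so only the transition matrix is estimated.

```stan
data {
  int<lower=1> N;                 // length of the series
  int<lower=2> K;                 // number of hidden states
  vector[N] y;                    // observations
  vector[K] mu;                   // emission means, fixed from the GMM fit
  vector<lower=0>[K] sigma;       // emission sds, fixed from the GMM fit
}
parameters {
  array[K] simplex[K] theta;      // transition probabilities (row j = from state j)
}
model {
  // forward algorithm on the log scale, uniform initial state distribution
  array[N, K] real gamma;
  for (k in 1:K)
    gamma[1, k] = normal_lpdf(y[1] | mu[k], sigma[k]);
  for (t in 2:N) {
    for (k in 1:K) {
      array[K] real acc;
      for (j in 1:K)
        acc[j] = gamma[t - 1, j] + log(theta[j, k])
                 + normal_lpdf(y[t] | mu[k], sigma[k]);
      gamma[t, k] = log_sum_exp(acc);
    }
  }
  target += log_sum_exp(gamma[N]);
}
```

Since mu and sigma are data here, the sampler only has to explore the transition simplexes, which is the whole point of the trick.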

But does this even make sense? If I remember correctly, the author of ScalaStan was perhaps hinting at something similar. Not sure though.

Cheers,

Jozsef


#2

People do this and generally refer to it as a multi-stage procedure or something similar. One problem is that you don’t correctly account for uncertainty in stage 1 when doing stage 2 (or for correlation between stage 1 and stage 2 parameters), but when/whether that’s an issue depends on the problem, and sometimes stage 1 has much more data so its uncertainties are tiny compared to stage 2. It’s a great way to start looking at a multi-component problem because you can test out the simpler models before putting them together. Hope that helps.
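
One partial workaround for the uncertainty issue, for what it’s worth: feed the stage-1 posterior summary into stage 2 as an informative prior instead of a fixed constant. A minimal sketch with made-up names (mu_hat, mu_se) and a toy normal likelihood rather than the full HMM:

```stan
data {
  int<lower=1> N;
  vector[N] y;
  real mu_hat;            // stage-1 posterior mean (hypothetical name)
  real<lower=0> mu_se;    // stage-1 posterior sd (hypothetical name)
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // carry some stage-1 uncertainty forward as an informative prior
  mu ~ normal(mu_hat, mu_se);
  sigma ~ normal(0, 5);
  y ~ normal(mu, sigma);
}
```

This still ignores correlations between stage-1 and stage-2 parameters, but it’s usually better than pretending the stage-1 estimate is exact.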


#3

Thanks for the comment @sakrejda . It might also be the case that the composite model is computationally intractable … :( as the complexity grows with the third power of the number of parameters: double the number of parameters and you need 8 times more CPU time. And AFAIK Stan ATM is not really parallel.

Nevertheless, putting together stuff from components is a pretty nice idea! This is why I like the ScalaStan interface: it makes model building compositional, as in function composition.

Cheers,

Jozsef


#4

Stan is parallel, and the parallelism is even useful!


#5

Hmm, interesting. Greta has now started to use TensorFlow. I wonder what is parallel, and where?

In Stan I can run chains on separate cores, which is decently parallel I assume, or not?

So what does Greta get from TensorFlow, in addition to running chains on different cores? I wonder :)


#6

In Stan parallelism is within chain; I assume it is in Greta too. Last I checked, Stan ran a larger selection of models because of the functions it can autodiff through. An up-to-date comparison would be great, if you are interested in working out the various capabilities!
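
For reference, in recent Stan releases within-chain parallelism is exposed through reduce_sum (map_rect was the earlier mechanism). Here is a minimal sketch over a toy normal likelihood; the data names and grainsize are just placeholders, and you need to build with threading (STAN_THREADS) for it to actually run in parallel:

```stan
functions {
  // log-likelihood of one slice of y; reduce_sum adds up the slices,
  // possibly in parallel across threads
  real partial_sum(array[] real y_slice, int start, int end,
                   real mu, real sigma) {
    return normal_lpdf(y_slice | mu, sigma);
  }
}
data {
  int<lower=1> N;
  array[N] real y;
  int<lower=1> grainsize;   // how finely to slice the data
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 5);
  sigma ~ normal(0, 5);
  target += reduce_sum(partial_sum, y, grainsize, mu, sigma);
}
```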


#7

Well, I don’t really know, but I think with TF it can do more than just running different chains on separate CPUs.

Also, this looks kind of interesting: https://www.tensorflow.org/probability/api_docs/python/tfp/edward2/Mixture

However, the Edward community seems to be pretty dead :(