Graphical BN to Stan code


#1

Hi,

I am new to Stan and I was wondering if it is possible to translate a graphical BN that describes a specific loss process into Stan code.
The nodes of my BN are various RVs and the arcs are our understanding of the relationship between those RVs.
We are dealing with 10 to 100 nodes, and the RVs can be discrete-state or continuous-state (e.g. a loss amount in dollars).
Not only can the nodes (RVs) be discrete or continuous, but their parent nodes can be either as well.
The data are quite sparse and often incomplete. We are talking about maybe 20k data points, of which 5k or fewer are complete.
The BN is currently specified and solved using gRain and bnlearn R packages.

Given such a use case, are you aware of anyone who has done such a translation before? Does it make sense in general, or is Stan not the right tool for this?

Thanks.


#2

There is no such tool, as far as I am aware. It would be really hard to map a graph in general to the Stan syntax, but harder still to come up with a multivariate likelihood involving discrete and continuous variables. Missingness on discrete variables is also incredibly tedious to deal with in Stan, but we can’t recommend anything else that yields the right answer with sufficiently high probability in general.


#3

I was not asking whether a tool exists for the translation, but whether I could manually translate my graphical BN (specified in the gRain R package) into Stan code.
Are you saying that in Stan it is difficult to work with discrete state random variables?


#4

If by difficult you mean impossible, then yes. They have to be marginalized out. There are examples in the manual but it is tedious.


#5

OK, I see. So it is not possible to work with an RV that is, for example, Boolean? I have plenty of those in my BN, each with an associated probability mass function.


#6

If the Boolean variable is latent or partially unknown (i.e. it has missingness), then you have to marginalize it out, which is not so bad because there are only two possibilities. If the Boolean variable is fully observed, then you just have a Bernoulli conditional model for it.
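
To make the two cases concrete, here is a minimal sketch in Python rather than Stan, so it is only an illustration of the arithmetic. It assumes a hypothetical two-node fragment that is not taken from the original BN: a Boolean node `z` with P(z = 1) = `theta`, and a continuous child `y` that is normal with mean `mu0` or `mu1` depending on `z`. All names and the choice of a normal child are assumptions. When `z` is observed, the contribution is the Bernoulli term plus the child term; when `z` is missing, the two possibilities are summed on the log scale.

```python
import math


def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    if m == -math.inf:
        return -math.inf
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)


def loglik_one(y, z, theta, mu0, mu1, sigma):
    """Log-likelihood contribution of one record (hypothetical model).

    z is 1, 0, or None; None means the Boolean node is missing.
    """
    # Joint log probability of (z = 0, y) and of (z = 1, y).
    lp0 = math.log1p(-theta) + normal_lpdf(y, mu0, sigma)
    lp1 = math.log(theta) + normal_lpdf(y, mu1, sigma)
    if z is None:
        # Missing: marginalize over the two possible states.
        return log_sum_exp(lp0, lp1)
    # Observed: plain Bernoulli term plus the conditional child term.
    return lp1 if z == 1 else lp0
```

In a Stan program the same sum would typically be written with `log_sum_exp` inside the model block, with one such term per incomplete record. With several missing Boolean parents on the same record, the sum runs over every joint configuration of the missing ones, which is where the tedium comes from.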


#7

As you can see, it’s very hard to answer these questions in general. Have you looked through the Stan manual to see what it can express? Check out the latent discrete parameters chapter (and missing data chapter if you have missingness). Sometimes the missingness doesn’t even matter, so it will depend on the model.

Are you familiar with BUGS, which makes this translation very easy? (But then it can be very challenging to fit with lots of discrete parameters or if there is high posterior correlation among parameters.)

As @bgoodri pointed out, there’s no way to implement discrete parameters in Stan directly—they have to be marginalized out if you want to work in Stan. The manual chapters I cited show how to do that and why it’s much more effective than sampling them (even if you could do pure MC samples of the discrete parameters).


#8

It is not really the parameters that are discrete. Rather, some of the RVs have discrete states, and their conditional probability distributions (in the context of the graphical BN) also depend on the states of parent RVs, which can be either continuous or discrete. Is it far-fetched to model this with Stan?

Where can I download the Stan manual?

Thank you.


#9

From the Stan web page, users >> documentation. There’s also lots of other doc there.

Not sure what that means. Are there unknowns restricted to a finite or countable set of values?