I’m looking into a continuous Bayesian network first described in ‘A continuous variable Bayesian networks model for water quality modeling: A case study of setting nitrogen criterion for small rivers and streams in Ohio, USA’. It doesn’t require normally distributed data, linear relations, and so on. Was it implemented in Stan?
I just did a search in the paper and it doesn’t look like they used Stan. The algorithm they describe is definitely different from what Stan does.
The regression models listed in section 2.3 don’t jump out to me as impractical to fit in Stan, so it seems like they could be used. What isn’t clear to me is whether the Bayesian updating thing is part of the inference strategy or it has to do with the data collection somehow. I’d guess that would disappear from the equivalent Stan model but I could be wrong.
I reviewed the model in the paper, and I think it has some weaknesses for my case: high dimensionality and a lack of prior knowledge on my side, since the regression models in the paper rest on a lot of scientific basis.
But is what Stan does for continuous Bayesian models still restricted to normal distributions and linear relations between the variables?
Stan works fine with non-linear regression and with models that have non-normal noise assumptions.
Check out the “Regression Models” chapter of the 2.17 manual (https://github.com/stan-dev/stan/releases/download/v2.17.0/stan-reference-2.17.0.pdf). That might give you a clearer idea of the flexibility.
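To make that concrete, here is a minimal sketch of the kind of quantity a sampler evaluates for a non-linear regression with non-normal noise. It is written in plain Python rather than Stan, and the saturating mean function, the Exponential(1) priors, and the names `a`, `b`, `xs`, `ys` are all assumptions made up for illustration, not anything taken from the paper. A Stan model block would accumulate the same sum of log densities.

```python
import math


def log_posterior(a, b, xs, ys):
    """Unnormalized log posterior for y_i ~ Exponential(mean = a * x_i / (b + x_i)).

    Assumes every x_i > 0 and every y_i >= 0 (hypothetical setup).
    """
    if a <= 0 or b <= 0:
        return -math.inf  # parameters are constrained to be positive
    # Exponential(1) priors on a and b: an assumption for this sketch only.
    lp = -a - b
    for x, y in zip(xs, ys):
        mu = a * x / (b + x)          # non-linear (saturating) mean
        lp += -math.log(mu) - y / mu  # exponential log density with mean mu
    return lp
```

Nothing here relies on normality or linearity: the only requirement is that the log density can be written down and evaluated for given parameter values.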
Thanks, sir! I will look into the PDF and try to contact the authors to learn more. Also, regarding Bayesian models specifically: does Stan still treat continuous Bayesian networks with restrictions such as the data needing to be Gaussian, the arcs being linear models, and so on? I’m trying to avoid discretization, but I think that is impractical given the present state of the art.
Again, thanks for the help!
If I’m interpreting this correctly, Stan wouldn’t need these assumptions. But assumptions like that might make it easier to code a fast Stan model.
What kind of discretization do you mean here? Edit: I’m not very familiar with graphical models – is there a Wikipedia link that describes this? Maybe that would help me answer the question more clearly.
Those two papers:
They show a problem with discretized Bayesian networks. To summarize: the discretization defines the model structure/CPT, so you need to be very careful when making decisions from those models. I’m working with a small data set, high dimensionality, no prior knowledge, and data that probably don’t follow a normal distribution (in my earlier analysis they looked exponentially distributed). The first paper I mentioned proposed a “perfect” continuous Bayesian network without the normal distribution restriction or linear relations between the variables! But the authors made a lot of assumptions based on prior knowledge to construct the structural model (DAG), meaning there isn’t any automatic structure learning algorithm as there is for discretized Bayesian networks (such as score-based and constraint-based algorithms), and I won’t have that knowledge. That’s why I’m asking whether I can build a continuous Bayesian model like the one from the paper in Stan.
My goal is to escape discretization.
Again, thanks a lot
Aah, okay, just based on the abstracts my answer is:
If you were building these graphical models in Stan, you’d want to work with whatever regular ol’ conditional probability distributions you used to write out the model.
So no discretizing.
If these conditional distributions are on the data (that you measured) and not the parameters (that you are trying to infer) it seems like maybe you could do this, but then the advantage of the discretization seems less obvious to me unless it represents some sort of censoring/truncation (check “12. Truncated or Censored Data” in the 2.17 manual).
So it’s possible to create a continuous Bayesian network with the characteristics that I mentioned? Can you recommend which chapters to read in the Stan PDF?
Haha, well I can’t give you an absolute yes or a no without actually knowing what I’m talking about, but the most likely things that would prevent this from working would be:
It’s possible there are large numbers of parameters that are being marginalized out courtesy of those linearity + normality assumptions. If the number is too large, you will probably have trouble working with this model for computational reasons unless you can do all those marginalizations in the Stan model as well.
It’s possible there are problems with the identifiability or specification of the model that NUTS is more sensitive to than other inference methods. The advantage here is that you get a way to diagnose problems with your model! The downside is the models might be broken in some weird way.
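On the first point, this sketch shows why linearity + normality makes marginalization cheap. If a latent `z` is normal and `y` given `z` is normal with a mean that is linear in `z`, then `y` is itself normal with mean `a*m + b` and variance `a^2 s^2 + t^2`, so `z` never has to be sampled. The function and parameter names are made up for illustration, and the quadrature routine is only there to show the work that the closed form avoids.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def marginal_closed_form(y, m, s, a, b, t):
    """Density of y with z integrated out, for z ~ N(m, s), y | z ~ N(a*z + b, t)."""
    return normal_pdf(y, a * m + b, math.sqrt(a * a * s * s + t * t))


def marginal_by_quadrature(y, m, s, a, b, t, n=4000, width=10.0):
    """Same density via a trapezoid rule over z (standard deviations s and t)."""
    lo, hi = m - width * s, m + width * s
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * normal_pdf(z, m, s) * normal_pdf(y, a * z + b, t)
    return total * h
```

Once an arc is non-linear or a node is non-normal, a closed form like this generally isn't available, so the latent quantities stay in the model as parameters and the dimension the sampler has to explore grows.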
Best way to find out how it works is to code something up. Probably the best place to start with Stan is the regression model section in the examples of the manual.
The existence of non-linear transformations and non-normal noise assumptions shouldn’t prevent you from writing the model out though.