Bayesian random forest

This and other papers by Wilson are some of those I meant. In addition, I was specifically thinking of this one: [1809.06452] Robustness Guarantees for Bayesian Inference with Gaussian Processes

Yes, this is true. What about Neal’s work on BNNs? What about Edward? TorchBNN? Shouldn’t there be some utility there?

Fundamentally, what the posterior looks like depends on the model. I am lost…

Thanks for the references on feature generation.

Of course there is some utility; I never claimed otherwise. Let me try again. Except for some trivial cases (like a NN equivalent to linear or logistic regression), no one knows how to integrate over the posterior of a NN in finite time with controlled integration error. Using MCMC or VI for NNs can improve predictive performance compared to, e.g., plain optimization, even if they are not producing accurate posterior approximations. Thus there is some utility, and I’m also fine with sometimes using machine learning to get useful predictions, but then we need to be aware of two potential problems:

1) it’s more difficult to know what would happen if we used more computation time (early stopping is also known to be beneficial in machine learning), and
2) it’s impossible to separate the actual model and prior from the implicit prior produced by the biased integration.

BNNs with MCMC and VI are just like other machine learning algorithms, i.e., they take inspiration from other fields, but in the end the utility is measured by repeated experiments with different training and test data sets.
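To make the point concrete, here is a minimal sketch, assuming nothing beyond NumPy: random-walk Metropolis over a toy one-hidden-unit "network" (the data, tuning constants, and names are all illustrative, not from any package mentioned above). The chain is asymptotically exact, but for any finite run the integration error over the NN posterior is uncontrolled, which is the situation above in miniature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = tanh(2x) + noise.
x = rng.uniform(-2, 2, size=40)
y = np.tanh(2 * x) + 0.1 * rng.normal(size=40)

def log_posterior(w, x, y, sigma=0.1, prior_sd=1.0):
    """Unnormalized log posterior of a 1-hidden-unit net f(x) = w2*tanh(w1*x) + b."""
    w1, w2, b = w
    pred = w2 * np.tanh(w1 * x) + b
    log_lik = -0.5 * np.sum((y - pred) ** 2) / sigma**2
    log_prior = -0.5 * np.sum(w**2) / prior_sd**2
    return log_lik + log_prior

# Random-walk Metropolis: correct in the limit, but nothing bounds
# the integration error of a finite chain over this posterior.
w = np.zeros(3)
lp = log_posterior(w, x, y)
draws = []
for _ in range(20000):
    prop = w + 0.1 * rng.normal(size=3)
    lp_prop = log_posterior(prop, x, y)
    if np.log(rng.uniform()) < lp_prop - lp:
        w, lp = prop, lp_prop
    draws.append(w.copy())
draws = np.array(draws[10000:])  # discard burn-in

# Predictive mean at a test point: Monte Carlo average over posterior draws.
x_test = 0.5
preds = draws[:, 1] * np.tanh(draws[:, 0] * x_test) + draws[:, 2]
print(f"posterior predictive mean at x=0.5: {preds.mean():.3f}")
```

Whether the predictive mean from such a truncated chain is any good can only be checked the way the paragraph above says: repeated experiments on held-out data.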


What about the ARD claimed by Neal? Feature importance could be established with a random forest, but based on this thread a Bayesian random forest is even less tractable than a BNN. What does one have to keep in mind when deciding that a feature is not important because the 95% CR of its weight estimates (from the input layer to the first hidden layer) is close to 0?
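One thing to keep in mind: the mechanical check is easy (a sketch below, with a made-up array of posterior draws standing in for real sampler output), but per-weight credible intervals in a NN are hard to interpret because the posterior has sign-flip and hidden-unit permutation symmetries. Neal’s ARD instead puts a shared hierarchical scale on all weights leaving a given input and inspects the posterior of that scale, which sidesteps the per-weight ambiguity.

```python
import numpy as np

# Hypothetical posterior draws of input-to-hidden weights,
# shape (n_draws, n_inputs, n_hidden); replace with your sampler's output.
rng = np.random.default_rng(1)
w_draws = rng.normal(0, 1, size=(4000, 5, 8))
w_draws[:, 2, :] *= 0.01  # pretend input 2 is irrelevant

# Per-weight 95% credible intervals and whether each covers 0.
lo, hi = np.percentile(w_draws, [2.5, 97.5], axis=0)
covers_zero = (lo < 0) & (hi > 0)

for j in range(w_draws.shape[1]):
    print(f"input {j}: {covers_zero[j].sum()}/{w_draws.shape[2]} "
          "hidden-unit weights have a 95% CR covering 0")
```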

I’m not sure Savage’s comparison between random forests and logistic regression is entirely fair. The random forest was not shown activities with missing user IDs (10% of the data), whereas the logistic regression was given the full dataset. If the random forest were one that also handles missing data, I wonder how different the prediction accuracy would be.
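For what it’s worth, here is a sketch of what a more even-handed comparison could look like, using entirely synthetic data (nothing here comes from Savage’s setup; a numeric feature with ~10% missingness stands in for the missing user IDs). Both models get the same imputed dataset plus a missingness indicator, so neither is handicapped by a smaller effective sample.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 6 features, ~10% of rows missing feature 0.
rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)
X[rng.uniform(size=1000) < 0.10, 0] = np.nan

# Identical preprocessing for both models: impute and add a
# missingness-indicator column, so both see all 1000 rows.
models = {
    "random forest": make_pipeline(
        SimpleImputer(strategy="mean", add_indicator=True),
        RandomForestClassifier(random_state=0),
    ),
    "logistic regression": make_pipeline(
        SimpleImputer(strategy="mean", add_indicator=True),
        StandardScaler(),
        LogisticRegression(max_iter=1000),
    ),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {acc.mean():.3f} +/- {acc.std():.3f}")
```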