Why are Bayesian Neural Networks multi-modal?
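One standard answer to the title question is weight-space symmetry: permuting the hidden units of a one-hidden-layer network (and, for odd activations like tanh, flipping signs) gives different weight vectors that compute the identical function, so the posterior has many symmetric modes. A minimal NumPy sketch (my own toy illustration, not from the thread; all names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, w2, b2):
    """One-hidden-layer MLP with tanh hidden units and scalar output."""
    h = np.tanh(x @ W1 + b1)
    return h @ w2 + b2

H = 4  # number of hidden units
W1 = rng.normal(size=(3, H))
b1 = rng.normal(size=H)
w2 = rng.normal(size=H)
b2 = rng.normal()

x = rng.normal(size=(5, 3))  # a few random inputs
out_orig = mlp(x, W1, b1, w2, b2)

# Permute hidden units consistently in both layers: a different point in
# weight space, but the exact same function (hence another posterior mode).
perm = rng.permutation(H)
out_perm = mlp(x, W1[:, perm], b1[perm], w2[perm], b2)

# Sign-flip symmetry of tanh: negate a hidden unit's incoming and outgoing
# weights; tanh(-z) = -tanh(z), so the output is again unchanged.
W1_flip, b1_flip, w2_flip = W1.copy(), b1.copy(), w2.copy()
W1_flip[:, 0] *= -1
b1_flip[0] *= -1
w2_flip[0] *= -1
out_flip = mlp(x, W1_flip, b1_flip, w2_flip, b2)

assert np.allclose(out_orig, out_perm)
assert np.allclose(out_orig, out_flip)
```

With H hidden units this already gives at least H! * 2^H equivalent modes, before counting any modes that correspond to genuinely different functions.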


Thanks for the reference. Andrew would love this quote from the slides:

You don’t want to think too hard about your prior. The bad effects of bad priors may be greater in high dimensions.

I’d think that’d make you want to think harder, not less hard!

I couldn’t quite follow the language of the rest of it, but it looks like they used binary problems and did some dimensionality reduction up front. Then again, maybe I misunderstood; there are about two dozen techniques mentioned in there.

I perhaps overstepped by trying to do all 10 digits of MNIST. Binary would be much easier.


It doesn’t matter how hard you think; the second sentence is true anyway :)

Unfortunately they were less open access at that time, and it’s not easy to find the more detailed description, but it exists somewhere. Maybe we could get a student to do a simple comparison, even just with the examples from the FBM manual.