WARNING !
- “hand-waving arguments”
- “speculations”
- “half awake/asleep/trance state/hot-shower induced intuitions”
- “3rd year, second semester, 2nd lecture” - grade statistical physics concepts
- “1st year, second semester, 5th lecture” - grade Classical Mechanics concepts
- all the “basic ML/CS” PhD level stuff as well, Bishop book and friends
are ahead !!!
This is DANGER ZONE :)
THIS WILL (most likely) make no real SENSE, on PURPOSE.
— YOU HAVE BEEN WARNED :) :)
Recently I came to a “deep” realisation about how “ML”/Bayesian inference is connected to Hamiltonian mechanics (https://physics.stackexchange.com/questions/89035/whats-the-point-of-hamiltonian-mechanics/477966#477966): through phase space, information theory and, most importantly, INDEPENDENCE.
I am writing this post because the above link seems to confirm my intuition that there might be some really awesome insight lurking here, which is not obvious to me but maybe obvious to some of you??? I do hope so. Hence this post. Please, enlighten me :)
Now, this Stack Exchange question really makes me obsessed with not letting go of the question: what is the optimal choice for a Hamiltonian (let’s call it H_{MCMC}) used for the “MCMC part” for a system which is described by a “real world Hamiltonian” (let’s call it H_{real-world})? (x_i=1 <== checking LaTeX compatibility)
Given H_{real-world}, how can I find the optimal H_{MCMC} that “solves a Bayesian inference problem” on data which was generated by a dynamical system (let’s denote it by S_{real-world}) whose equations of motion are defined by H_{real-world}, and some samples were taken according to the “Ergodicity principle” and / or “replacing ensemble average by time average” concepts?
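To make the H_{MCMC} side of the question concrete, here is a minimal sketch of plain Hamiltonian Monte Carlo, assuming the usual textbook choice H_{MCMC}(q, p) = U(q) + p·p/2, where U(q) is the negative log posterior and the mass matrix is the identity. The names `leapfrog`, `hmc`, `U` and `grad_U` are made up for this illustration, the toy target is a standard normal, and this is not what Stan does internally (Stan's default sampler is NUTS with an adapted metric).

```python
import numpy as np

def leapfrog(q, p, grad_U, step, n_steps):
    """Integrate Hamilton's equations for H(q, p) = U(q) + p.p / 2."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * step * grad_U(q)          # initial half step in momentum
    for _ in range(n_steps - 1):
        q += step * p                    # full step in position
        p -= step * grad_U(q)            # full step in momentum
    q += step * p                        # last full step in position
    p -= 0.5 * step * grad_U(q)          # final half step in momentum
    return q, p

def hmc(U, grad_U, q0, n_samples=2000, step=0.1, n_steps=20, seed=0):
    """Plain HMC with identity mass matrix (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    q = np.asarray(q0, dtype=float)
    samples = np.empty((n_samples, q.size))
    for i in range(n_samples):
        p = rng.standard_normal(q.size)  # fresh momentum each iteration
        H_old = U(q) + 0.5 * p @ p
        q_new, p_new = leapfrog(q, p, grad_U, step, n_steps)
        H_new = U(q_new) + 0.5 * p_new @ p_new
        # Metropolis step: corrects for the integrator's energy error
        if np.log(rng.uniform()) < H_old - H_new:
            q = q_new
        samples[i] = q
    return samples

# Toy "posterior": standard normal, so U(q) = q.q / 2 and grad U(q) = q
U = lambda q: 0.5 * q @ q
grad_U = lambda q: q
samples = hmc(U, grad_U, q0=[3.0])
print(samples.mean(), samples.var())
```

In this setup the only freedom in H_{MCMC} is the kinetic term (the mass matrix, or more generally the metric), because the potential is fixed by the posterior. So "choosing the optimal H_{MCMC}" would, under these assumptions, mean choosing the kinetic energy and the integration settings.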
But for now, let’s stick to the microcanonical (constant energy) ensemble.
I have the “feeling” that knowing the underlying Hamiltonian of the “to be modelled” REAL-WORLD system matters, since that system is ultimately dynamic in nature (hence ALL data is dynamic in nature, no matter if it was generated by a Turing machine or by “the real world”).
So my feeling is that knowing the equations of motion for the real world problem could provide some hints for the “optimal Hamiltonian / sampling / whatnot” used in the actual Stan calculation, where the data is simply datapoints in the phase space of a microcanonical ensemble with N degrees of freedom.
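As a toy picture of what "datapoints in the phase space of a microcanonical ensemble" could look like, here is a small sketch that generates data from an assumed H_{real-world}: a single 1D harmonic oscillator, H(q, p) = p²/2 + q²/2, at fixed energy E. The oscillator, the function name `simulate_oscillator` and all parameter values are my own choices for illustration. For this system the average of q² over the energy shell is E, so the time average along one trajectory can be compared against it.

```python
import numpy as np

def simulate_oscillator(E=1.0, step=0.01, n_steps=100_000):
    """Leapfrog trajectory for H(q, p) = p**2 / 2 + q**2 / 2 at energy E."""
    q, p = np.sqrt(2.0 * E), 0.0         # start on the energy shell H = E
    qs = np.empty(n_steps)
    ps = np.empty(n_steps)
    for i in range(n_steps):
        p -= 0.5 * step * q              # half step in momentum, dH/dq = q
        q += step * p                    # full step in position, dH/dp = p
        p -= 0.5 * step * q              # second half step in momentum
        qs[i], ps[i] = q, p
    return qs, ps

E = 1.0
qs, ps = simulate_oscillator(E)

# Every recorded point should sit (almost) on the constant-energy shell
energies = 0.5 * ps**2 + 0.5 * qs**2

# Time average along the trajectory, to compare with the shell average E
time_avg_q2 = np.mean(qs**2)
print(np.max(np.abs(energies - E)), time_avg_q2)
```

A single oscillator is trivially ergodic on its energy shell, so this only illustrates the "time average instead of ensemble average" idea. For N coupled degrees of freedom, ergodicity is an assumption that may well fail.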
BIMMM !!!
I don’t expect any real answers, just “gut feelings”, “speculations”, “collaborative daydreaming”. Just the typical conference discussion after a few cookies / beers in Amsterdam after the conference dinner.
Out.
J.