Temperature

[image]
is what Metropolis is doing; this is the canonical ensemble.

Leapfrog simulates the microcanonical ensemble. In that case
kT does matter, because the total energy is determined by kT. So HMC can
only work if the MC (Monte Carlo) and MD (molecular dynamics) parts have the
same average energy, which depends on the temperature.

So this makes me wonder: is there a parameter in Stan where the temperature can be set? After all, the log likelihood is “just the energy”. Or is kT simply set to 1 in Stan? Who sets kT? How is kT set?
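To make the question concrete, here is a minimal sketch of textbook HMC in which kT is an explicit knob. In the textbook formulation the potential energy is the negative log density and the momenta are drawn from a unit-variance Gaussian, which corresponds to kT = 1. The function `hmc_sample` and all its parameters are hypothetical names for this illustration and are not Stan's API; whether Stan exposes anything like `kT` is exactly the question above.

```python
import math
import random

def hmc_sample(grad_u, u, q0, n_samples, kT=1.0, step=0.2, n_leapfrog=20, seed=0):
    """Toy 1D HMC (unit mass) targeting exp(-u(q) / kT)."""
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        # Fresh momentum each iteration: variance kT for unit mass.
        p = rng.gauss(0.0, math.sqrt(kT))
        h_old = u(q) + 0.5 * p * p
        q_new, p_new = q, p
        # Leapfrog: half step in p, alternating full steps, half step in p.
        p_new -= 0.5 * step * grad_u(q_new)
        for i in range(n_leapfrog):
            q_new += step * p_new
            if i < n_leapfrog - 1:
                p_new -= step * grad_u(q_new)
        p_new -= 0.5 * step * grad_u(q_new)
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis correction on the total energy, at temperature kT.
        if rng.random() < math.exp(min(0.0, -(h_new - h_old) / kT)):
            q = q_new
        samples.append(q)
    return samples

# Harmonic "energy" u(q) = q^2 / 2, i.e. a standard normal at kT = 1.
def u(q):
    return 0.5 * q * q

def grad_u(q):
    return q

for kT in (1.0, 4.0):
    draws = hmc_sample(grad_u, u, 0.0, 5000, kT=kT)
    print(kT, sum(x * x for x in draws) / len(draws))
```

With this toy target the sample variance should come out close to kT, so raising kT widens the sampled distribution instead of leaving it unchanged.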

Why am I asking this? Setting a high temperature will cause, for example in mixture models, “chain jumping”: the HMC evolution will explore all parts of
phase space, since there will be enough kinetic energy to overcome the potential
energy barriers, which are given by the log likelihood.
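A rough sketch of the barrier-crossing argument, using a plain random-walk Metropolis sampler instead of HMC to keep it short. The double-well energy, the barrier height, and the function name `count_mode_switches` are all assumptions made up for this illustration; the two wells stand in for two mixture components.

```python
import math
import random

def count_mode_switches(kT, n_steps=20000, barrier=8.0, proposal_sd=0.5, seed=1):
    """Count sign changes of x while sampling exp(-E(x) / kT) for a double well."""
    rng = random.Random(seed)

    def energy(x):
        # Minima at x = -1 and x = +1, separated by a barrier of height `barrier`.
        return barrier * (x * x - 1.0) ** 2

    x = 1.0
    switches = 0
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, proposal_sd)
        d_e = energy(x_new) - energy(x)
        # Metropolis rule at temperature kT.
        if d_e <= 0.0 or rng.random() < math.exp(-d_e / kT):
            if x_new * x < 0.0:
                switches += 1
            x = x_new
    return switches

print("cold:", count_mode_switches(kT=1.0))
print("hot: ", count_mode_switches(kT=8.0))
```

At the higher temperature the barrier is only about one kT high, so the chain should change wells far more often than at kT = 1.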

Is this discussed somewhere?

The funny thing is that, since kT is constant, the sampled distribution is in principle independent of T. However, the Ising model, for example (which can be seen as a
super complicated mixture model), experiences a phase transition at some temperature.

Now the questions are:

  1. Which statistical models exhibit what kind of phase transitions? (My gut feeling tells me that mixture models are very prone to this.) The average value of the mixing coefficient is called the “order parameter” in phase transition terminology; below a certain temperature, the expectation of this order parameter becomes non-zero.

  2. So, if one can find an order parameter in a statistical system that becomes non-zero continuously, then that is likely to be a second-order phase transition.
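As an illustration of an order parameter that switches on continuously, here is a sketch based on the mean-field Ising self-consistency equation m = tanh(z J m / kT), where m is the magnetization and z the number of neighbours. This is a stand-in from statistical physics, not a statistical mixture model, and the function name and defaults are assumptions for the example. In this approximation the critical temperature is kT_c = z J.

```python
import math

def mean_field_magnetization(kT, coupling=1.0, n_neighbors=4, n_iter=5000):
    """Solve m = tanh(z * J * m / kT) by fixed-point iteration from m = 1."""
    m = 1.0
    for _ in range(n_iter):
        m = math.tanh(n_neighbors * coupling * m / kT)
    return m

# With z = 4 and J = 1 the mean-field critical point is kT_c = 4.
for kT in (2.0, 3.0, 3.8, 4.2, 6.0):
    print(kT, mean_field_magnetization(kT))
```

Below kT_c the iteration settles on a non-zero m that shrinks smoothly as kT approaches kT_c from below, and above kT_c it decays to zero, which is the signature of a second-order transition.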

Why is this so interesting? Phase transitions have a long history, and statistical models
can be categorized according to universality classes. Convergence close to
the phase transition becomes critically slow (called critical slowing down). Fluctuations
become macroscopic (called critical opalescence).

I really wonder whether these dead simple, basic statistical physics aspects have been
considered or used when using Stan for modelling statistical systems.

I have to admit I am no Stan expert, and maybe this has already been considered
in great detail (which I would naturally assume, since this is at the very heart of HMC); without it, HMC does not work.

So I do assume that setting the temperature in Stan is a basic and very important
functionality. However, after reading some basic intro to Stan, I have not yet come across how to set the temperature. Am I missing something?

What I am saying here is that setting the log likelihood sets the TOTAL energy.

Setting the temperature sets the total energy for the integration in the MD part (the kinetic energy distribution).
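A small sketch of what "the temperature sets the kinetic energy distribution" means numerically, assuming Maxwell-Boltzmann momenta with variance m·kT in one dimension. The function name and parameters are hypothetical; the point is only that the average kinetic energy per degree of freedom comes out near kT / 2.

```python
import math
import random

def mean_kinetic_energy(kT, mass=1.0, n_draws=50000, seed=5):
    """Average p^2 / (2 m) over momenta drawn with variance m * kT."""
    rng = random.Random(seed)
    sd = math.sqrt(mass * kT)
    total = 0.0
    for _ in range(n_draws):
        p = rng.gauss(0.0, sd)
        total += p * p / (2.0 * mass)
    return total / n_draws

for kT in (1.0, 3.0):
    print(kT, mean_kinetic_energy(kT))
```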

However, what is interesting is that the resulting sampled distribution does not depend on T, if one waits until the end of the universe and longer.

Still, looking at physical systems, one can see that the sampled distribution depends
on the temperature, because at higher temperature the warm-up takes place faster.

Here is my ego trip: nature.pdf (1.3 MB).

The importance of temperature is obvious to me because I spent 8 years doing MC and MD, separately.

Am I missing some point here? Somehow I have the feeling that temperature is key.
Where is it defined in Stan? How can I enter T as a parameter for sampling? How can I use Stan for simulated annealing? Can I? Are there papers discussing how to deal in Stan with critical slowing down around phase transitions (which leads to super long correlation times and hence ineffective sampling)?
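For reference, this is the kind of simulated annealing I have in mind, as a plain Metropolis sketch outside of Stan. The tilted double-well energy, the geometric cooling schedule, and every name here are assumptions chosen for illustration only.

```python
import math
import random

def energy(x):
    # Hypothetical tilted double well: local minimum near x = +1,
    # global minimum near x = -1.
    return (x * x - 1.0) ** 2 + 0.5 * x

def simulated_annealing(x0=1.0, t_start=2.0, t_end=0.01, n_steps=20000,
                        proposal_sd=0.3, seed=2):
    """Metropolis moves with kT lowered geometrically; returns the best state seen."""
    rng = random.Random(seed)
    cooling = (t_end / t_start) ** (1.0 / (n_steps - 1))
    x, kT = x0, t_start
    best_x, best_e = x, energy(x)
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, proposal_sd)
        d_e = energy(x_new) - energy(x)
        if d_e <= 0.0 or rng.random() < math.exp(-d_e / kT):
            x = x_new
            if energy(x) < best_e:
                best_x, best_e = x, energy(x)
        kT *= cooling
    return best_x, best_e

print(simulated_annealing())
```

The run starts in the shallower well at x = +1; the hot early phase lets it cross the barrier, and the cold late phase pins it down, so the best state found should sit in the deeper well near x = -1.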

Maybe these are topics that have already been beaten to death, but I am not sure.

Can you please enlighten me? I have seen (and simulated) many phase transitions due to temperature changes. For example here: nature.pdf.

So, here is an example: the 2D Ising model has a critical point at:
[image]

If the temperature (kinetic energy) is set close to this point, then convergence
will slow down and warm-up will take forever.
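To put numbers on this, here is a sketch that evaluates Onsager's exact critical temperature for the square-lattice Ising model in zero field, kT_c / J = 2 / ln(1 + √2), and runs a small single-spin-flip Metropolis simulation on either side of it. The lattice size, sweep counts, and function name are arbitrary choices for illustration, and an 8×8 lattice only hints at the true transition.

```python
import math
import random

# Onsager's exact result, square lattice, zero field, in units of J / k.
KT_CRITICAL = 2.0 / math.log(1.0 + math.sqrt(2.0))  # about 2.269

def mean_abs_magnetization(kT, size=8, n_sweeps=400, n_burn=100,
                           coupling=1.0, seed=3):
    """Average |m| from single-spin-flip Metropolis with periodic boundaries."""
    rng = random.Random(seed)
    spins = [[1] * size for _ in range(size)]  # start fully ordered
    total = 0.0
    for sweep in range(n_sweeps):
        for _ in range(size * size):
            i, j = rng.randrange(size), rng.randrange(size)
            neighbors = (spins[(i + 1) % size][j] + spins[(i - 1) % size][j]
                         + spins[i][(j + 1) % size] + spins[i][(j - 1) % size])
            # Energy change of flipping spin (i, j) for E = -J * sum(s_i * s_j).
            d_e = 2.0 * coupling * spins[i][j] * neighbors
            if d_e <= 0.0 or rng.random() < math.exp(-d_e / kT):
                spins[i][j] = -spins[i][j]
        if sweep >= n_burn:
            m = sum(sum(row) for row in spins) / (size * size)
            total += abs(m)
    return total / (n_sweeps - n_burn)

print("kT_c =", KT_CRITICAL)
for kT in (1.5, 5.0):
    print(kT, mean_abs_magnetization(kT))
```

Well below kT_c the lattice should stay almost fully magnetized, and well above it the magnetization should be small; runs close to kT_c are where the slow convergence described above shows up.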

I mean, you observe the spins as data and want to estimate J.

This is a specific case where temperature matters.

Is this something that is important? If not, why not?

I mean, if the random variables are not independent then it is
pretty likely that phase transitions will occur all over the place.

If one also has an external magnetic field, then in the mean-field approximation
H = \sum J \langle\sigma\rangle^2 - \mu \sum h \langle\sigma\rangle, and the probability is something like
P \sim e^{\sum J \langle\sigma\rangle^2 - \mu \sum h \langle\sigma\rangle} = e^{-\mu \sum h \langle\sigma\rangle} e^{\sum J \langle\sigma\rangle^2}, which looks like a mixture model to me.

A Gaussian-like mixture model.

Even in this case, there is a phase transition. Here our model is very simple, but I wonder: is there critical slowing down even in such simple models (mean-field approximation to the 2D Ising model)? I don’t know. I am no expert in Bayesian statistics.

If somebody has some thoughts on this story, I’d be very curious to hear about them.

It is just a hunch. Close to the phase transition, the jumping between the up and down spin states happens very slowly and the correlation times are very long, so it will
take a LOOONG time to get effective samples, no matter whether you use HMC or some other MCMC method. Critical slowing down is inherent in this system. However, when the task is to estimate J and h, then T can be (and should be) chosen not too far from the critical point but also not too close.
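A sketch of how long correlation times eat into the effective sample size, using the integrated autocorrelation time τ (roughly, N draws are worth about N / τ independent ones). An AR(1) series is used here purely as a stand-in for a slowly mixing chain near the critical point; the estimator, its truncation rule, and all names are assumptions for this illustration.

```python
import random

def integrated_autocorr_time(series, max_lag=200):
    """tau = 1 + 2 * sum of autocorrelations, cut at the first non-positive lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    var = sum(c * c for c in centered) / n
    tau = 1.0
    for lag in range(1, max_lag + 1):
        acf = sum(centered[t] * centered[t + lag] for t in range(n - lag)) / (n * var)
        if acf <= 0.0:
            break
        tau += 2.0 * acf
    return tau

def ar1_chain(rho, n=10000, seed=4):
    """x_t = rho * x_{t-1} + noise; rho near 1 mimics a slowly mixing sampler."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

for rho in (0.0, 0.95):
    chain = ar1_chain(rho)
    tau = integrated_autocorr_time(chain)
    print(rho, tau, len(chain) / tau)
```

For an AR(1) process the exact value is τ = (1 + ρ) / (1 − ρ), so ρ = 0.95 already costs a factor of about 39 in effective samples.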

Does what I am talking about here make any sense?

Link for the formulas: https://en.wikipedia.org/wiki/Ising_model