is what Metropolis is doing: this is the canonical ensemble.
Leapfrog simulates the microcanonical ensemble. In that case
kT does matter, because the total energy is determined by kT. So HMC can
only work if the MC (Monte Carlo) and MD (molecular dynamics) parts have the
same average energy, which depends on the temperature.
So this makes me wonder: is there a parameter where the temperature can be set in Stan? After all, the negative log likelihood is “just the energy”. Or is kT simply set to 1 in Stan? Who sets kT? How is kT set?
Why am I asking this? Setting a high temperature will cause, for example, “chain jumping” in mixture models. The HMC evolution will explore all parts of
phase space, since there will be enough kinetic energy to overcome the potential
energy barriers, which are given by the log likelihood.
Is this discussed somewhere?
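To make the question concrete, here is a minimal sketch (plain Python, not Stan code, and not a claim about how Stan does it) of what I mean by HMC with an explicit temperature. The assumptions are mine: a one-dimensional parameter, unit mass, and the hypothetical names `leapfrog`, `hmc` and `kT`. The momenta are drawn from a Gaussian with variance kT and the accept step uses exp(-ΔH/kT), so the chain targets exp(-U/kT); with kT = 1 this reduces to textbook HMC.

```python
import math
import random


def leapfrog(q, p, grad_u, eps, n_steps):
    """Integrate Hamiltonian dynamics (unit mass) with the leapfrog scheme."""
    p = p - 0.5 * eps * grad_u(q)          # initial half step for momentum
    for i in range(n_steps):
        q = q + eps * p                    # full step for position
        if i < n_steps - 1:
            p = p - eps * grad_u(q)        # full step for momentum
    p = p - 0.5 * eps * grad_u(q)          # final half step for momentum
    return q, p


def hmc(u, grad_u, kT=1.0, n_samples=5000, eps=0.2, n_steps=10, q0=0.0, seed=0):
    """Sample from exp(-u(q) / kT) in one dimension.

    kT is an illustrative knob (an assumption of this sketch), not a Stan option.
    """
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        # MC part: refresh the momentum from the canonical distribution at kT.
        p = rng.gauss(0.0, math.sqrt(kT))
        h_old = u(q) + 0.5 * p * p
        # MD part: constant-energy (microcanonical) trajectory.
        q_new, p_new = leapfrog(q, p, grad_u, eps, n_steps)
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integration error, at temperature kT.
        if rng.random() < math.exp(min(0.0, -(h_new - h_old) / kT)):
            q = q_new
        samples.append(q)
    return samples


if __name__ == "__main__":
    # Hypothetical double-well "energy" with a barrier of height 4 at q = 0.
    def double_well(q):
        return 4.0 * (q * q - 1.0) ** 2

    def double_well_grad(q):
        return 16.0 * q * (q * q - 1.0)

    for kT in (0.25, 4.0):
        xs = hmc(double_well, double_well_grad, kT=kT, q0=1.0, eps=0.1, n_samples=2000)
        jumps = sum(1 for a, b in zip(xs, xs[1:]) if a * b < 0)
        print("kT =", kT, "well-to-well jumps:", jumps)
```

For a quadratic energy U(q) = q²/2 the sample variance should come out close to kT, which is a quick sanity check. In the double-well demo I would expect the hot chain to hop between the wells far more often than the cold one, which is the “chain jumping” I am talking about.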
The funny thing is that, since kT is constant, the sampled distribution is in principle independent of T. However, the Ising model, for example (which can be seen as a
super complicated mixture model), experiences a phase transition at some temperature.
Now the question is: which statistical models exhibit what kind of phase transitions? (My gut feeling tells me mixture models are very prone to this.) The average value of the mixing coefficient is called the “order parameter” in phase transition terminology; below a certain temperature, the expectation of this order parameter becomes nonzero.

So, if one can find an order parameter in a statistical system that becomes nonzero continuously, then that is likely to be a second order phase transition.
Why is this so interesting? Phase transitions have a long history, and statistical models
can be categorized according to universality classes. The convergence close
to the phase transition becomes critically slow (called critical slowing down). Fluctuations
become macroscopic (as seen in critical opalescence).
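To illustrate what I mean by an order parameter, here is a small sketch of single-spin-flip Metropolis sampling for the 2D Ising model on a periodic square lattice. The assumptions are mine: coupling J = 1, k = 1, a 16x16 lattice, an all-up starting configuration, and the hypothetical function name `ising_abs_magnetization`. The order parameter is the average absolute magnetization per spin; the exact critical temperature of the infinite lattice in these units is about 2.269.

```python
import math
import random


def ising_abs_magnetization(T, L=16, burn_in=200, sweeps=200, seed=0):
    """Estimate <|m|> for the 2D Ising model (J = 1, k = 1) at temperature T."""
    rng = random.Random(seed)
    spins = [[1] * L for _ in range(L)]    # start fully ordered (all spins up)
    total = 0.0
    for sweep in range(burn_in + sweeps):
        for _ in range(L * L):             # one sweep = L*L attempted flips
            i = rng.randrange(L)
            j = rng.randrange(L)
            neighbours = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                          + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
            d_e = 2.0 * spins[i][j] * neighbours   # energy change if we flip
            if d_e <= 0.0 or rng.random() < math.exp(-d_e / T):
                spins[i][j] = -spins[i][j]
        if sweep >= burn_in:
            m = sum(sum(row) for row in spins) / (L * L)
            total += abs(m)
    return total / sweeps


if __name__ == "__main__":
    for T in (1.5, 2.0, 2.27, 3.0, 4.0):
        print("T =", T, "<|m|> =", round(ising_abs_magnetization(T), 3))
```

Well below the critical temperature the order parameter stays close to 1, well above it drops towards 0 (not exactly 0 on a finite lattice). Near the transition a short run like this one is not trustworthy, precisely because of the critical slowing down mentioned above.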
I really wonder if these dead simple, basic aspects of statistical physics have been
considered or used when modelling statistical systems with Stan.
I have to admit I am no Stan expert, and maybe this has already been considered
in great detail (which I would naturally assume, since this is at the very heart of HMC; without it HMC does not work).
So I do assume that setting the temperature in Stan is a basic and very important
functionality. However, after reading some basic intros to Stan I have not yet come across how to set the temperature. Am I missing something?
What I am saying here is that setting the log likelihood sets the TOTAL energy.
Setting the temperature sets the total energy for the integration in the MD part (the kinetic energy distribution).
However, what is interesting is that the resulting sampled distribution does not depend on T if one waits until the end of the universe and longer.
Still, looking at a physical system, one can see that the sampled distribution depends
on the temperature, because at higher temperature the warm-up takes place faster.
Here is my ego trip: nature.pdf (1.3 MB).
The importance of temperature is obvious to me because I spent 8 years doing MC and MD separately.
Am I missing some point here? Somehow I have the feeling that temperature is key.
Where is it defined in Stan? How can I enter T as a parameter for sampling? How can I use Stan for simulated annealing? Can I? Are there papers discussing how to deal in Stan with critical slowing down around phase transitions (which leads to super long correlation times => ineffective sampling)?
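For reference, this is the kind of simulated annealing I have in mind, as a minimal sketch in plain Python (not a Stan feature, and using simple random-walk Metropolis moves instead of HMC). The tilted double-well energy, the geometric cooling schedule and the names `tilted_double_well` and `simulated_annealing` are assumptions I made up for illustration.

```python
import math
import random


def tilted_double_well(x):
    """Hypothetical energy: local minimum near x = +1, global minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def simulated_annealing(u, x0, t_start=2.0, t_end=0.01, n_steps=20000,
                        step=0.5, seed=0):
    """Metropolis moves while the temperature is lowered geometrically."""
    rng = random.Random(seed)
    cooling = (t_end / t_start) ** (1.0 / (n_steps - 1))
    x, t = x0, t_start
    best_x, best_u = x, u(x)
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)       # symmetric proposal
        d_u = u(x_new) - u(x)
        # Accept downhill moves always, uphill moves with probability exp(-dU/T).
        if d_u <= 0.0 or rng.random() < math.exp(-d_u / t):
            x = x_new
            if u(x) < best_u:
                best_x, best_u = x, u(x)           # remember the lowest energy seen
        t *= cooling                               # cool down a little each step
    return best_x, best_u


if __name__ == "__main__":
    # Start in the wrong (local) minimum on purpose.
    print(simulated_annealing(tilted_double_well, x0=1.0))
```

Started in the shallower well at x = 1, the hot phase lets the chain cross the barrier, and the slow cooling then settles it into the deeper well on the negative side. My question is whether something like the temperature schedule here can be expressed in Stan at all.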
Maybe these are topics that have already been beaten to death, but I am not sure.
Can you please enlighten me? I have seen (and simulated) many phase transitions due to temperature changes. For example here: nature.pdf.