Conditional Random Fields in Stan?

Is it a “good idea” to try to “fit” CRMs with Stan?

I did a Google search and did not find much information.

That gives me the feeling that it is not such a great idea, which makes me wonder why.

In the title you say “CRF” and in the body you say “CRM”, but you haven’t defined either, so it is difficult for anyone to provide help.

:( Yes, I agree, terribly sorry. It was a typo and more. CRF = conditional random field.

My first guess is that it should be “doable” with Stan (as it is “just” a simple undirected graphical model, without much hierarchy). My physics intuition tells me that it is very vaguely similar to an Ising model where X is the external magnetic field, and MCMC was pretty much invented for simulating this kind of interacting particle/spin system.
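To make the analogy a bit more concrete, here is a side-by-side sketch of a linear-chain CRF and an Ising model in a site-dependent field. The symbols (\theta_k, f_k, J, h_i, \beta) are my own notation, not something defined in this thread, and the CRF is restricted to the linear-chain case for simplicity.

```latex
% Linear-chain CRF: labels y_1..y_T conditioned on the observed input x
p(y \mid x) = \frac{1}{Z(x)}
  \exp\Big( \sum_{t=1}^{T} \sum_{k} \theta_k \, f_k(y_{t-1}, y_t, x, t) \Big)

% Ising model: spins s_i = \pm 1 in an external field h_i, with \beta = 1/(k_b T)
p(s \mid h) = \frac{1}{Z(h)}
  \exp\Big( \beta J \sum_{\langle i,j \rangle} s_i s_j + \beta \sum_{i} h_i s_i \Big)
```

In both cases the exponent is a sum of local pairwise terms plus terms that couple each variable to something held fixed (x for the CRF, h for the Ising model), and the normalizer Z depends on that fixed quantity.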

Just took the liberty to edit the title for clarity. Hope that’s OK.


I’ve fitted a bunch of things recently with sequential dependencies, and my small-sample experience is that it’s quite easy to end up with a joint probability that is discontinuous and/or not very smooth. This often results in multiple chains failing to converge.

I’m planning to fit a continuous version of the Potts model soon for some not-so-small data, so I’ll report how that goes if it’s of interest.


Yes. CRFs are to logistic regression as HMMs are to naive Bayes. You should be able to fit them in Stan using the forward algorithm, in the same way as HMMs. It won’t be as efficient as a custom solution that picks a point estimate, but the results should be better.
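For anyone who wants to see what the forward algorithm computes here, below is a minimal Python sketch for a linear-chain CRF. It is only an illustration, not Stan code: the function names and the `unary`/`pairwise` score tables are hypothetical, and the sketch assumes the scores have already been computed from the inputs and parameters. In a Stan program the same recursion would be written with `log_sum_exp` and the result added to `target`.

```python
import math
from itertools import product


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def crf_log_partition(unary, pairwise):
    """Forward algorithm: log Z(x) for a linear-chain CRF.

    unary[t][k]    : score of label k at position t (assumed precomputed from x)
    pairwise[j][k] : score of moving from label j to label k
    """
    num_labels = len(pairwise)
    alpha = list(unary[0])  # log forward messages at the first position
    for t in range(1, len(unary)):
        alpha = [
            unary[t][k]
            + logsumexp([alpha[j] + pairwise[j][k] for j in range(num_labels)])
            for k in range(num_labels)
        ]
    return logsumexp(alpha)


def crf_log_likelihood(labels, unary, pairwise):
    """log p(labels | x) = score of the label path minus log Z(x)."""
    score = unary[0][labels[0]]
    for t in range(1, len(labels)):
        score += pairwise[labels[t - 1]][labels[t]] + unary[t][labels[t]]
    return score - crf_log_partition(unary, pairwise)


def total_probability(unary, pairwise):
    """Brute-force sanity check: probabilities of all label paths sum to 1."""
    num_labels = len(pairwise)
    paths = product(range(num_labels), repeat=len(unary))
    return sum(math.exp(crf_log_likelihood(list(p), unary, pairwise)) for p in paths)
```

The forward pass costs O(T K^2) for T positions and K labels, whereas the brute-force check enumerates all K^T paths and is only usable on toy inputs.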


Thank you Bob, I spread the word.



Do you mean that different chains converge to different results?

In the 2-D Ising model this would correspond to the ferromagnetic phase at low temperatures (that is, when the interaction strength between spins is high relative to the temperature). AFAIK k_b T = 1 in Stan’s version of HMC (someone correct me if I am wrong). So if the spin-spin coupling constant is (much) stronger than 1, the Ising model will be ferromagnetic, which breaks ergodicity and results in chains not converging to the same solution: one chain explores one part of the phase space, another chain explores the other part.

This may be obvious to you, but just in case, I thought I would mention it. It was not immediately obvious to me when I first looked at Stan.

If you are planning to fit the Potts model, then this angle might be worth thinking about a bit.

A phase transition can take place in the Potts model too, so different chains may converge to different results (explore different parts of the phase space), depending on the strength of the spin-spin coupling constant relative to k_b T, which is 1 in Stan, if I understand correctly.
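A small Python sketch may help illustrate the point. It enumerates every state of a tiny 3x3 Ising lattice with periodic boundaries and no external field, taking k_b T = 1, and returns the exact distribution of the total magnetization. The lattice size, the function name, and the coupling values are my own choices for illustration, and a lattice this small only hints at the behaviour of a real phase transition.

```python
import math
from itertools import product


def magnetization_distribution(coupling, side=3):
    """Exact P(M) for a side x side periodic Ising lattice with k_b T = 1, h = 0.

    Each state has weight exp(coupling * sum over bonds of s_i * s_j).
    Brute-force enumeration, so only feasible for very small lattices.
    """
    sites = [(r, c) for r in range(side) for c in range(side)]
    index = {site: i for i, site in enumerate(sites)}
    # Each site contributes its right and down bond, so every bond is counted once.
    bonds = []
    for r, c in sites:
        bonds.append((index[(r, c)], index[(r, (c + 1) % side)]))
        bonds.append((index[(r, c)], index[((r + 1) % side, c)]))

    weights = {}
    for spins in product((-1, 1), repeat=len(sites)):
        log_weight = coupling * sum(spins[i] * spins[j] for i, j in bonds)
        m = sum(spins)  # total magnetization of this state
        weights[m] = weights.get(m, 0.0) + math.exp(log_weight)

    z = sum(weights.values())
    return {m: w / z for m, w in sorted(weights.items())}


if __name__ == "__main__":
    for coupling in (0.0, 0.3, 1.0):
        dist = magnetization_distribution(coupling)
        aligned = dist[9] + dist[-9]
        mixed = dist[1] + dist[-1]
        print(f"J={coupling}: P(|M|=9)={aligned:.4f}, P(|M|=1)={mixed:.6f}")
```

At coupling 1.0 almost all of the probability sits on the two fully aligned states, split equally between them, with very little mass on the near-zero magnetization states in between. A local sampler started near one of the two modes would rarely cross to the other, which is the "different chains, different results" picture described above.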
