This is maybe a bit tangential (oh boy, that pun wasn’t intended), but I’m looking for some sort of article that discusses the tangent half-angle link function (tan-half) used in brms as the default link for von Mises distributions. In all my searching online I have found exactly one bit of documentation (from the R package circtree) that explains that the link is used to restrict values to the (-\pi, +\pi) range. Are there any other articles (or brms documentation) further discussing this? I’d like to understand how/why this is implemented, why it’s the default link, etc.
Hmm interesting catch. Any chance you could elaborate? Is your idea that this is maybe part of a compute-efficient reparameterization of the von Mises distribution?
My bad, I see now that’s unrelated. The idea behind the tan-half link is that, for circular data, the location parameter of the von Mises wraps around, such that a value of -pi implies the same model as a value of +pi (ditto any value shifted by an integer multiple of 2pi). This can cause sampling issues if the location is left unconstrained, because you can always get an equivalent likelihood by stepping 2pi in either direction. So if there’s a mode at, say, 0.5, there will also be modes at 0.5+2pi, 0.5+4pi, …, 0.5-2pi, 0.5-4pi, … The inverse of the tan-half link, 2*atan(x), has the nice property that it transforms an unbounded input to an output bounded between -pi and +pi, so the sampler can explore an unbounded parameter whilst yielding a likelihood that doesn’t venture into the repeating territory.
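To make the mapping concrete, here is a minimal Python sketch (not brms code; the function names are made up for illustration). It assumes the link is eta = tan(mu/2) with inverse mu = 2*atan(eta), and uses only the kappa*cos(y - mu) kernel of the von Mises log density, dropping the normalizing constant:

```python
import math

def tan_half(mu):
    """Link: map an angle in (-pi, pi) to the unbounded scale."""
    return math.tan(mu / 2.0)

def inv_tan_half(eta):
    """Inverse link: map any real number to an angle in (-pi, pi)."""
    return 2.0 * math.atan(eta)

def von_mises_kernel(y, mu, kappa):
    """Unnormalized von Mises log density (illustrative helper)."""
    return kappa * math.cos(y - mu)

# The kernel is identical for mu and mu + 2*pi, which is what creates
# the repeating modes when the location is left unconstrained.
y, mu, kappa = 1.0, 0.5, 2.0
same = math.isclose(
    von_mises_kernel(y, mu, kappa),
    von_mises_kernel(y, mu + 2.0 * math.pi, kappa),
    abs_tol=1e-9,
)
print("kernel unchanged by a 2*pi shift:", same)

# The inverse link keeps the location inside (-pi, pi) for any input.
for eta in (-1e6, -1.0, 0.0, 1.0, 1e6):
    print(eta, inv_tan_half(eta))
```

Whatever value the sampler proposes on the unbounded scale, the location that reaches the likelihood stays on a single copy of the circle.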
Note that an alternative would be to declare bounds on the location parameter, in which case the sampler would create an unbounded variable behind the scenes and apply a transform to yield a parameter that meets the bounds. I imagine, however, that the “scaled and translated log-odds” transform is pretty similar in input-output mapping to the inverse tan-half (2*atan), while the latter has the nice property of already imposing the bounds of -pi/+pi we need.
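For a rough feel of how similar the two mappings are, here is a small Python sketch. It assumes the bounded-parameter transform has the form lower + (upper - lower) * inv_logit(x), which is how Stan documents its lower-and-upper-bounded transform; the function names are illustrative only:

```python
import math

def inv_logit(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def scaled_inv_logit(x, lower=-math.pi, upper=math.pi):
    """Assumed form of the bounded-parameter transform."""
    return lower + (upper - lower) * inv_logit(x)

def inv_tan_half(x):
    """Inverse of the tan-half link."""
    return 2.0 * math.atan(x)

# Both maps send 0 to 0, are increasing, and stay inside (-pi, pi);
# they differ in how quickly they approach the bounds.
for x in (-10.0, -2.0, -0.5, 0.0, 0.5, 2.0, 10.0):
    print(f"{x:6.1f}  logit-based: {scaled_inv_logit(x): .4f}  "
          f"2*atan: {inv_tan_half(x): .4f}")
```

The logistic version approaches the bounds exponentially fast, while 2*atan approaches them only polynomially, so the two are similar in shape but not interchangeable in the tails.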
I am familiar with that property of the von Mises distribution. If I understand your point, then, the problem arises from brms/Stan working on an unbounded transformed scale of the DV/outcome variable? And that’s where you can get modes at every multiple of 2pi, as it would just keep wrapping around and around the circle? So the tan-half link bounds the output so it doesn’t run into this issue? I would guess that’s why I ran into issues when I tried to set the link to identity…
Just to mention for anybody who encounters this in the future (I know you know this, @mike-lawrence): this alternative doesn’t generally work in any self-consistent way if the location parameter is being modeled as a function of covariates. The constraining transform needs to be applied to the linear predictor itself.
If I remember correctly, a disadvantage of the tan-half link is that it can transform a unimodal distribution on the constrained circular space into a (potentially strongly) bimodal distribution in the unconstrained space, if there is relevant posterior mass near the “cutting point” of the link function in constrained space. Even if there is negligible mass there, I imagine the sampler could get stuck on the wrong side of that edge if there’s a strong gradient there.
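A small Python sketch of that effect, under the assumption that the cutting point sits at ±pi (where tan(mu/2) blows up). The angles below are hand-picked for illustration, not draws from any fitted model:

```python
import math

def tan_half(mu):
    """Link: map an angle in (-pi, pi) to the unbounded scale."""
    return math.tan(mu / 2.0)

def wrap_to_pi(angle):
    """Wrap any angle onto [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

# Angles tightly clustered around pi: a single mode on the circle.
offsets = [-0.2, -0.1, -0.05, 0.05, 0.1, 0.2]
angles = [wrap_to_pi(math.pi + d) for d in offsets]

# On the unconstrained scale the same cluster splits into two groups,
# one far out on the positive side and one far out on the negative side.
etas = [tan_half(a) for a in angles]
for a, e in zip(angles, etas):
    print(f"angle {a: .3f}  ->  eta {e: .2f}")
```

Points that are neighbours on the circle end up at opposite ends of the real line, which is the bimodality described above.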
Another way to parameterise a circular distribution is to use vectors of higher dimension projected down to the desired dimensionality. This avoids both the bimodality and the unboundedness issue. See e.g. this thread for some discussion about some important technicalities behind that:
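As a rough illustration of the projection idea only (it does not cover the technicalities mentioned above), here is a Python sketch that assumes the angle is recovered from an unconstrained 2-vector via atan2. The function names are hypothetical:

```python
import math

def angle_from_vector(x, y):
    """Project an unconstrained 2-vector (not the origin) onto the circle."""
    return math.atan2(y, x)

def circular_distance(a, b):
    """Shortest distance between two angles along the circle."""
    diff = (a - b + math.pi) % (2.0 * math.pi) - math.pi
    return abs(diff)

# Two nearby vectors on either side of the negative x-axis.
v1 = (-1.0, 0.05)
v2 = (-1.0, -0.05)

a1 = angle_from_vector(*v1)  # just below +pi
a2 = angle_from_vector(*v2)  # just above -pi

# The vectors are close in the unconstrained space and the angles are
# close on the circle, so there is no cut that tears a mode apart.
print("vector distance:  ", math.dist(v1, v2))
print("circular distance:", circular_distance(a1, a2))
```

Here the angles near +pi and -pi come from neighbouring points in the unconstrained space, unlike under the tan-half link. The vector's length is an extra, non-identified degree of freedom in this sketch and would need handling in a real model.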
I didn’t continue it so far, as I got too busy with other things. I also didn’t hear of anyone else continuing with it.
I could imagine I might pick it up again in autumn/winter.