I am currently working on a process model. My current problem (one of several) is that I would like to replace the current (reasonably well-behaved) Gaussian innovation process with a Student-t process, but I am having problems getting this to work.

My issue is scaling. I can generate a well-behaved standard stream of draws from student(nu) easily (using the two-step gamma(nu/2, nu/2) / normal scale-mixture method), but I get very bad behaviour when I introduce a scaling factor (I have tried the obvious tricks, like normalising the variance of the process, but that just seems to make things worse).
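For concreteness, here is a minimal sketch of the two-step construction described above, in NumPy (my own illustration, not the model in question). The `student_t_draws` helper is a hypothetical name; the key point in the comments is that a scaled standard t draw has variance sigma^2 * nu / (nu - 2), not sigma^2, which is one place a naive scaling/normalisation step can go wrong:

```python
import numpy as np

# Two-step scale-mixture construction of a standard Student-t draw:
# tau ~ Gamma(nu/2, rate = nu/2), then x | tau ~ Normal(0, 1/sqrt(tau)).
def student_t_draws(nu, size, rng):
    # NumPy's gamma takes a *scale* parameter, so scale = 1/rate = 2/nu.
    tau = rng.gamma(shape=nu / 2, scale=2 / nu, size=size)
    return rng.normal(size=size) / np.sqrt(tau)

rng = np.random.default_rng(0)
nu = 6.0
x = student_t_draws(nu, 100_000, rng)

# For nu > 2 the variance of a *standard* t is nu / (nu - 2), so scaling a
# standard draw by sigma gives variance sigma^2 * nu / (nu - 2), not sigma^2.
print(x.var(), nu / (nu - 2))  # sample variance should be close to 1.5
```

The other standard caveat: for nu <= 2 the variance does not exist at all, so any variance-normalisation trick degenerates as nu approaches 2 from above.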

I am sure I am doing this wrong, but it is not clear to me what the right way would be. I am also confident this is a known problem, so I was wondering if anyone can point me in the direction of a known solution?

That is certainly pertinent (!), though it is not the current problem, which seems to be an interaction between the scaling factor and the degrees of freedom. Unsurprisingly, there is - at least in some versions - a Cauchy distribution connected to the degrees of freedom, but the problem seems to be more direct than that.

Currently looking at Fonseca et al. (arXiv: 1910.01398v1) to see if that might be useful.

Specifically, here’s a repo that implements some (non-Stan!) special cases that are multi-modal in various ways. Similar-ish special case to what you’re talking about:

with t innovations and normal observation error you get 1-3 peaks that can be widely separated, and that's conditional on the previous and subsequent observation. With less information you sometimes get more modes.
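A toy numerical sketch of the kind of multimodality being described (my own illustrative numbers, not the repo's model): take a heavy-tailed innovation density centred at the previous state (here a Cauchy, i.e. t with nu = 1) and a normal observation far away. The conditional density of the latent state can then keep one mode near the previous state and one near the observation:

```python
import numpy as np
from scipy.stats import t as student_t, norm

# Hypothetical settings chosen to exhibit bimodality: previous state at 0,
# Cauchy (t, nu = 1) innovation with unit scale, normal observation at 20
# with sd ~ 4.7. The heavy innovation tail lets the density keep a mode
# near 0 while the observation supports a second mode near 20.
nu, obs, obs_sd = 1.0, 20.0, 4.7
grid = np.linspace(-10, 30, 4001)
log_dens = student_t.logpdf(grid, df=nu) + norm.logpdf(obs, loc=grid, scale=obs_sd)
dens = np.exp(log_dens - log_dens.max())

# Count interior local maxima on the grid.
is_peak = (dens[1:-1] > dens[:-2]) & (dens[1:-1] > dens[2:])
print(int(is_peak.sum()))  # prints 2 for these settings
```

Shrink the observation sd (so the observation dominates) or lighten the tails (raise nu) and the second mode disappears, which matches the "with less information you sometimes get more modes" observation above.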

Thanks - this looks interesting and possibly relevant. I am coming to the conclusion that multimodality is a (the?) problem. I can't copy across the model - it is behind a pretty robust security wall - but essentially it uses an underlying process with drift which generates draws from a negative binomial process.

It works well enough (though not perfectly) with Gaussian innovations, but it really complains with Student-t innovations.