I’ve been struggling with modelling simple periodic functions, discovering that the likelihood topography is massively multimodal in very unintuitive ways. Recently I had the idea to reframe things with a hidden Markov model applied to the slope (derivative) of the function. I’ve had some apparent success with the model below, but I’m having trouble working out how to take the successful model of the derivative and integrate it to get back to the original function.
In fact, ultimately I’ll be looking to do a whole lot more modelling of the observations, including things like the observations being an additive combination of a known number of periodic functions with well-separated frequencies, a covariate that additionally influences the observations, etc. So I’m thinking I may have to abandon hmm_marginal anyway, and maybe that’s also what’s needed here to do inference on the integrated function?
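For reference, the built-in I’m referring to just needs the per-state emission log-densities, the transition matrix, and the initial-state distribution; the names below are only illustrative:

// log_omegas[k, t]: log density of observation t under hidden state k
// Gamma[i, j]:      P(state_t = j | state_{t-1} = i), each row a simplex
// rho:              distribution of the initial hidden state
target += hmm_marginal(log_omegas, Gamma, rho);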
Here’s an R script to play with. It’ll produce data that looks like this (dots are data, line is the true function):
Instead of using a normal, I think it is better to make a categorical variable from y. I’ve made a noisy measure of the “hidden state” by taking the differences of y and keeping only their sign.
functions {
  // returns 1 for a non-negative argument and 0 otherwise
  // (an up/down indicator rather than a true sign function)
  int sign(real x) {
    if (x < 0.0)
      return 0;
    else
      return 1;
  }
}
data {
  int n;
  vector[n] y;
}
transformed data {
  vector[n - 1] y_diff;
  for (i_n in 2:n)
    y_diff[i_n - 1] = sign(y[i_n] - y[i_n - 1]); // 1 if y increased at step i_n, 0 otherwise
}
In the parameters block I just condensed pos_slope and neg_slope into one parameter, alpha:
...
real alpha[2] ; // slope state
...
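The parameters block also needs the transition probabilities used below; I’d guess something like this (the simplex declarations are just my assumption):

simplex[2] tp_pn; // becomes row 1 of tps after the transpose below
simplex[2] tp_np; // becomes row 2 of tps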
The rest is taken from this blog where there’s a transformation of the transition probabilities using a 0.5 threshold. These are marginalized over the y_diff states.
transformed parameters {
  matrix[2, 2] tps = transpose(append_col(tp_pn, tp_np));
  matrix[2, n - 1] state_ll;
  real<lower=0, upper=1> theta[2]; // emission probabilities
  theta[1] = 0.5 + (1 - 0.5) * inv_logit(alpha[1]); // constrained above 50%
  theta[2] = 0.5 * inv_logit(alpha[2]);              // constrained below 50%
  // compute the log likelihood of each y_diff observation in each possible state
  for (i_n in 2:n) {
    for (s in 1:2)
      state_ll[s, i_n - 1] = log(y_diff[i_n - 1] * theta[s]
                                 + (1 - y_diff[i_n - 1]) * (1 - theta[s]));
  }
}
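The model block doing the actual marginalization isn’t shown above; one minimal way to tie the pieces together is Stan’s built-in hmm_marginal (the prior on alpha and the flat initial-state distribution are placeholders, not necessarily what the blog uses):

model {
  alpha ~ normal(0, 2); // placeholder prior
  // marginalize the two hidden slope states over the y_diff sequence;
  // rep_vector(0.5, 2) is an assumed flat initial-state distribution
  target += hmm_marginal(state_ll, tps, rep_vector(0.5, 2));
}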
Ha, I tried precisely the same thing last night. Or at least with y~normal(f,noise) then dy~hmm(..., sqrt(2*pow(noise,2))), but that ended up functionally making f mostly just track y rather than being a smooth latent function.
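Spelled out, what I tried was roughly the shape below; the per-state mean slopes mu, the priors, and the flat initial-state distribution are placeholders rather than exactly what I ran:

data {
  int<lower=2> n;
  vector[n] y;
}
parameters {
  vector[n] f;         // latent smooth function
  real<lower=0> noise; // observation noise sd
  ordered[2] mu;       // per-state mean slope of the differences (placeholder)
  simplex[2] Gamma[2]; // per-row transition probabilities
}
model {
  matrix[2, n - 1] log_omega;
  matrix[2, 2] Gamma_mat;
  noise ~ normal(0, 1);
  mu ~ normal(0, 1);
  // with only this likelihood tying f to the data, f mostly just tracks y
  y ~ normal(f, noise);
  for (s in 1:2)
    Gamma_mat[s] = Gamma[s]';
  // emission log-densities for the first differences of y; the sd comes from
  // differencing two independent noise terms: sqrt(2 * noise^2)
  for (t in 1:(n - 1))
    for (s in 1:2)
      log_omega[s, t] = normal_lpdf(y[t + 1] - y[t] | mu[s], sqrt(2) * noise);
  target += hmm_marginal(log_omega, Gamma_mat, rep_vector(0.5, 2)); // flat initial distribution
}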
Hm, I think with your formulation, inference on f is indistinguishable from not having any of the HMM structure in there at all. Indeed, commenting out all that structure and re-running yields pretty much identical posteriors on f. But your trick of getting the sign of the difference and doing the HMM on that is inspiring. Unfortunately I’ve consumed my work-allotted time for chasing this idea, but I’ll hopefully make time to return to it soon.
Yeah, I ran it both ways and noticed the output was the same. I know you were just making a contrived example. Does the HMM with what I’ve done sample better on your actual problem, where the version without the HMM doesn’t sample well?
Unfortunately my real data is much noisier, hence the desire to be working with the latent variable f in the HMM part.
I actually came up with an alternative idea that I’ll post as a separate thread in a bit and link back here in case you’re interested. Still only 10% worked out, but it “feels” like a less kludgey reformulation of inference for periodic latents than this one.