Divergences false alarm in model using unit_vector?


#1

I worked out an example usage case for the unit_vector variable type, but while it seems to sample and recover simulated parameters well, I tend to get divergence warnings. The scenario being modeled is one where you have noisy samples from a sine wave with a known rate but unknown phase, amplitude & noise:

data{
  int n;
  vector[n] y;
  vector[n] x;
  real<lower=0> rate ;
}
parameters{
  unit_vector[2] phase_helper;
  real<lower=0> amplitude;
  real<lower=0> noise;
}
transformed parameters{
  real phase = atan2(phase_helper[1],phase_helper[2]) ;
}
model{
  amplitude ~ weibull(2,2) ; 
  noise ~ normal(0,1) ; 
  y ~ normal( amplitude * sin( x*rate*2*pi()+phase ) , noise ) ;
}

And here’s R code to generate and sample the model:

library(rstan)
rstan_options(auto_write = TRUE)

n = 1e3
x = seq(0,10,length.out = n)
rate = .5
noise = .1
phase = pi/2
amplitude = 1

f = amplitude*sin(x*rate*2*pi+phase)
y = f + rnorm(n,0,noise)
plot(x,y,type='l')
lines(x,f,col='green',lwd=2)

post = rstan::stan(
  file = 'periodic.stan'
  , data = list(
    n = length(y)
    , x = x
    , y = y
    , rate = rate
  )
  , chains = 4
  , cores = 4 #set to the number of physical cores on your system
  , iter = 1e3
)

rstan::traceplot(post)

Examination of the pairs plot (below) reveals that the dependencies are between the phase-related variables, which I think is expected, so possibly the calculation of the divergence warning needs to be tweaked to avoid this false alarm?


Circular examples?
#2

I don’t think it’s ever safe to ignore the divergences.

I’ve had bad luck with the unit_vectors before as well (in my case it was treedepth problems though – not divergences). You might get more information about what’s happening if you look at the unconstrained variables though (might be totally uninformative too haha). Section 19.3 in the 2.17.0 manual shows how to do it manually.

I think it’s fair to say there’s room for improvement in how rotations are handled in Stan.


#3

Those banana shaped dependencies are problematic. There’s no curvature or step size that makes sense there, so everything has to be very small step size to survive.

It seems to be diverging all over the place. Are you sure the atan2 or mean for y aren’t having numerical problems?


#4

There was another thread about atan2 having discontinuous gradient, which might also explain divergences.


#5

I just was surprised to see divergences given that I thought this was the intended use of the unit_vector parameter type.


#6

Since phase_helper[1]^2 + phase_helper[2]^2 = 1 they’ll always trace out an arc of a circle in a pairs plot. Does that tell you anything about the posterior?

This works – make phase a parameter and don’t bound it. It must have a proper prior though, preferably one informative enough to keep the sampler from hopping modes.

data{
  int n;
  vector[n] y;
  vector[n] x;
  real<lower=0> rate ;
}
parameters{
  real phase;
  real<lower=0> amplitude;
  real<lower=0> noise;
}
model{
  phase ~ normal(pi()/2, 0.1);
  amplitude ~ weibull(2,2) ; 
  noise ~ normal(0,1) ; 
  y ~ normal( amplitude * sin( x*rate*2*pi()+phase ) , noise ) ;
}

This converges with no pathologies at all:

It even works if the prior is way off, with some bias in the posterior (input phase was pi):


#7

Unfortunately, for the scenario I imagined applying this (pulse-oximetry amidst high noise) I’m rarely going to be in a circumstance where I have any prior information on phase, since that depends on precisely when the recordings are started.


#8

It is. Parameters near each other in R^2 should be near each other on the circle (but not vice-versa).

We just haven’t experimented with it a lot.