Help with a change point model

I have a 2D matrix y of dimension M\times N. Given a point in the matrix a,b and a maximum radius r_max , I’d like to make a simple change point model that will divide the area within the maximum radius amongst two probability distributions as shown below by changing the radius.

For some reason NUTS, goes a bit nuts. The majority transitions end up exceeding tree depth and it takes ages to train even with r_max = 20.

Could anyone offer any help regarding how I ought put this kind of model together?

data {
  int M;
  int N;
  int a;
  int b;
  int r_max;
  real y[M,N];
}
parameters {
  real<lower=0,upper=1> theta;
}
model {
  for(i in (a-r_max):(a+r_max)) {
    for(j in (b-r_max):(b+r_max)) {
      if(sqrt((a-i)^2+(b-j)^2)>(r_max*theta)) {
        target+=normal_lpdf(y[i,j]|0,1);
      }
      else {
        target+=log_mix(0.8,normal_lpdf(y[i,j]|7,.1),normal_lpdf(y[i,j]|0,1));
      }
    }
  }
}

Here is some fake data with blob at a=50,b=50

require(MASS)
set.seed(12)
gridsize=100
x=matrix(rnorm(gridsize^2),nrow=gridsize,ncol=gridsize)
x[x<0]=0
sigma1=matrix(c(2.1,-1,-1,1),nrow=2,byrow=T)
sigma2=matrix(c(2.1,1,1,1),nrow=2,byrow=T)
n=50
x[round(mvrnorm(n,c(50,50),sigma2),0)]=7

It looks like this (center pixel is at 50,50):

image

I suspect that if branching on theta makes the model discontinuous. A smoother transition between mixture and single distribution may help.
I’m not super good with R but it looks like your fake data generation draws 50 points from a normal distribution around the center and sets those to 7. That implies the mixture probability is not constant 0.8 inside the “feature radius” but rather drops off gradually.
Maybe something like this would work better.

for(i in (a-r_max):(a+r_max)) {
  for(j in (b-r_max):(b+r_max)) {
    target += log_mix(0.8*exp(-((a-i)^2+(b-j)^2)/(r_max*theta)),
                normal_lpdf(y[i,j]|7,.1),
                normal_lpdf(y[i,j]|0,1));
  }
}
1 Like

Thank you for your reply… You are of course right. The operative observation was that the if branch causes discontinuity and that if it were smoother the problem wouldn’t exist, which made a swath of other things click for me.

For example, @idontgetoutmuch proposes a continuous time change point model here. In the model, i is incremented and \tau is a change point, he engineers a smooth if statement using logit^{-1}(2(i-\tau)). The logit function is small when i-\tau<0 and then suddenly large when i-\tau>0, the scale of the transition is determined by the constant (2 in this case).

With that tweak added the following version is accurate and near instant. Thank you for your help @nhuurre!

data {
  int M;
  int N;
  int a;
  int b;
  int r_max;
  real y[M,N];
}
parameters {
  real<lower=0,upper=1> theta;
}
model {
  for(i in (a-r_max):(a+r_max)) {
    for(j in (b-r_max):(b+r_max)) {
      real d = sqrt((a-i)^2+(b-j)^2);
      if(d>r_max) continue;
      target+=log_mix(inv_logit(20*(d-r_max*theta)),
                      normal_lpdf(y[i,j]|0,1),
                      log_mix(0.8,normal_lpdf(y[i,j]|7,.1),normal_lpdf(y[i,j]|0,1)));
    }
  }
}
1 Like