Help with Multinomial Changepoint Model

@weirddatascience: wondering if your goal is similar to the mcp package by @lindeloev?

My use case is for change-point detection in oil and gas production.

1 Like

mcp does not currently handle categorical predictors and multinomial outcomes, but it will at some point! It does support Poisson and Binomial models, if that’s OK for your case. See examples on the mcp website under the “GLM” menu.

mcp also supports Dirichlet priors, but I found that they sample so ineffectively that I came up with a “t-tail” prior that mimics the Dirichlet with a much better sampling efficiency. Read more here. That’s for JAGS, though, and stan may be more effective.

1 Like

Thanks for clarifying 👍

In a current task I noticed that the number of change points one looks for may greatly effect the location of the change points. I.e. picking 1 change point and picking 2 change points may have no overlapping change points.

I think this has implications for your problem; because besides finding a change point, I guess you should maybe also justify why there’s only one… As with mixtures I think it’ll likely be the case that the more change points you pick the better the model will fit; so some kind of regularisation will be necessary.

I had expected that running multiple changepoints could result in totally different locations to a single changepoint. I’d love to see your code for running the two changepoint case if you did it in Stan, though!

In this case I did it in JAGS because it happily handles integer parameters and samples from integer valued distributions so there’s no need to marginalise anything out. Handling the generic case with K change points is also really easy which
helps with model testing.

I’ve never worked with JAGS directly, but I’d love to see a link to a reference example (or code!)

Sure, here you go:

model {
  for (t in 1:T) {
    Y[t]~dpois(lam[t])
    for(i in 2:(K+2)) {
       lp[t,i-1]=ifelse(t>=cp[i-1] && t<cp[i],
                       alp[i-1] + bet[i-1]*t,
		       0)
    }
    log(lam[t])=sum(lp[t,])
  }

  for (i in 1:(K+1)) {
    alp[i]~dnorm(0,1.0E-06)
    bet[i]~dnorm(0,1.0E-06)
  }
  cp[1]=1
  cp[K+2]=T+1
  for (i in 2:(K+1)) {
    cp[i]~dunif(cp[i-1],T)
  }
}

T is the number of points in a time series, K is the number of cut points. In this instance you’re doing a Poisson regression but you can do anything you want using the same structure.

Note JAGS doesn’t work the same way as Stan. What you see above is not an imperative computer programme; it’s actually a specification for a static Bayesian graph used for sampling. Variables don’t need to be declared, you can write stuff in the reverse order that it occurs in the code, the for loops are not really for loops, etc so careful how you interpret it.

It would be delightful if someone could make a Stan equivalent!

PS: mcp package actually uses JAGS under the hood. It’s a way to use mcp to create change point models, but I couldn’t because I had missing values too.

1 Like