Vectorising multiple change point model

mchonofsky · July 29, 2019, 10:59am

Hi! I am trying to implement a hierarchical change-point model in stan. At the heart of the stan implementation is the linear-time example given in the Stan reference (https://mc-stan.org/docs/2_18/stan-users-guide/change-point-section.html)

I have a collection of N time-series of length T which are divided into two classes. I have a list of ints called classes which allows me to label each of the N time-series as an element of one of the two classes. The two classes have different breakpoints but they have the same early and late rates. My transformed parameters block:

matrix[T, k] lp;                                                                                
matrix[T + 1, k] lp_e;                                                                          
matrix[T + 1, k] lp_l;                                                                          
matrix[T + 1, k] prior_s;                                                                       
                                                                                                
for (i in 1:k)                                                                                  
{                                                                                               
    lp_e[1, i] = 0;                                                                             
    lp_l[1, i] = 0;                                                                             
    prior_s[1, i] = normal_lpdf(1 | mu, sigma);                                                 
}                                                                                               
                                                                                                
for (t in 1:T) {                                                                                
    for (i in 1:k) {                                                                            
        prior_s[t + 1, i]  =  normal_lpdf(t | mu, sigma);                                       
        lp_e[t+1, i]       =  lp_e[t, i];                                                       
        lp_l[t+1, i]       =  lp_l[t, i];                                                       
    }                                                                                           
    for (n in 1:N) {                                                                            
        lp_e[t+1, classes[n]] += normal_lpdf(D[n, t] | e, sigma_data);
        lp_l[t+1, classes[n]] += normal_lpdf(D[n, t] | l, sigma_data);
    }                                                                                           
}                                                                                               
                                                                                                
for (i in 1:k) {                                                                                
    lp[:,i] = rep_vector(lp_l[T + 1, i], T )                                                    
                + head(lp_e[:, i], T) - head(lp_l[:, i], T) + prior_s[1:T, i];
}

I suspect the 1…N loop is slowing things down and I’d like to vectorise it. I will eventually have many more than two classes and N will be in the tens of thousands (at least). Any ideas on how to write an autodiff-friendly implementation? Many thanks!

Bob_Carpenter · August 2, 2019, 3:51pm

I’m going to move this to the modeling category.

There’s no a good way to vectorize that (n in 1:N) loop, I’m afraid, as it indexes by n into the left-hand side. Our normal_lpdf vectorizations on the other hand reduce by summing, which is the usual use case for densities.

On the other hand, you should be able to vetorize the lp_e and lp_l assignments in the first loop,

for (t in 1:T) {
  lp_e[t + 1] = lp_e[t];
  lp_l[t + 1] = lp_l[t];
}

Is normal_lpdf(1 | mu, sigma); really supposed to have a constant 1 data for everything?

Given that mu and sigma don’t chane in this loop,

for (i in 1:k)                                                                                  
{                                                                                               
    lp_e[1, i] = 0;                                                                             
    lp_l[1, i] = 0;                                                                             
    prior_s[1, i] = normal_lpdf(1 | mu, sigma);                                                 
}

This could just be:

transformed data {
  vector[k] k_zeros = rep_vector(0, k);
...
  lp_e[1] = k_zeros;
  lp_l[1] = k_zeros;
  prior_s[1] = rep_vector(normal_lpdf(1 | mu, sigma), k);

This is helpful as it avoids evaluating normal_lpdf k times. the same thing can happen in the other loop:

for (t in 1:T) {
  prior_s[t + 1] = rep_vector(normal_lpdf(t | mu, sigma), k);
  lp_e[t + 1] = lp_e[t];
  lp_l[t + 1] = lp_l[t];
}

This will be more efficient if rather than a matrix, you use an array of vectors, as the memory locality will be better (matrices are organized in memory by column).

Then the head operations can be simplified from head(lp_e[:, i], T) to lp_e[1:T, i], which will save a whole bunch of copying.

mchonofsky · August 3, 2019, 9:46pm

Thank you for the hints, even though the problem as stated is insoluble (it seems). I’ll incorporate those into my code and see how much it changes things - or try a different approach.

Topic		Replies	Views
Simpler Change Point Models? Project Proposals	13	3552	February 14, 2020
Multiple Change Point Detection Modeling rstan , techniques , fitting-issues , specification , performance	2	820	September 21, 2022
Hierarchical Piecewise Linear Model (Extending Facebook Prophet forecasting tool) Modeling fitting-issues , specification , piecewise	1	714	April 14, 2020
Increasing Stan efficiency by vectorizing for loop Modeling	6	650	October 9, 2022
Identify change points in multiple time-series Modeling techniques	1	538	May 30, 2019

Vectorising multiple change point model

Related topics