Hi all,
I am trying to create a likelihood for a Hawkes process. I’ve implemented a very naive likelihood function that is very slow, and I was wondering if anyone has any creative ideas on how to speed it up.
Likelihood
A Hawkes process is a self-exciting point process (Poisson process) whose rate is determined by past events/rates. In general the process looks like this (from Wikipedia):
\lambda(t) = \mu(t) + \int_{-\infty}^{t} \nu(t - s) \, dN_s = \mu(t) + \sum_{T_k < t} \nu(t - T_k)
where \nu : \mathbb{R}^+ \to \mathbb{R}^+ is a kernel function which expresses the positive influence of past events T_i on the current value of the rate \lambda(t), \mu(t) is a possibly non-stationary function representing the expected, predictable, or deterministic part of the intensity, and T_i \in \mathbb{R} (with T_i < T_{i+1}) is the time of occurrence of the i-th event of the process.
Essentially, there is a baseline intensity (the first term) and an additive excitation intensity based on past events (the second term). In my case I am modeling a citation process where both the baseline and the excitation components decay exponentially. My intensity looks like this:
\lambda(t) = \alpha \exp\big( -(\delta + \eta \cdot I) \cdot t \big) + \beta \sum_{c \,:\, t_c < t} \exp\big( -(\gamma + \zeta \cdot I) \cdot (t - t_c) \big)
where t_c is the year in which citation c occurred and I is an indicator of membership in a group. \alpha, \beta, \delta, \gamma, \eta, and \zeta all have \text{Gamma}(1,10) priors.
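For example (made-up numbers): if a paper in the group (I = 1) received two citations in year 0 and one citation in year 1, its intensity in year 2 would be
\lambda(2) = \alpha e^{-2(\delta + \eta)} + \beta \left[ 2 e^{-2(\gamma + \zeta)} + e^{-(\gamma + \zeta)} \right]
where the two identical year-0 terms collapse into the factor of 2.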
The standard Poisson process likelihood for the time of the next event is:
p(t_i \mid t_{i-1}) = \lambda(t_i) \exp\left( -\int_{t_{i-1}}^{t_i} \lambda(u) \, du \right) = \lambda(t_i) \exp\left( -\sum_{u = t_{i-1}}^{t_i} \lambda(u) \right)
where i is the current year.
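Taking logs and summing over the yearly citation counts c_t (and dropping the constant \log c_t! term), the per-paper log likelihood becomes
\log L = \sum_t c_t \log \lambda(t) - \sum_t \lambda(t)
which is what the target += statement in the Stan code below accumulates.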
Implementation
Because the citation sequences of different papers have different lengths (ragged data), I store the citation counts across all papers in a single 1d vector c. y gives the years in which these citations occur (starting at time 0), and s describes how to partition these ragged vectors into groups (one per paper).
In the likelihood calculation I iterate across papers in this ragged vector c, and then across years in that paper’s citation sequence. Note that I can use multiplication instead of summation within each year because all events within a year are equivalent. Lastly, I use the rate for each citation event instead of the event itself to remove stochasticity.
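For example (made-up numbers): with K = 2 papers, where paper 1 receives 2, 1, 3 citations in years 0, 1, 2 and paper 2 receives 1, 4 citations in years 0, 1, the data would be
s = {3, 2}
c = {2, 1, 3, 1, 4}
y = {0, 1, 2, 0, 1}
comm = {1, 0}
so N = 5, and each paper’s block of c and y is recovered with the running index pos.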
Stan code:
data {
  int<lower=0> N;      // total number of citation observations
  int<lower=0> K;      // number of papers (groups)
  vector[N] c;         // citation counts
  vector[K] comm;      // whether the paper belongs to a citation community
  int s[K];            // group sizes
  vector[N] y;         // years
}
parameters {
  real<lower=0> alpha;
  real<lower=0> beta;
  real<lower=0> delta;
  real<lower=0> gamma;
  real<lower=0> mu;    // corresponds to eta in the intensity formula above
  real<lower=0> sigma; // corresponds to zeta in the intensity formula above
}
transformed parameters {
}
model {
  int pos = 1;
  alpha ~ gamma(1, 10);
  beta ~ gamma(1, 10);
  delta ~ gamma(1, 10);
  gamma ~ gamma(1, 10);
  mu ~ gamma(1, 10);
  sigma ~ gamma(1, 10);
  for (k in 1:K) {       // for each paper in the ragged vector c
    int G = s[k];        // G is the length of this paper's citation sequence
    real q = comm[k];    // q is the community indicator
    // baseline intensity for times 0...t
    vector[G] lambda_vec = alpha * exp(-(mu * q + delta) * y[pos:pos + G - 1]);
    for (i in 2:G) {
      // additive (self-exciting) intensity: loop across earlier times 1...t-1
      // lambda is used instead of c(t) here to remove stochasticity
      lambda_vec[i] += beta * sum(lambda_vec[1:i - 1] .* exp(-(sigma * q + gamma) * (y[pos + i - 1] - y[pos:pos + i - 2])));
    }
    // Poisson log likelihood: sum_t c_t * log(lambda_t) - sum_t lambda_t
    target += sum(c[pos:pos + G - 1] .* log(lambda_vec)) - sum(lambda_vec);
    pos += G;            // move to the beginning of the next paper
  }
}
Question
It is extremely computationally expensive to iterate across every data point, and it takes hours even for just 1000 data points. Any ideas on how to speed this up and do better than O(n^2)?