Dear all,
I have used Stan successfully before to model the demographics of super-massive black holes over cosmic time. However, I am now stuck on modelling a different problem.
I am asking for pointers on the analysis of Poisson time series with randomly varying count rates, motivated by high-energy astrophysics observations. Ultimately, I want to build quite complex source-variability models. For example, I want to place a prior on a primary random walk so that it has, say, a power-law power spectrum with index 2 ± 1. I also want to add secondary source processes that depend on the first, e.g., a second component that is a delayed, time-averaged form of the first, with the parametric response kernel fitted simultaneously. (Such echoes occur when light is reflected off a distant surface.)
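To fix notation, the secondary component I have in mind is a convolution of the primary light curve $x_1$ with a parametric response kernel $K_\theta$ (a Gaussian in the code below), scaled by a reflection strength $s$, with $\theta$ and $s$ fitted along with everything else:

$$x_2(t) = s \int_0^\infty K_\theta(\tau)\, x_1(t - \tau)\, \mathrm{d}\tau$$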
To start simple, let's say I want to model time-binned Poisson data sampled from a random walk with a power-law PSD. I have looked into a few approaches.
Approach 1: A simple way is to generate many such time series with different power-law indices (with the Timmer & Koenig method) and then compute the likelihood of the data under each simulation. This falls apart pretty quickly beyond ~100 data points, because one is unlikely to find a simulation that closely matches the data, and the generating method does not easily allow exploring near a good match.
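Written out, this amounts to a simple Monte Carlo estimate of the marginal likelihood from $M$ simulated rate curves $\lambda^{(m)}$:

$$\hat{\mathcal{L}} \approx \frac{1}{M} \sum_{m=1}^{M} \prod_{t=1}^{T} \mathrm{Poisson}\!\left(n_t \mid \lambda^{(m)}_t\right)$$

For large $T$, the product is dominated by the rare simulation that happens to track the data everywhere, so the variance of the estimator blows up.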
Approach 2: Another option is ARMA methods, or state-space representations in general (Kelly+14 wrote a nice introduction for astronomers). I quite like this because it is very expressive, allows defining arbitrarily complex state transitions, and deals well with data gaps. Because of the Poisson data, I turned away from Kalman filters and looked into particle filters. In a classical particle filter, one draws new candidate states at each time step and resamples the particles according to their likelihood weights.
Q1: My first Stan question is whether it is preferable to make these intermediate random states Stan parameters instead. My suspicion is yes, because otherwise the gradients are very noisy and HMC will complain.
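To make Q1 concrete, here is a minimal sketch of what I mean by promoting the states to parameters: a latent AR(1)/OU log-rate observed through a Poisson likelihood (illustrative names, untested):

data {
  int Nt;
  int<lower=0> y[Nt];   // binned counts
  real<lower=0> dt;     // bin width
}
parameters {
  real mu;              // long-run mean of the log-rate
  real<lower=0> tau;    // relaxation timescale
  real<lower=0> sigma;  // innovation scale
  vector[Nt] lograte;   // the latent states, as explicit parameters
}
model {
  mu ~ normal(0, 3);
  tau ~ lognormal(0, 2);
  sigma ~ normal(0, 1);
  // discretised OU / AR(1) prior over the latent states:
  lograte[1] ~ normal(mu, sigma);
  lograte[2:Nt] ~ normal(lograte[1:Nt-1] + dt / tau * (mu - lograte[1:Nt-1]),
                         sigma * sqrt(dt));
  // Poisson observations of the exponentiated states:
  y ~ poisson_log(lograte);
}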
My second problem is that the ARMA parameters are a bit unnatural. Based on them, I can compute a PSD (1, 2), but (Q2) how do I best place a power-law slope prior on them? I was considering running Stan on the ARMA parameters with no data, using the desired prior as the target (compute the PSD and a Gaussian likelihood on deviations from a power law), and then using the resulting posterior mean and covariance of the ARMA parameters as the prior for the actual data analysis. This seems quite cumbersome, however.
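For reference, the PSD implied by ARMA(p, q) parameters is the standard rational form

$$S(\omega) \propto \sigma^2 \, \frac{\left|1 + \sum_{k=1}^{q} \theta_k e^{-ik\omega}\right|^2}{\left|1 - \sum_{k=1}^{p} \phi_k e^{-ik\omega}\right|^2}$$

so the prior-only run above would add a Gaussian penalty on the deviations of $\log S(\omega)$ from a straight line of slope $-\alpha$, evaluated on a grid of frequencies.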
Approach 3: The third approach I considered is Gaussian processes, with the Poisson counts drawn from a latent GP. Q3: I believe one can place a power-law PSD prior on the covariance kernel, but how does one do this exactly? Specifically, the decline of the covariance with lag |r| is not the same as the PSD slope, if I understand correctly, but I could not find the right formulae. I would appreciate pointers.
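To illustrate the kind of relation I am after: by the Wiener–Khinchin theorem, the PSD is the Fourier transform of the covariance kernel, so the two slopes are linked but not equal. For example, the 1-d Matérn kernel with smoothness $\nu$ and length scale $\ell$ has spectral density

$$S(f) \propto \left(\frac{2\nu}{\ell^2} + 4\pi^2 f^2\right)^{-(\nu + 1/2)}$$

i.e. a high-frequency power-law slope of $2\nu + 1$ (the exponential/OU kernel, $\nu = 1/2$, gives the familiar $f^{-2}$ Lorentzian). Is parameterizing the PSD slope through $\nu$ the recommended route here?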
I did manage to get an AR(1) primary process plus a reprocessing secondary process (called reflection below, modelled as a convolution with a Gaussian window) to work. One issue is that I have to model many states before my time series actually starts, so that the observed times can reprocess these earlier states (through the delayed Gaussian convolution). In the code below, I have Nt data points and Npremodel states before them (the lognorm vector contains all states of the primary process). Convergence on these pre-data states is poor, but I guess that is just how it is: they are highly correlated with each other but poorly constrained by data. The situation is different for the times with observations, where the states are well constrained. I tried to fill in these Npremodel points by running the AR(1) backwards, but it did not perform much better. Q4: Is there a smarter way to go about this?
data {
int Nflux;
int Nchan;
int Nt;
int Npremodel;
int<lower=0> all_src_data[Nt,Nchan];
int<lower=0> bkg_data[Nchan];
real phoindex;
real reflphoindex;
vector[Nchan] bkg_model;
matrix<lower=0>[Nflux,Nchan] RMF;
vector<lower=0>[Nflux] ARF;
vector<lower=0>[Nflux] e_lo;
vector<lower=0>[Nflux] e_hi;
real<lower=0> src_expoarea;
real<lower=0> src_to_bkg_ratio;
real<lower=0> dt;
}
transformed data {
vector<lower=0>[Nflux] e_width;
vector<lower=0>[Nflux] e_mid;
int Ntotalmodel;
matrix<lower=0>[Nchan,Nflux] response;
vector[Nflux] fluxmodel;
vector[Nflux] reflmodel;
vector[Nchan] fluxmodeldet;
vector[Nchan] reflmodeldet;
real max_time_logsigma;
int<lower=0> all_src_data_flat[Nt*Nchan];
// width of the delay kernel or DRW has to fit within the pre-model window:
max_time_logsigma = log10(Npremodel * dt / 3);
e_mid = (e_hi + e_lo) / 2.0;
e_width = e_hi - e_lo;
Ntotalmodel = Npremodel + Nt;
for (i in 1:Nchan) {
for (j in 1:Nflux) {
response[i,j] = RMF[j,i] * ARF[j] * e_width[j] * src_expoarea;
}
}
for (i in 1:Nflux) {
fluxmodel[i] = e_mid[i]^(-phoindex);
reflmodel[i] = e_mid[i]^(-reflphoindex);
}
// project components through response:
fluxmodeldet = response * fluxmodel;
reflmodeldet = response * reflmodel;
for (i in 1:Nt) {
for (j in 1:Nchan) {
all_src_data_flat[Nchan * (i-1) + j] = all_src_data[i,j];
}
}
}
parameters {
// primary component: powerlaw with damped random walk
real<lower=0,upper=Npremodel*dt> drw_tau;
real drw_mean;
real<lower=-3,upper=max_time_logsigma> drw_logsigma;
vector[Ntotalmodel] lognorm;
// secondary component: powerlaw with gaussian delay response of primary
real<lower=0,upper=Npremodel*dt> delay_mean; // delay in time units, compared against (Npremodel - j) * dt below
real<lower=0,upper=max_time_logsigma> delay_logstd;
real logreflstrength;
real logbkgnorm;
}
transformed parameters {
real reflstrength;
real bkgnorm;
real delay_std;
real drw_sigma;
vector[Ntotalmodel] norm;
vector[Nt] reflnorm;
vector[Npremodel] reflweight;
real all_src_model_flat[Nt*Nchan];
drw_sigma = pow(10, drw_logsigma);
reflstrength = pow(10, logreflstrength);
bkgnorm = pow(10, logbkgnorm);
delay_std = pow(10, delay_logstd);
norm = exp(lognorm * log(10)); // norm = 10^lognorm
// prepare delayed response function (gaussian)
// convention: with delay_mean=0, j=Npremodel should be the maximum
for (j in 1:Npremodel) {
// look back in time j steps:
reflweight[j] = exp(-0.5 * square(((Npremodel - j) * dt - delay_mean) / delay_std));
}
for (i in 1:Nt) {
// compute reflection by summing up contributions from the past
// weighted by delayed response function (gaussian)
reflnorm[i] = dot_product(norm[i:i-1+Npremodel], reflweight);
for (j in 1:Nchan) {
all_src_model_flat[Nchan * (i-1) + j] =
norm[Npremodel+i-1] * fluxmodeldet[j] +
reflstrength * reflnorm[i] * reflmodeldet[j] //+
//bkgnorm * bkg_model[j]
;
}
}
}
model {
lognorm[1] ~ normal(0, 10); // prior on the initial state only; the AR(1) statement below covers the rest
drw_mean ~ normal(0, 3);
// drw_tau: implicit uniform prior from its declared bounds <lower=0,upper=Npremodel*dt>
drw_logsigma ~ normal(-1, 0.3);
delay_logstd ~ normal(1, 1);
logreflstrength ~ normal(-3, 3);
// AR(1) / discretised OU: mean-revert towards drw_mean;
// the conditional mean depends only on the previous state
lognorm[2:Ntotalmodel] ~ normal(lognorm[1:Ntotalmodel-1] + dt / drw_tau * (drw_mean - lognorm[1:Ntotalmodel-1]), drw_sigma);
// source + background time series
all_src_data_flat ~ poisson(all_src_model_flat);
// background time series
bkg_data ~ poisson(bkgnorm * bkg_model);
}
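Regarding Q4: one thing I have been wondering about is a non-centered parameterization of the latent states, which I understand is the usual remedy when states are weakly constrained by data. A sketch of how the blocks above would change (untested):

parameters {
  // ... as above, but with standardised innovations instead of the states:
  vector[Ntotalmodel] lognorm_raw;
}
transformed parameters {
  vector[Ntotalmodel] lognorm;
  lognorm[1] = 10 * lognorm_raw[1]; // matches lognorm[1] ~ normal(0, 10)
  for (i in 2:Ntotalmodel)
    lognorm[i] = lognorm[i-1]
                 + dt / drw_tau * (drw_mean - lognorm[i-1])
                 + drw_sigma * lognorm_raw[i];
}
model {
  // replaces the explicit AR(1) sampling statement:
  lognorm_raw ~ normal(0, 1);
}

Would that address the poor convergence of the pre-data states, or is there something better?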
Cheers,
Johannes