Binary time series

Hi,

I have a question about binary time series modeling. I have longitudinal data (binary credit risk) for a number of individuals. I have tried different GLMs, but in vain. I would like to try modeling with a Gaussian process. Unfortunately, the example in the Stan guide here does not consider time series explicitly.

I stumbled upon the sentence:

The Gaussian processes chapter presents Gaussian processes, which may also be used for time-series (and spatial) data.

What is implied by this sentence?

A simple approach would be adding time as a covariate. I wonder if there are approaches that deal with time explicitly? A complicating factor is that there is periodicity (an incentive is applied in a given quarter, every second quarter, for all individuals), so I thought about one GP when the incentive is not applied (baseline GP) and two GPs when it is applied (baseline + incentive GPs).

Thanks for any advice

If you have BDA3, there's a great example: the birthday vs. holidays time series analysis. It showcases using the additive property of GPs to decompose a time series into all sorts of effects.

There's a less technical discussion and a non-Stan implementation on Gelman's site:

https://statmodeling.stat.columbia.edu/2016/05/18/birthday-analysis-friday-the-13th-update/

Awesome. Do you have any references where the outcome is binary (yes/no)? That example is for a count outcome.

Usually with time series analysis I like to start pretty simple: running a glm/glmer-style model in either Stan or brms (with a Stan code dump). So it's something like credit_risk ~ x1 + x2 + xn + time + (1|time). Once I have a handle on the data and the model, I'd move to brms and run credit_risk ~ x1 + x2 + xn + s(year, bs = 'gp', m = 1), again getting a feel for whether it's close.
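In raw Stan, that first formula corresponds to roughly the following (a rough, untested sketch; I've dropped the fixed time trend for brevity, and the names credit_risk, X, and time are placeholders for your data):

```stan
// logistic regression with covariates plus a varying intercept per time point,
// i.e. roughly credit_risk ~ x1 + ... + xK + (1 | time)
data {
  int<lower=1> N;                          // observations
  int<lower=1> K;                          // covariates x1..xK
  int<lower=1> T;                          // distinct time points
  matrix[N, K] X;
  array[N] int<lower=1, upper=T> time;     // time-point index per observation
  array[N] int<lower=0, upper=1> credit_risk;
}
parameters {
  real alpha;
  vector[K] beta;
  vector[T] z_time;                        // standardized time effects
  real<lower=0> sigma_time;
}
model {
  vector[T] u_time = sigma_time * z_time;  // non-centered (1 | time) effects
  alpha ~ normal(0, 2);
  beta ~ normal(0, 1);
  z_time ~ std_normal();
  sigma_time ~ normal(0, 1);
  credit_risk ~ bernoulli_logit(alpha + X * beta + u_time[time]);
}
```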

After that, I'd code up some simple state-space models. My favorite starting points are here: https://github.com/sinhrks/stan-statespace
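For example, the simplest local-level model (a random walk plus noise, with a Gaussian outcome just for illustration) is only a few lines of Stan. A rough, untested sketch:

```stan
// local-level state-space model: latent random-walk level mu_t,
// noisy observations y_t
data {
  int<lower=2> T;
  vector[T] y;
}
parameters {
  vector[T] mu;                  // latent level
  real<lower=0> sigma_level;     // state-evolution SD
  real<lower=0> sigma_obs;       // observation SD
}
model {
  mu[1] ~ normal(y[1], 10);                        // vague initialization
  mu[2:T] ~ normal(mu[1:(T - 1)], sigma_level);    // state equation
  sigma_level ~ normal(0, 1);
  sigma_obs ~ normal(0, 1);
  y ~ normal(mu, sigma_obs);                       // observation equation
}
```

For a binary outcome, the observation equation would become something like credit_risk ~ bernoulli_logit(mu);.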

Thanks a lot for the reference. I have tried simple stuff already, but the predictions are very poor. I wonder if it is possible to have credit_risk ~ s(year, x1, x2, …, xn, bs = 'gp'). I am not too familiar with brms, but the dependence of credit_risk on the covariates x should be very nonlinear.
Another thing is periodicity. I can't buy that time is a random factor, simply because credit_risk(t+1) depends on credit_risk(t). Moreover, I am struggling with credit_risk being yes/no. I am thinking about something like credit\_risk(t+1) = \alpha + \beta \cdot credit\_risk(t), where \alpha and \beta are GPs over standard covariates x, such as Past Due Amount in Last 12 Months.

One thing that would help would be to show us the models you've run and their summaries.

Well, if the glm/glmer models aren't getting you where you want, state-space models might be a better next step. You can code in periodicity, interventions, and black swan events if needed.

Oh, I save all my R scripts and outputs for these kinds of models even if they don't work. It doesn't matter whether it's going into an academic journal, to a stakeholder/manager, or into a report: capturing the whole process is important.

Great. I am not into state-space models, so it could be a good time to explore those. Is there a good tutorial on them besides https://github.com/sinhrks/stan-statespace?

I meant to write one, but work and such gets in the way. If you can, grab the book An Introduction to State Space Time Series Analysis and use the Stan examples; that will get you most of the way there.

Just a quick question before I embark on the reading: the examples in the book are for count outcomes (road traffic fatalities in Norway and Finland). My objective is to predict credit_risk (a binary high/low variable) for every individual in the study. I glanced briefly at the formulations in various chapters and wasn't able to find one with a binary outcome.

You are right. You will need to modify the code to get the binary outcome. Let me poke around in my code to see if I ever tried that.

Somewhere there is a Cormack-Jolly-Seber (binary outcome) state-space model example. Maybe here: https://mc-stan.org/docs/2_18/stan-users-guide/mark-recapture-models.html. I remember @Bob_Carpenter talking about it? Or someone else?

Another quick question. Intuitively, if an individual's credit_risk was high in the previous quarter, it is quite unlikely to become low in the current quarter; i.e., the probability of a transition from the high-credit_risk state to the low-credit_risk state is very small. Hence the perception that credit_risk at time t+1 depends on credit_risk at time t. Do state-space models accommodate this relationship?

I stumbled upon this. I think the GARMA model (actually the Zeger–Qaqish model with a logistic link) might implement what I am looking for. I still think that the priors for \alpha and \varphi should be GPs. Can somebody in the Stan community share an example?
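To make the question concrete, the simplest constant-coefficient version I have in mind is something like this (a rough, untested sketch; making \alpha and \varphi GPs over the covariates would be the next step):

```stan
// observation-driven logistic autoregression (Zeger–Qaqish flavor):
// the log-odds of y_t depend on the previous outcome y_{t-1}
data {
  int<lower=2> T;
  array[T] int<lower=0, upper=1> y;
}
parameters {
  real alpha;     // baseline log-odds
  real varphi;    // effect of the previous outcome
}
model {
  alpha ~ normal(0, 2);
  varphi ~ normal(0, 2);
  for (t in 2:T)
    y[t] ~ bernoulli_logit(alpha + varphi * y[t - 1]);
}
```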

You can do longitudinal GP regression with individual-specific and shared time effects using lgpr

https://jtimonen.github.io/lgpr-usage/index.html

which can also model binary outcome variables (Bernoulli observation model). If it looks like it's what you need, I can help you more.

Yes, this is an excellent option!

For @linas, can you tell us a bit more:

  • How many observations? This is important, because if you have a very large number of observations, some ways of computing GPs get very slow.
  • You write credit_risk ~ s(year, x1, x2, …, xn, bs = 'gp'), which implies a full joint GP over time and all covariates, with all interactions between them. This is computationally the most challenging option. Are you sure you need all interactions between everything? If yes, you could write that GP model in the Stan language, but it would be too slow with more than about 1,000 observations, or maybe, with GPUs, more than 10,000. There are GP-specific packages that could handle much more, but they can be a bit more limited in what kinds of model structures you can easily write.
  • Easiest would be if you were happy with an additive model similar to s(year) + s(x1) + s(x2) + ..., as it still has the non-linearities, can be made to scale linearly with the number of observations, and can be fit with brms (and, with extra features, with lgpr); see the Stan sketch after this list.
  • The next step would be just pairwise interactions between time and each x, as in s(year, x1) + s(year, x2) + ..., which can still be made to scale linearly.
  • Periodicity, interventions, and black swan events don't need state-space models, but some models are easier to write in state-space form.
  • State-space models are especially handy if you have just time, or not too many observations per time point, but they also get complicated if there are interactions between year, x1, x2, … and many observations per time point. For example, a spatio-temporal model with just time, x1, and x2 becomes computationally challenging as soon as the number of cells in x1*x2 gets big.
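For illustration, here is what the additive version (time plus one covariate, Bernoulli outcome) looks like written directly in Stan with full covariance matrices. This is a rough, untested sketch, feasible only for small N; the basis-function implementations in brms scale much better:

```stan
// additive GP over time and one covariate with a Bernoulli outcome,
// non-centered parameterization; O(N^3) cost, so small N only
data {
  int<lower=1> N;
  array[N] real year;
  array[N] real x1;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real<lower=0> rho_t;     // time lengthscale
  real<lower=0> alpha_t;   // time marginal SD
  real<lower=0> rho_x;
  real<lower=0> alpha_x;
  vector[N] eta_t;         // standardized GP draws
  vector[N] eta_x;
}
model {
  matrix[N, N] L_t
    = cholesky_decompose(add_diag(gp_exp_quad_cov(year, alpha_t, rho_t), 1e-8));
  matrix[N, N] L_x
    = cholesky_decompose(add_diag(gp_exp_quad_cov(x1, alpha_x, rho_x), 1e-8));
  vector[N] f = L_t * eta_t + L_x * eta_x;   // additive latent function
  rho_t ~ inv_gamma(5, 5);
  rho_x ~ inv_gamma(5, 5);
  alpha_t ~ normal(0, 1);
  alpha_x ~ normal(0, 1);
  eta_t ~ std_normal();
  eta_x ~ std_normal();
  y ~ bernoulli_logit(f);
}
```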

If you answer some of the questions above, it's easier to help, and I recommend checking out the lgpr package anyway. @jtimonen has done an awesome job with the vignettes.

Thanks to everybody. I'll give lgpr a try. Does it handle the fact that the number of individuals decreases each quarter? The number of observations across all periods is between 1K and 2K.

Yes, the number of observation times and the time points themselves don't need to be the same for all individuals.

The number of observations across all periods is between 1K and 2K.

What matters for GPs is the total number of observations. But anyway, GPs in Stan with > 1K data points will be really slow.

GPs via the covariance-matrix implementation are slow, but low-dimensional GPs with a basis-function implementation are not slow for > 1K data points. In brms and rstanarm, when using s(..., bs="gp") the computation is done with basis functions. The basis-function approach makes it a bit more difficult to build certain models, and it gets slow if high-order interactions are needed. If you are happy with an additive model with no interactions, or only pairwise interactions, then 2K observations is not challenging. lgpr was designed for smaller datasets and more flexibility in the modeling, and thus is likely to be slow for 2K observations.
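For reference, the core of the basis-function idea (a Hilbert-space approximation of a one-dimensional exponentiated-quadratic GP, here with a Bernoulli outcome) fits in a short Stan program. This is a rough, untested sketch of the idea, not exactly what brms generates:

```stan
// Hilbert-space basis-function approximation of a 1-D exponentiated-quadratic
// GP with a Bernoulli outcome: cost is linear in N instead of cubic
data {
  int<lower=1> N;
  int<lower=1> M;                       // number of basis functions
  real<lower=0> L;                      // boundary, e.g. 1.5 * max(|x|)
  vector[N] x;                          // scaled time (or other covariate)
  array[N] int<lower=0, upper=1> y;
}
transformed data {
  matrix[N, M] PHI;                     // basis functions evaluated at x
  for (m in 1:M)
    for (n in 1:N)
      PHI[n, m] = sin(m * pi() / (2 * L) * (x[n] + L)) / sqrt(L);
}
parameters {
  real<lower=0> alpha;                  // GP marginal SD
  real<lower=0> rho;                    // GP lengthscale
  vector[M] beta;                       // basis weights
}
model {
  vector[M] spd;                        // spectral density at basis frequencies
  for (m in 1:M) {
    real w = m * pi() / (2 * L);
    spd[m] = alpha^2 * sqrt(2 * pi()) * rho * exp(-0.5 * square(rho * w));
  }
  alpha ~ normal(0, 1);
  rho ~ inv_gamma(5, 5);
  beta ~ std_normal();
  y ~ bernoulli_logit(PHI * (sqrt(spd) .* beta));
}
```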

As a comparison, using the basis-function approach for GPs we can do the birthday model in Stan with n = 7305 observations and an additive model with a long-term trend, a yearly seasonal periodic component, a weekly periodic component, and a special-days effect component (this is not exactly the same model as in BDA3, but similar).

Many thanks. I stumbled upon the model in https://stats.stackexchange.com/questions/197084/binary-time-series: X_t∼Bernoulli(p_t), where p_t=Φ(Y_t) and Y∼N(μ,Σ), so I can impose an AR(1) structure on Y and use GPs as priors for the AR(1) coefficients \alpha and \beta.

I already did X_t∼Bernoulli(p_t) with p_t=Φ(Y_t), and convergence is good, but I am having a hard time adding Y∼N(μ,Σ). My code currently is
target += bernoulli_lpmf(x | Phi(y));
I am struggling with how to add Y∼N(μ,Σ) via normal_lpdf(* | μ, Σ). What should * be?

The next step would be to go with lgpr and eventually compare both approaches.

Thanks for any suggestions.

If you want Y \sim N(μ,Σ), then it is target += multi_normal_lpdf(Y | mu, Sigma). But I am not really getting how your AR(1) structure relates to this, or what \mu and \Sigma are. Why do you want that instead of just modeling Y_t as a GP?
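Putting the two statements together, a minimal sketch (assuming, just for simplicity, that \mu and \Sigma are passed in as data; in your model they would presumably be built from the AR(1) parameters):

```stan
// latent multivariate-normal Y with a probit Bernoulli observation model
data {
  int<lower=1> T;
  array[T] int<lower=0, upper=1> x;   // binary outcomes
  vector[T] mu;                       // assumed given here for simplicity
  matrix[T, T] Sigma;                 // assumed given here for simplicity
}
parameters {
  vector[T] y;                        // latent Gaussian values
}
model {
  target += multi_normal_lpdf(y | mu, Sigma);
  target += bernoulli_lpmf(x | Phi(y));
}
```

In practice you would construct mu and Sigma from the AR(1) parameters (or replace the multivariate normal with a GP prior), and a Cholesky-based non-centered parameterization of y will usually sample much better.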