Applications of dynamic causal modeling

Has anyone applied “dynamic causal modeling” (DCM), as proposed by Friston, in Stan, especially outside neuroscience?

This paper, Gradient-based MCMC samplers for dynamic causal modelling, mentions Stan, for example.

I also found a reference to Variational Laplace, and was curious what it is.

Does DCM require a different sampler?


I know almost nothing about these models, but since nobody else has answered, I will give it a try.

The model in the paper looks like it should - at least in principle - be possible to fit with Stan. Since the paper mentions some of @betanalpha’s work to motivate NOT using Stan for the model, I would be interested in hearing Mike’s thoughts on whether the criticism is sensible and whether it also applies to the later evolution of the Stan sampler (because Stan != NUTS, and a lot of work has been done on Stan since 2016).

Wouldn’t that be variational inference using the Laplace distribution as the approximating family?

Hope you can move forward with your model.

I have a lot of hesitations about how these models are employed in practice (abusing model selection methods to try to “learn” causal structure, implicitly defining prior models to avoid chaotic behavior in dynamical systems, etc.), but the models themselves are technically within the scope of Stan. In other words, they can readily be implemented as well-defined Stan programs.

Stan, however, will also not be quiet about computational problems inherent to these models that hand-written samplers might ignore (the included papers list many references to support their hand-written samplers, but they also make some common misunderstandings about how Markov chain Monte Carlo works). Although frustrating, these diagnostics will allow you to investigate and understand these less-than-ideal model consequences in a way that no other method will, which makes them a rare feature.

The most subtle challenge here will be the chaotic nature of the dynamical systems. Stan’s ODE integrators will do their best, but once the dynamical systems become unstable the numerical evaluation of the states and of the gradients will tend to decouple, which will degrade the performance of Stan’s dynamic Hamiltonian Monte Carlo sampler.
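
For concreteness, here is a minimal sketch of the kind of Stan program being discussed: a small linear dynamical system dx/dt = A x + u integrated with `ode_rk45` and fit to noisy state observations. The drift function, variable names, and priors are illustrative assumptions, not an actual DCM implementation.

```stan
functions {
  // Illustrative linear drift: dx/dt = A * x + u
  vector drift(real t, vector x, matrix A, vector u) {
    return A * x + u;
  }
}
data {
  int<lower=1> N;              // number of observation times
  int<lower=1> D;              // number of states (e.g., regions)
  array[N] real ts;            // observation times (all > 0, increasing)
  vector[D] x0;                // known initial state at time 0
  array[N] vector[D] y;        // noisy observations of the states
}
parameters {
  matrix[D, D] A;              // coupling ("effective connectivity") matrix
  vector[D] u;                 // constant exogenous input
  real<lower=0> sigma;         // observation noise scale
}
model {
  // Numerically integrate the states; gradients flow through the solver.
  array[N] vector[D] x = ode_rk45(drift, x0, 0, ts, A, u);
  to_vector(A) ~ normal(0, 1);
  u ~ normal(0, 1);
  sigma ~ normal(0, 1);
  for (n in 1:N)
    y[n] ~ normal(x[n], sigma);
}
```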

tl;dr Yes these models can be implemented in Stan, but they may be challenging to fit. That said they’re no easier to fit with other methods and Stan is uniquely capable of guiding investigations of the fitting problems.


Much interesting work can be done by (partially) implementing DCM in Stan rather than with hand-written HMC, as fairly pointed out by @betanalpha.

DCM is a well-established framework for analyzing neuroimaging modalities (such as fMRI, MEG, and EEG) with neural mass models (a set of SDEs/DDEs), in which inferences are made about the coupling among brain regions (effective connectivity), i.e., how changes in the neuronal activity of one region are caused by activity in other regions through modulation of the latent coupling.
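
For readers unfamiliar with the notation: the neuronal part of the standard bilinear DCM evolves as dx/dt = (A + sum_j u_j B_j) x + C u, where A is the fixed effective connectivity, the B_j modulate that connectivity under input j, and C routes the driving inputs. A rough sketch of that drift as a Stan function, with illustrative names and omitting the hemodynamic/observation model entirely, might look like:

```stan
functions {
  // Bilinear DCM neuronal drift: dx/dt = (A + sum_j u[j] * B[j]) * x + C * u
  // A: fixed effective connectivity; B[j]: modulation by input j; C: driving inputs.
  // The unused time argument t keeps the ODE-functor signature.
  vector dcm_drift(real t, vector x, matrix A, array[] matrix B, matrix C, vector u) {
    int D = rows(A);
    matrix[D, D] J = A;
    for (j in 1:num_elements(u)) {
      J += u[j] * B[j];
    }
    return J * x + C * u;
  }
}
```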

In the field of neuroscience, DCM has been extensively and fruitfully used, particularly, for constructing and fitting generative models of measured blood oxygen level-dependent (BOLD) signals.

The publication referenced here is the only effort that I have seen to use HMC for DCM. Scaling DCM to the whole-brain level is a very challenging problem (e.g., inference over on the order of >84 coupled neural mass models, rather than the few coupled equations of standard DCM, and addressing the degeneracy issue).
Since computational tractability is critical for DCM users (which is why they use MAP/VI), Stan can definitely do much better there! Because DCM relies on the SPM software, which is written in MATLAB, and because of some other technical issues (e.g., the FFT for spectral DCM), not all of the DCM code can be implemented in Stan, but wrapper functions can be written to benefit from Stan’s algorithms (@maedoc knows better).

We have an approach called BVEP (see Refs), applied to epilepsy spread, which implements coupled slow-fast systems in Stan (in centered/non-centered forms, and validated by all diagnostics!) and shares a key aspect with DCM: Bayesian inference over dynamical system models. However, there are key differences in practice (such as estimating the connections between regions with tractography techniques rather than effective connectivity, linearization, …). Recently, we have managed to speed up the implementation reported in the Refs and to solve some of the degeneracy issues (by carefully reading Michael’s great work); these results are in preparation. In sum, there is a lot more interesting work to be done here for neuroscientists to enjoy Stan!

Refs:

  1. The Bayesian Virtual Epileptic Patient: A probabilistic framework designed to infer the spatial map of epileptogenicity in a personalized large-scale brain model of epilepsy spread
  2. On the influence of prior information evaluated by fully Bayesian criteria in a personalized whole-brain model of epilepsy spread
  3. Data-driven method to infer the seizure propagation patterns in an epileptic brain from intracranial electroencephalography
  4. Identifying spatio-temporal seizure propagation patterns in epilepsy using Bayesian inference
  5. Virtual Epileptic Patient (VEP): Data-driven probabilistic personalized brain modeling in drug-resistant epilepsy
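
Regarding the centered/non-centered parameterization mentioned above, here is a minimal, generic sketch of the non-centered form for per-region parameters in a hierarchical model; the names are illustrative assumptions and this is not the BVEP code.

```stan
data {
  int<lower=1> N;                   // number of regions
}
parameters {
  real mu;                          // population location
  real<lower=0> tau;                // population scale
  vector[N] eta_raw;                // standardized deviates (non-centered)
}
transformed parameters {
  // Implies eta ~ normal(mu, tau), but with a geometry HMC handles better
  // when the data only weakly constrain the individual eta[n].
  vector[N] eta = mu + tau * eta_raw;
}
model {
  mu ~ normal(0, 1);
  tau ~ normal(0, 1);
  eta_raw ~ std_normal();
  // The centered version would instead declare eta directly as a parameter
  // and write: eta ~ normal(mu, tau);
}
```

The non-centered form typically helps when the population scale tau is weakly identified; with highly informative data the centered form can work better.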

The DCM papers can be challenging to read because they are a tightly coupled mixture of data analysis techniques, modelling & reparametrization tricks, and the explicit derivation of update schemes for their variational approximations. That said, nothing prevents fitting a DCM model in Stan apart from degeneracies that foil HMC but are otherwise ignored by their variational approximation.

Commenting on the so-called spectral DCM: it is derived from a fixed-point approximation of a complicated cortical microcircuit model (a bunch of ODEs), which is formulated as a transfer function and then converted to the complex cross-spectra of the model (as a function of the parameters). This does require complex numbers, but it is doable in Stan as well.
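
As a toy illustration of that complex arithmetic (not the actual spectral-DCM equations): a stable one-dimensional linear system dx/dt = a x + w driven by white noise of amplitude alpha has spectral density alpha / |i*omega - a|^2, which can be written directly with Stan’s complex scalars. The data layout, priors, and log-normal likelihood below are assumptions made for the sketch.

```stan
data {
  int<lower=1> F;                   // number of frequencies
  vector<lower=0>[F] omega;         // angular frequencies
  vector<lower=0>[F] g_obs;         // observed spectral density at each frequency
}
parameters {
  real<upper=0> a;                  // stable self-connection (negative)
  real<lower=0> alpha;              // white-noise input amplitude
  real<lower=0> sigma;              // observation noise on the log spectrum
}
model {
  a ~ normal(-1, 1);
  alpha ~ normal(0, 1);
  sigma ~ normal(0, 0.5);
  for (f in 1:F) {
    complex h = 1.0 / (to_complex(0, omega[f]) - a);  // transfer function H(i*omega)
    real g = alpha * abs(h)^2;                        // model spectral density
    g_obs[f] ~ lognormal(log(g), sigma);
  }
}
```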

This is a good point, but they see themselves as simply taking a practical approach, since the full model (e.g., in Stan) wouldn’t be computationally feasible because it’s too degenerate. The results are frequently published in journals where Bayesian analysis is considered advanced, so I don’t think there’s much pushback.

Lastly, it should also be noted that in many cases Bayesian-ness is both a method (i.e., the DCM toolbox or Stan as data analysis methods) and a “result” in the Bayesian-brain theory sense. The latter is seen, e.g., in [2201.06387] The free energy principle made simpler but not too simple, among other papers. My guess is that, given their theory that the brain (among other systems) embodies and applies variational approximations to adapt to its environment, VI should be fine for data analysis as well.


I think one general lesson here is that many of the “modeling” approaches popular in applied fields are not just probabilistic models but, as @maedoc notes, a “tightly coupled mixture of data analysis techniques, modelling & reparametrization tricks”. In order to understand whether these approaches might be useful, one has to disentangle all of those inputs, identifying the probabilistic model, its implementation, and the inferential computation techniques, and determining which, if any, might be relevant to a new application.

Writing up a model in the Stan Modeling Language forces one to separate the model from its density representation, and the powerful diagnostics of Stan’s Hamiltonian Monte Carlo sampler provide informative feedback on the kinds of likelihood functions that can arise from that model. This more deliberate approach is almost always more work, but that work quickly pays off with a better understanding of the analysis and its feasibility.