Flocker: an R package for occupancy modeling with `brms`

Happy to announce that @simon_mills and I have made it to CRAN with a new package called flocker that will enable you to fit a variety of occupancy models using brms as a backend, with freedom to use the full power of brms syntax in formulas for occupancy, detection, colonization, extinction, and autologistic terms, as applicable to the model type.

The fundamental challenge in harnessing brms to these likelihoods is that they are not fully factorizable. Our solution is to evaluate the likelihood in an unlooped brms custom family, into which we smuggle all of the necessary indexing information via vint terms (we provide a function to create a data object with the necessary formatting and indexing information embedded). If you want to read more about how that works, check out this vignette:

Within the above website, you’ll also find a reasonably complete set of vignettes/articles: everything from a tutorial, to SBC results for our likelihoods, to formal descriptions of the models we fit. For the latter, see also the preprint here:

Bugs, issues, comments & contributions here:

Big thanks to @paul.buerkner for some crucial advice, for tolerating our clunky approach*, and most of all for brms itself.

*Note that when I say “clunky approach” I mean it’s a bit clunky on the backend. The user experience should be pretty streamlined, and the computation quite performant.
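
As a concrete illustration of the factorization problem mentioned above, here is the textbook single-season occupancy likelihood for one site $i$ with $J$ repeat visits. The notation ($\psi_i$ for occupancy probability, $p_{ij}$ for detection probability on visit $j$, $y_{ij}$ for the detection/non-detection datum) is my own shorthand for this sketch rather than copied from the flocker documentation, and the other model types are more involved:

$$
\mathcal{L}_i \;=\; \psi_i \prod_{j=1}^{J} p_{ij}^{\,y_{ij}} \left(1 - p_{ij}\right)^{1 - y_{ij}} \;+\; \left(1 - \psi_i\right)\, \mathbb{1}\!\left\{ \sum_{j=1}^{J} y_{ij} = 0 \right\}
$$

Because the product over visits sits inside a sum, $\log \mathcal{L}_i$ cannot be written as a sum of independent per-visit terms, which is what a standard row-wise likelihood assumes.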

This is really cool. I have dreams of making a mark-recapture package using brms as the backend. Something like a Stan-powered version of the packages marked or RMark. You’re an inspiration. Looking forward to digging into your bag of tricks.

We’re generally happy to help people with these kinds of extensions. I wrote the basic Cormack-Jolly-Seber models in our User’s Guide chapter on latent discrete parameters and would be happy to help with formulating Stan models. But I know very little about brms other than its philosophy. If you are interested, please email rather than using the forums, as I’m more likely to see it: bcarpenter@flatironinstitute.org

Very cool! Congrats @jsocolar and @simon_mills!!! I’m looking forward to applying flocker in my own work.

@Dalton, yes, a user-friendly interface to Stan mark-recapture models would be incredibly useful. I’m just starting a new position, so my availability to actually contribute has a high degree of uncertainty in the immediate term, but I’d hope that’d smooth out over the next year or so. Whether my contributions would be of any value is another significant question mark, but I have a reasonable amount of experience in these models and have dabbled in coding them in Stan. I can, at the very least, guarantee that such a package would have a user-base of one!

That’s really cool! We are also building a package called bmm interfacing with brms, but for measurement models in psychology, and almost ready to submit to CRAN as well. I’m really curious to look at your code and see if we can learn how to do something better, since it’s quite a similar idea of “hacking” brms, but for different types of models.

It would be nice to have somewhere a list of such packages that build on top of brms for estimating custom models. Are you aware of other packages like that, or an existing list?

(edit: I guess packages on CRAN that use brms will be listed in the reverse dependencies, so I can compile a list of those that fit the bill. Any way to do that for non-CRAN packages?)
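
For the CRAN part, a minimal sketch of pulling the reverse dependencies from R. The helper name `brms_dependents` is made up for this example; `tools::package_dependencies()` and `utils::available.packages()` are base R functions. It only sees whatever repositories are in the package index you pass in, so it does not answer the non-CRAN question:

```r
# List packages that declare brms in Depends, Imports, or LinkingTo.
# `db` defaults to the index of the configured repositories (normally CRAN),
# which needs network access; any matrix with the same columns works too.
brms_dependents <- function(db = utils::available.packages()) {
  deps <- tools::package_dependencies(
    packages = "brms",
    db = db,
    which = c("Depends", "Imports", "LinkingTo"),
    reverse = TRUE
  )
  sort(deps[["brms"]])
}

# Usage (requires a configured CRAN mirror):
# brms_dependents()
```

The result still has to be filtered by hand, since many reverse dependencies only post-process brms fits rather than define new model interfaces.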

@paul.buerkner if you are up for it (though I understand if you don’t want that), it could be nice to have such a list somewhere on the brms website, maybe under a new section of the “Other packages” tab?

R-universe also lists dependents that are not on CRAN; see, e.g., the R-universe search “needs brms”.

Cool, thanks, I was just starting to make a list! I did not know about R-universe. Does this automatically index packages on GitHub, or does someone have to add their package manually to R-universe for it to appear in this “needs” search?

This is as comprehensive a list as I could make now.

Sources

Selection criteria

  • Package provides an interface to brms::brm() to fit custom models (rather than provide other functionality for brms models, or using primarily other functions from brms, such as distributions)
  • not project specific (e.g. not a package for a specific paper or project; should allow for general use)

List

Comments

Some of these are more flexible than others. Some use very similar approaches, defining new analogs to brmsformula, brmsfit, get_prior, make_stancode, etc. Some are thin wrappers around typical models but for specific use cases; others define custom families and “hack” brms to fit them.

I plan to make a more detailed comparison and analysis (and post it separately, so as not to dilute jsocolar’s thread), and see if anything useful (for example, suggestions for brms to make the process easier) comes out of this. A lot of these packages have just come out in the last year or are under development. It seems like a trendy idea and speaks to how amazingly useful and flexible brms is. Maybe we can think of a broader framework for doing this instead of everyone having to reinvent the process (I was completely unaware of all of these, but I’m struck by how similarly we do some things).

Automatically scraped; see rOpenSci | How r-universe searches for packages on CRAN / Bioconductor. In April 2023:

  • 10.805 CRAN/Bioc packages found at the Git url mentioned in the DESCRIPTION file (yay, you rule!)
  • 1.983 packages found under the maintainer’s personal Github account
  • 4.613 packages ingested from the CRAN/Bioc mirror in the maintainer’s universe

That’s an amazing list! Thank you!

Having that under “other packages” would be a bit much and too overwhelming for the reader (and the screen). Also, our policy there is to list only packages directly from the Stan team that we explicitly endorse.

One idea that comes to mind is to have a list on my personal website similar to the list of brms-related blog posts: Paul Bürkner - A list of blog posts about brms

This page could then be directly linked to in the brms readme and would thus appear on the landing page of the brms website too.

What do you think of this idea?

@Ven_Popov I’ve finally had a closer look at bmm and it looks amazing! Across these various packages that extend brms, it seems like their goals generally include one or more of:

  • Adding a specialized custom family
  • Providing a UI that translates between “familiar” notation as seen in the subject-matter literature and valid brms formulas
  • Providing developer tools to lubricate the above (bmm only)
  • Providing specialized post-processing functionality specific to model types.

I’m kind of blown away by how useful bmm looks to be for the first three bullets, and I wonder if you would have any interest in providing the developer tools portion of the package as a stand-alone framework that people can clone and use to build their own packages.

Unfortunately, in flocker we couldn’t come up with the sort of solution that could fit inside the bmm framework. There are at least two issues that prevent this, both arising from the complicated indexing problem that sits at the heart of our data spec. One issue is that we need to generate the custom families on-the-fly at runtime, because the Stan snippet for the family depends on the number of repeat visits in the dataset. The other issue is that we need to reimplement most of the brms post-processing functionality in order that it play nicely with our format, since predictions, log-likelihood calculations, etc depend on the dependency structure encoded by the indexing. Still, bmm is an inspiration for opportunities to streamline and modularize the flocker codebase.
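
To give a flavor of the first issue, here is a rough sketch of building a family specification whose argument list grows with the number of repeat visits. This is not flocker’s actual code: the helper `make_occ_family_spec`, the family name `occupancy_sketch`, and the shape of the generated Stan signature are all invented for illustration, and the Stan function body is left out. Only `brms::custom_family()` and its `vars`/`loop` arguments are real brms API:

```r
# Hypothetical helper: assemble the pieces of an unlooped custom family
# whose extra integer-vector arguments depend on the number of visits.
make_occ_family_spec <- function(max_visits) {
  stopifnot(is.numeric(max_visits), length(max_visits) == 1, max_visits >= 1)
  # One vint slot per visit; the names follow brms's vint1, vint2, ... scheme.
  index_args <- paste0("vint", seq_len(max_visits))
  # Signature only; a real family would also need the function body.
  stan_signature <- paste0(
    "real occupancy_sketch_lpmf(array[] int y, vector mu, ",
    paste0("array[] int ", index_args, collapse = ", "),
    ")"
  )
  list(name = "occupancy_sketch", vars = index_args,
       stan_signature = stan_signature)
}

spec <- make_occ_family_spec(max_visits = 4)

# If brms is installed, the spec can be turned into a family object.
if (requireNamespace("brms", quietly = TRUE)) {
  fam <- brms::custom_family(
    spec$name, dpars = "mu", links = "identity",
    type = "int", vars = spec$vars, loop = FALSE
  )
}
```

Because `vars` and the Stan signature are only known once the data are in hand, the family cannot be shipped as a fixed object or a static Stan file.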

@paul.buerkner That sounds great!

@jsocolar that’s great feedback to hear! We are hoping to make bmm as general and user-friendly as possible, so we are indeed aiming for a very modular design in order to allow a lot of model flexibility.

I agree with how you’ve characterized the goals of such packages. I’m planning to write a bigger review of these packages, with the aim of proposing a general framework for developing interfaces for custom models to brms.

As part of that, I have been thinking exactly in the direction you suggested - abstracting the tools in the current bmm package into a “developers package”. But I want to think more carefully about how to do this in the best way.

That said, we are also hoping that people can contribute models directly in bmm. Our goal for this package is to provide a domain-general interface for translating complex models into objects that can be fit with brms. We have written a guide for how to do that - BMM Developer Notes. The package started much more modestly with a goal to implement a few cognitive models, but we think the framework as it is now (and possibly with some additions), can be applied much more broadly.

I will have to look more closely at your code, but in principle I think all of that should be doable with some extensions to bmm, or a bmm-like developer framework:

  • “generate the custom families on-the-fly at runtime”: We currently don’t have that, but I think it should be easy to add. We currently have an S3 method called “configure_model” that takes in the model object, the user-provided formula, and the data, and generates valid brms formulas and custom families for each model depending on the class attribute of the model. Our custom family models load the Stan code from files, but you could specify any function within the method to generate the code. The only requirement is that the method returns a list with formula, data, family, prior, and stanvars to pass to brms. How you generate those within the method is up to you.
  • “we need to reimplement most of the brms post-processing”: we are starting to do that as well. There are currently two ways to do that in bmm, depending on the stage:

Before returning the fit object to the user

  • an S3 method postprocess_brm() is called in the final step before returning the fit object to the user, and applies postprocessing of the fit object. This is again a very general method, and each model can specify its own method. This is useful for applying any transformations or adjustments to the information saved in a brmsfit object

Functions applied to the returned fit object

  • the default postprocess_brm() method attaches class bmmfit to the brmsfit object. This allows us to define any additional postprocessing methods. For example, we are currently working on a custom summary.bmmfit() method, which will be applied whenever users call summary() on the fit object. We want this because we want to hide some of the ugly internal components of the formulas we construct to translate our models into functioning brm formulas
  • the returned bmmfit object also stores the original model class in an argument bmm_model, so we can define variations of all postprocessing methods such as summary, posterior_predict, etc. for individual models, with the end result that users just call the generic method as they would with brms, and bmm automatically applies the correct method depending on the model used
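
The dispatch pattern described in the bullets above can be sketched in plain R. The generic and class names (`configure_model`, `postprocess_brm`, `bmmfit`, `bmm_model`) are the ones mentioned above, but the signatures, the `toy_model` class, the placeholder family, and all function bodies are assumptions made for this illustration, not bmm’s actual implementation:

```r
# Generic that each model class specialises (signature is assumed).
configure_model <- function(model, data, formula) UseMethod("configure_model")

# Toy method: returns the list of pieces that would be passed on to brms.
configure_model.toy_model <- function(model, data, formula) {
  list(formula = formula, data = data,
       family = "toy_family_placeholder",  # stand-in for a custom family
       prior = NULL, stanvars = NULL)
}

# Called as the last step before the fit is returned to the user.
postprocess_brm <- function(model, fit) UseMethod("postprocess_brm")

# Default method: tag the fit with an extra class and remember the model.
postprocess_brm.default <- function(model, fit) {
  class(fit) <- c("bmmfit", class(fit))
  fit$bmm_model <- class(model)[1]  # stored as a list element in this sketch
  fit
}

# Because "bmmfit" comes first in the class vector, summary() lands here.
summary.bmmfit <- function(object, ...) {
  list(model = object$bmm_model, note = "internal formula terms hidden")
}

# Usage with mock objects standing in for a real model and brmsfit.
model <- structure(list(), class = "toy_model")
mock_fit <- structure(list(), class = "brmsfit")
out <- postprocess_brm(model, mock_fit)
summary(out)$model
```

Putting the new class in front of `brmsfit` keeps every existing brms method available as a fallback while letting individual methods be overridden one at a time.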

I’m happy to chat more about this if you are interested, either here or somewhere else! I’m not suggesting that you abandon your awesome work on Flocker and reimplement everything in bmm, but I’d be really curious to see what it would take to implement models like yours, whether what we have is already sufficient, or what would need to be added to support a broader array of models.

Once we get to a CRAN release, we are planning to write an article and a more detailed guide on the purpose of the package, its scope, a developers guide, etc.
