Teaching PhD Students About the Stan Community

fergusjchadwick · July 30, 2024, 10:06am

Hi All,

I am teaching an intro to Bayesian statistics class aimed at first-year maths and stats PhD students (in the UK). I am keen for the students to engage with the Stan support network, both as users and potential contributors, and I would like input on how to do this in a way that maximises learning without generating a lot of extra work for the community.

For example, I think key “soft skills” for the students to learn are to correctly form Discourse help requests, GitHub issues, and maybe pull requests (probably for documentation given their level). I can get them to do these things artificially (e.g. they can just post these things on the internal class pages) but maybe some of these things would be useful to do “live”, both for the students and the community?

Do we have community guidelines on how we should approach this type of thing? Maybe we could add a special tag to questions posed as part of classwork.

Bob_Carpenter · July 30, 2024, 6:32pm

Hi, @fergusjchadwick, and thanks for asking. I think it’s great that you’re teaching these things. I don’t think there’s anything special you need to do. I’d rather just treat your students like the rest of our users (or developers, if they get ambitious). We try to be friendly, but of course, if you dump 100 PRs on us, we’re not going to be able to review them all in a timely manner for your class.

Before picking up issues to solve in GitHub, you might want to ask if they’re still relevant. We do try to keep the issues up to date, but it’s a struggle given the scale.

The other thing we welcome is case studies. I looked you up and see you’re doing ecology models at Glasgow (I should be in Edinburgh, where I was a Ph.D. student, in early September, before StanCon in Oxford). I love ecology models and would be glad to help people formulate them. One thing we’ve had success with in the past is going through a book and translating its examples to Stan. I think we already did that with Marc Kéry’s population ecology book in BUGS, but it could probably use an update and some wrappers for users.

fergusjchadwick · August 9, 2024, 11:27am

Hi Bob,

Thanks for the quick reply and helpful input!

Great suggestion re case studies. The final assessment for the course is going to be to adapt and extend a case study (last time I did it we used the HMM one) and I was going to encourage the authors of the better submissions to get in touch with the original authors about submitting an update. In general, I get them to add some prior predictive checks and update the syntax where relevant (or use newer functions if they’ve been superseded).

With regard to statistical ecology specifically, I’m keen to get someone to look at implementing HMSC (from the Statistical Ecology group at the University of Helsinki) in Stan (or to see if it’s possible and identify some of the fitting issues that might be hidden in the current implementation).

Hope to maybe bump into you while you’re in Scotland! I’m lecturing in St Andrews now so even more statistical ecologists to talk to. And if not, will definitely see you at StanCon!

Bob_Carpenter · August 13, 2024, 7:19pm

Oops—I missed this response when you first posted.

It was a bit of a challenge to find their model specification. It’s in the first supplement to the paper linked from the website:

https://besjournals.onlinelibrary.wiley.com/action/downloadSupplement?doi=10.1111%2F2041-210X.13345&file=mee313345-sup-0001-AppendixS1.pdf

I’m confused by the presentation because rather than just writing down the density, they try to write down all the conditionals for Gibbs. Looks like they labored quite hard to get conjugacy, which we don’t need in Stan. Is there a clean description of just the model somewhere?

For efficient implementation, all the Kronecker products would need to be unpacked so we never had to explicitly compute them.
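
For concreteness, here is a sketch of the standard identities that make this kind of unpacking possible. The symbols are assumptions for illustration and are not taken from the HMSC supplement: $A$ is $p \times p$, $B$ is $n \times n$, $X$ is $n \times p$, and $\mathrm{vec}$ stacks columns.

$$
\begin{aligned}
(A \otimes B)\,\mathrm{vec}(X) &= \mathrm{vec}(B X A^{\top}), \\
(A \otimes B)^{-1} &= A^{-1} \otimes B^{-1}, \\
\log \lvert A \otimes B \rvert &= n \log \lvert A \rvert + p \log \lvert B \rvert.
\end{aligned}
$$

The left-hand sides involve an $np \times np$ matrix, whereas the right-hand sides only ever touch the $p \times p$ and $n \times n$ factors.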

So this looks doable, but it’d be a lot of work.

P.S. The authors are a bit confused about JAGS, which does exactly what they’re suggesting for GLMs:

However, unlike a standard Gibbs scheme such as those implemented in generic MCMC software (e.g. JAGS or BUGS), our sampler updates simultaneously not only scalar parameters, but entire vectors and matrices of parameters

fergusjchadwick · August 14, 2024, 11:17am

Hi @Bob_Carpenter,

Thanks for following up. I think this is probably the cleanest presentation of just the model: https://doi.org/10.1111/ele.12757.

I agree re the amount of work, but I hope it would pay off given how popular the package implementation has become.

amynang · August 15, 2024, 8:19am

They also have a book which may or may not offer additional information on specification. Perhaps you even have institutional access?

fergusjchadwick · August 15, 2024, 8:22am

Hi @amynang,

Thanks! I have the book but it obviously spreads the description over a decent chunk of text while they’re explaining the different parts of it. That paper is the most concise version.

nicholasjclark · November 13, 2024, 7:07pm

The mvgam 📦 can now handle much of this functionality. There is a detailed description of how it works in the jsdgam help page. It doesn’t currently use a multiplicative Gamma for the factor loadings, but the types of effects you can use are far more flexible. For example, you could ask how species’ nonlinear responses are related to their phylogenetic or functional relationships, or you could allow the factors to evolve as reduced rank MRFs to handle areal spatial data. I’m also planning to implement Sarah Heaps’s functionality to allow the prior for the loadings to be informed by traits and / or phylogeny. Very happy to take suggestions for improvement of course.

Bob_Carpenter · November 13, 2024, 8:35pm

I just saw this now with the latest follow-up. That’s cleaner than most stats papers’ model presentation, but it’s still very narrative and has lots of callouts for details to other papers. For example,

To facilitate the estimation of such matrices, we use a latent variable approach, which allows a parameter-sparse representation of the matrix X through latent factors and their loadings (for mathematical details see Warton et al. 2015b; Ovaskainen et al. 2016a,b).

I really wonder why it’s such a challenge for statisticians to write down their whole model in a paper. We tried to do that with Pathfinder and the referees got angry at us for telling people things they already knew; the details were not well known in that the only place Lu could find them was in 30-year-old Fortran source code.

To evaluate a multivariate normal with a Kronecker-structured covariance, i.e., $y \sim \textrm{normal}(\mu, \Sigma \otimes \Omega)$, you want to think of it as a matrix normal distribution and evaluate the density without ever forming the Kronecker product. See, for example, the Wikipedia article on the matrix normal distribution.
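
A minimal sketch of that idea follows. It is an illustration, not code from HMSC, mvgam, or Stan. It assumes $y$ is the column-stacked vec of an $n \times p$ matrix, that $\Sigma$ plays the role of the $p \times p$ column covariance (`v` below), and that $\Omega$ plays the role of the $n \times n$ row covariance (`u` below); the post does not pin down that convention. All function names are hypothetical, and the helpers are written in plain standard-library Python only so the snippet runs anywhere.

```python
import math

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def forward_solve(low, b):
    """Solve low z = b for lower-triangular low and vector b."""
    z = []
    for i, row in enumerate(low):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z

def log_det(low):
    """log|a| computed from the Cholesky factor of a."""
    return 2.0 * sum(math.log(low[i][i]) for i in range(len(low)))

def matrix_normal_lpdf(x, m, u, v):
    """Log density of an n x p matrix x with mean m, row covariance u (n x n)
    and column covariance v (p x p). Only u and v are factorised."""
    n, p = len(x), len(x[0])
    lu, lv = cholesky(u), cholesky(v)
    r = [[x[i][j] - m[i][j] for j in range(p)] for i in range(n)]
    # a = L_u^{-1} r, solved one column at a time (cols[j][i] holds a[i][j]).
    cols = [forward_solve(lu, [r[i][j] for i in range(n)]) for j in range(p)]
    # z = a L_v^{-T}, solved one row at a time.
    rows = [forward_solve(lv, [cols[j][i] for j in range(p)]) for i in range(n)]
    quad = sum(z * z for row in rows for z in row)
    return -0.5 * (n * p * math.log(2 * math.pi)
                   + n * log_det(lv) + p * log_det(lu) + quad)

# Reference implementation that does build the np x np covariance explicitly.
def kron(a, b):
    return [[aij * bkl for aij in arow for bkl in brow] for arow in a for brow in b]

def vec(a):
    """Stack the columns of a into one list."""
    return [a[i][j] for j in range(len(a[0])) for i in range(len(a))]

def mvn_lpdf(y, mu, cov):
    low = cholesky(cov)
    z = forward_solve(low, [yi - mi for yi, mi in zip(y, mu)])
    return -0.5 * (len(y) * math.log(2 * math.pi) + log_det(low)
                   + sum(zi * zi for zi in z))

# Made-up example values, with n = 2 and p = 3.
x = [[0.1, -1.2, 0.7], [2.0, 0.3, -0.5]]
m = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
u = [[2.0, 0.5], [0.5, 1.0]]
v = [[1.0, 0.3, 0.0], [0.3, 2.0, 0.4], [0.0, 0.4, 1.5]]

fast = matrix_normal_lpdf(x, m, u, v)
slow = mvn_lpdf(vec(x), vec(m), kron(v, u))
print(fast, slow)  # the two values should agree up to rounding error
```

The explicit version factorises an $np \times np$ matrix, while the matrix normal version only factorises the $n \times n$ and $p \times p$ pieces, which is where the savings come from as $n$ and $p$ grow.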
