Need advice on launching a Stan user group in Korea

Hello,

My name is Angie Moon, and I was a StanCon Helsinki attendee. After the conference, motivated by Stan’s ideals and energy, I decided to build a Stan user group in my country (South Korea). Along the way, I have found that I need advice on two main points: recommended reference materials and feedback on Stan Korea’s future plans. Our activities so far can be divided into two parts, as explained below:

1. Stan user groups in Korea: Foundation & Events and Talks to attract users

  • Founded Stan Korea: Created Stan Korea (a Facebook group) with two partners. I am in charge, Kyungmin Kwon will create content, and Byungkwon Lee will help manage the group.

  • STEM Tech Square: STEM is an official Honor Society at the Seoul National University (SNU) Engineering School, and Tech Square is its annual event where STEM members examine and discuss a specific topic. This year, at my suggestion, we chose “Bayesian Inference using Stan” as our topic, and around 20 people have shown interest so far. I would like to attract STEM members as the initial members of Stan Korea, especially since they come from a wide range of majors and are highly competent students (additional information on STEM is listed at the end).

  • GLEAP X STEM Conference: GLEAP members are also great candidates for initial Stan Korea members. GLEAP is the SNU Natural Science Honor Society. Each year, GLEAP and STEM jointly hold a conference for academic exchange. This year the conference takes place on November 24th and I am planning to give two talks. My first talk is titled “Generative Model for Inverse Molecular Design” and its main reference is the article “Inverse molecular design using machine learning: Generative models for matter engineering”, which was published on 27 July 2018 in Science. The main focus of the article is deep generative models, but I am planning to put more weight on Bayesian sampling and its application as an example of a generative model. My second talk is titled “Demand Time Series Forecasting with Bayesian Inference and Stan”, which is based on my experience in demand forecasting over the last two years.

[needed advice]

Reference materials for Tech Square and conference talks: slides, books, video lectures, and articles.

I have read through Jonah’s reading list*, the Stan manual, and previous Stan slides made by core developers, but I need help prioritizing them (*https://github.com/jgabry/stancon2018helsinki_intro/blob/master/reading-list.md).

2. Creating Bayesian Educational Content in Korean: Translation & Video lectures

  • Stan material translation: So far we have translated “A probabilistic programming language” and “Stan: a probabilistic programming language for Bayesian Inference and Optimization”. We would like to introduce hierarchical models, and are planning to translate “Multilevel (Hierarchical) Modeling: What It Can and Cannot Do” as well.

  • Video lectures on Bayesian Inference [Theory] and [Application] (planning): We will use YouTube, and each lecture will be about 20 minutes long. Our reference materials are as follows.
    Theory: Basics of Bayesian inference and Stan, Hierarchical models, Model Assessment and Selection (StanCon, Helsinki Tutorial), BDA textbook
    Application: Basics of Bayesian inference and Stan, Productization of Stan (StanCon, Helsinki Tutorial) + Example Models from Stan reference manual 2.17.0

[needed advice]

1. For video lectures, how could [Theory] and [Application] be blended well in one playlist? Would it be better to have two separate playlists?

I think statisticians would be more interested in [Theory] and users from other fields would be more interested in [Application], where they learn how to apply Bayesian Inference to their fields. Mathematical explanations would be more emphasized for [Theory] lectures whereas coding and case studies would be the main part for [Application].

2. Slides, books, lectures, and articles that we could base our video lectures on.

I have watched Ben Goodrich’s lecture on ‘Bayesian Statistics for the Social Sciences’, McElreath’s lecture on ‘Statistical Rethinking’, and Ben Lambert’s lecture on ‘A Student’s Guide to Bayesian Statistics’. Are there any further materials you would like to recommend?

Thank you!

And special thanks to Daniel Lee, who gave me the courage to start posting on the Stan forum.


STEM (additional information)

Hi, Angie. Glad you made it back home—it looked tight when you were heading to the airport.

I missed this before or would’ve responded sooner.

This plan sounds great. Congratulations on gathering an initial group of 20 people interested in studying Stan!

Some comments below.

  • most user groups like this are regional because people need to travel to the events

  • try to diversify beyond the initial university group as soon as possible and try to avoid the appearance of cliquishness

By “deep generative models”, do you mean something like neural networks? They’re not usually considered fully “generative” in the machine learning sense because they don’t usually model their inputs (they’re like other regressions in this sense). You can make them generative and if you assume the data model’s independent, then their model factors out (there’s a discussion of this in Gelman et al.'s Bayesian Data Analysis). The case study I did on the Lotka-Volterra model was meant to introduce general inverse modeling to a scientific audience.
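To sketch that factoring (the symbols here are illustrative and not from the thread): write $x$ for the inputs, $y$ for the outputs, $\theta$ for the regression parameters, and $\phi$ for the parameters of a model for the inputs. Assuming $y$ depends on $\phi$ only through $x$, that $x$ does not depend on $\theta$, and that $\theta$ and $\phi$ are independent a priori, the joint posterior splits into two pieces:

$$
p(\theta, \phi \mid x, y) \;\propto\; \bigl[\, p(y \mid x, \theta)\, p(\theta) \,\bigr] \, \bigl[\, p(x \mid \phi)\, p(\phi) \,\bigr]
$$

Under those assumptions the posterior for $\theta$ is the same whether or not $p(x \mid \phi)$ is modeled, which is why a regression can usually leave its inputs unmodeled.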

For reference materials, you might want to check out the new users’ manual. Andrew Gelman added about 100 pages of introductory material. It hasn’t been released yet, but you can build it from the stan-book repo or we can send you a built copy. Andrew has an older version on his web site, but I can’t find it now (maybe he removed it).

We have a lot of introductory material under the workshops directory of the web site, which isn’t linked from anywhere, but it’s mostly at the slide level.

What else to recommend depends on the level of the users and what they want to learn. Ben Goodrich’s classes and Richard McElreath’s classes are available online. We recommend McElreath’s book as an intro to Bayesian stats. Jonah’s intros from StanCon were also recorded. Mike Lawrence also did some nice intro videos. My (and I’m guessing @betanalpha’s) dream would be a K-drama style intro to Stan. However you do them, we’d be happy to publish links.

For blending theory and application, we like to alternate lectures and hands-on lab sessions if we have more than an hour or two. For intensive courses, we tend to do 3 hours in the morning and 3 in the afternoon combining theory, applications, and hands-on exercises. You can’t learn this from just the theory and it’s hard to do the exercises without some theory.

If you want to call someone out on the forum, it’s best to do it with their handle—for Daniel Lee, that’s @syclik. Then he’ll get a message—otherwise, none of us can keep up with the overall volume of the forums anymore.

And please write back for clarification if you want.

I would be very happy to watch a Korean drama where a poorly-fitting Stan program induces a love triangle emphasized through lots of freeze frames and bumpy music. Unfortunately there are higher priorities to foster a strong local community! Keep in mind that these are my opinions, based on my experience, and they clash with many others here.

The exact priorities will depend slightly on the initial audience that you want to target. For example, are you interested in recruiting active Stan users? Are you interested in recruiting potential Stan users with statistics backgrounds? Are you interested in recruiting potential Stan users with applied backgrounds? From your post it sounds like the latter so I’ll continue with that assumption.

Building statistical analyses that respect and exploit the domain expertise in applied problems is hard. It requires significant statistics training and time to dedicate to each analysis. The rewards are powerful, robust inferences and decisions, but those rewards are often so far in the future that people’s more immediate incentives lead them to other avenues, such as machine learning.

In order to build a sustainable community you have to reach out to those who appreciate this difficulty and are willing to put in the necessary effort. Typically these are people who have already been burned by the fragility of popular approaches that promise less work, and are desperate for better methodologies and tools. These people are scattered across applied fields, both in academia and industry, and hence hard to find!

Consequently, trying to reach out to the masses, especially by sugar-coating the realistic effort required in statistical modeling, is pretty ineffective and typically attracts people who quickly get frustrated with the amount of necessary work and leave. That constant churn makes it difficult to sustain the early community.

Instead I recommend making it easy for the right people to find you and then being ready for when they do.

For the former you’ll want an easily searched webpage where you collect all of your resources and help guide interested people through them. Complementary social media accounts are especially useful for broadcasting the availability of these resources and events. Even if they have a small audience initially, that audience can grow steadily over time and allow you to reach a much larger audience than you would have otherwise. If you host events then try to make the schedule as consistent as possible and, as Bob said, as open to the general public as possible. Reduce the barrier to entry into your community without making any unrealistic promises!

As for the resources themselves, I strongly recommend providing as comprehensive a coverage as possible. This might include, for example, links to good introductions to calculus (especially intuitive discussions of differentiation and integration) and linear algebra, which are important prerequisites for any discussion of probability and statistics. From there you can build up resources covering probability theory, Bayesian inference, Bayesian computation, model building, and then modeling techniques.

One of the challenges in building up these resources is that many of the available references use difficult vocabulary or cover only parts of the necessary material. There is no great reference that covers everything, and even combining multiple textbooks can leave some holes. My best attempts at filling these holes are available at https://betanalpha.github.io/writing/, but inevitably you’ll find more holes, so you will want to take advantage of the more global Stan community to ask questions and potentially develop your own material.

I know this all sounds daunting but, to abuse a cliche, building a community is not a destination but rather a journey. It helps to start small, with reading or study groups or simply open discussions, to gauge the interests of the initial community. You can then focus your initial efforts on translating or developing resources for those interests and slowly expand as your community grows.

Good luck!

The thing to keep in mind is that you don’t need to get there all at once.

My (and I’m guessing @betanalpha’s) dream would be a K-drama style intro to Stan. However you do them, we’d be happy to publish links.

Though it has no Korean drama intro, Stan Korea has recorded its first official video: Introduction to Bayesian Statistics!
