Additions/improvements to Stan's website that would make Stan more easily accessible and help with its Googling

It seemed like the topic “Addressing Stan speed claims in general” was starting to move in this direction, so I figured I would create a new thread to consolidate discussion regarding additions and modifications to Stan’s website.

I’m not a Stan contributor, but as someone who relies on Stan for research, I have an interest in the future maintenance and growth of Stan’s community and development. Paraphrasing and drawing from the prior thread about addressing unfair benchmarks to bolster Stan’s reputation and Stan’ding, I share sentiments with some other users who feel that the Stan website could probably benefit from a reorganization to help with attracting and onboarding potential users.

Based on some conversations I’ve had with ecologists and biologists at my institution (UC Irvine), it seems like one thing that convinces them to roll their own Metropolis-Hastings MCMC algorithms, to their detriment, is a barrier to getting started in Stan in terms of website organization and Google-ability (in addition to lack of familiarity with statistics). There are tons of example MCMC codes that require little understanding of Bayesian statistics to get started. Some of them give up quickly on Stan when they don’t see @Bob_Carpenter’s case studies right away and pull up few example codes in Google. I myself initially resorted to using a self-coded rudimentary MCMC algorithm back in 2015 when I was tasked with working on a model comparison example in computational neuroscience and eventually felt too bad about repeatedly bothering the Google Group with questions since I didn’t find enough code examples. I eventually lucked out in grad school by taking a course that taught Stan, which helped me understand Stan’s value and introduced me to PPL usage.

This Stan Discourse has certainly improved the ability to pull up previously asked questions and the website has definitely improved. However, the barrier to entry still looks like it could be smoothed a little. I’ve had to send links of code examples to folks on Reddit that they weren’t otherwise able to pull up in Google when they got stuck in Stan. Searching for “Stan getting started” or “Stan tutorial” pulls up a lot of links on the first search page that aren’t from the MC Stan website. The result from the MC Stan page that does come up is not a guide itself, but a collection of links to other guides, some of which are not aimed at a more general, non-field-specific audience.

Once someone has started using Stan and has gotten into more of the specifics, they then encounter guides, details, and case studies being spread across different pages in a manner that does not flow. For example, information about the loo package is not immediately linked to on the Stan documentation page. After some Googling about loo, we find this page which links to this more detailed page. The former page is redundant, and perhaps the second page should have been directly linked to on the Documentation page. Additional helpful information does not appear on any pages on Stan’s website, but is found on @avehtari’s website. loo aside, we see that information is frequently spread across multiple places, which increases the burden for maintenance and makes it easier for pages to go out of date. Some information is on the Stan website, while other information is located on the webpages of Stan developers and contributors.

In summary, below are some things that I think would benefit the Google-ability and ease-of-use of Stan’s website and make it a more centralized experience, particularly for new users:

  • A general tutorial that is not just a link to other tutorials, located perhaps in a new “Getting Started” tab of the website navbar.
  • Re-done sections for guides and case studies in the “Documentation” page that link to @avehtari and @betanalpha’s guides and writings as an intermediate step toward the eventual creation of more centralized and prominent hubs on the website for modeling case studies and information about loo. Perhaps the guides and case studies should also be on their own page on the navbar, though then it would be important to not have too many buttons for aesthetic reasons. As is though, I think a couple more buttons would be okay.
  • A Stan blog that is separate from the Gelblog, to further help with the Googling and highlighting of updates and new Stan-related research. That would have been a good place to highlight @bbbales2 and @charlesm93’s optimization of the COVID study code (that would likely be a good candidate for a case study write-up, too).

Anyhow, those are my potentially worthless two cents. I hope that they overlap with some other folks’ sentiments and can be of some use to the Stan leadership. Are there other website updates and improvements that people would like to see?

EDIT: I’m an idiot. A thread encompassing this already exists: Website Redesign Suggestions

I would argue that the issues raised here actually encompass more than website redesign!

The problem with a centralized on-boarding process is that currently the community is actually not centralized in pedagogical perspectives, best modeling practices, and best Stan practices, let alone the best way to introduce new users to statistical modeling, statistical computation, Bayesian inference, and the like. For example, I personally no longer spend any time trying to argue with people who believe that rolling their own statistical algorithms is easier than using an established tool or want to learn how to use Stan in a weekend without any background in probability theory or modeling; instead I focus on those who have tried, paid enough attention to see that it’s failed, and are ready to invest the time and effort to learn how to do it properly. This, however, is very much not a universal perspective and the range of perspectives in the case studies demonstrates this.

That introduction is made even harder with the wicked diversity of experience in potential users. Without first understanding from where someone is coming it’s nearly impossible to design an appropriate on-boarding process. That’s why every “introduction to Stan” blog that you read looks so different – the authors are all writing the introduction optimized for their particular background when they started.

Ultimately I believe that the best approach is having a diversity of pedagogical materials at various levels, assuming various backgrounds, from various authors that are consistent in the concepts they use so that a potential user can move from one to the next without getting confused. This, however, is incredibly challenging without a centralized editing process to define and maintain that consistency.

The process for adding case studies to the website is very much not that process. Early on, when the project was largely contained to one institution, the process for adding material was not particularly formal – anyone could add anything they wanted. But as the project has expanded, and perspectives within the project have diverged, this procedure no longer functions particularly well, especially given that many in the community have the impression that anything posted on the website carries a formal “approved by Stan” designation, for whatever “Stan” means.

Anyways, the challenge here is much deeper than design. It requires a careful definition of what documentation is “official” versus what documentation is simply “curated” so that users can make reasonably informed choices about what materials to read. Just my two cents.

Yes, thanks for pointing that out. I definitely agree with this.

And also, yep, it makes sense that I was only able to get onboarded with Stan through a course that was also able to teach me basic probability theory. All very pertinent points.

@betanalpha, do you think something like a decision flowchart would help for directing people to certain official/curated materials? Like starting with “Do you have cursory knowledge of probability theory? (Yes/No) —> Do you have knowledge of R? (Yes/No)…” And some of the “No” answers would refer people to other sources covering those basics. The Intro to Probability, for example, probably should not happen on the Stan website.

It has been considered in the past, but one of the big challenges is that most people don’t actually know what they don’t know, especially when their classes, advisors, etc. are telling them that they actually do know (let alone any incentives/pressures they might be under to get an analysis done quickly and with as little prerequisite learning as possible).

A more productive option might be to have a rundown of necessary conceptual comprehensions within each topic, but that will still be limited by how well those concepts can be communicated. Still, might be worth a try.

Thanks for these suggestions. I’m tagging @imadmali who’s been keeping track of the website feedback for the SGB.

Thanks for the feedback.

The web site’s definitely a mess. It went through a consolidation phase that reduced the number of tabs but made each page much deeper. There are also cross-cutting pages that got thrown up by packages like rstan.

The other big issue here is that “A” needs to be “Several”, because we have multiple interfaces. The same intro tutorial we provide for R isn’t going to work in Python or Julia, for example.

The problem we’ve had with anything that requires work is that nobody wants to do it. We had an issue of a newsletter that fell by the wayside, too.

It’s not really clear who this is any more. The current SGB isn’t getting involved in technical governance and there’s no technical leadership remaining after I and then Sean resigned the post of technical working group director.

It’s true that there’s no official technical leadership (although the SGB did put out a proposal for a voting procedure on technical decisions and is working on editing it based on feedback), but I don’t think these website suggestions (like making things easier to find) depend on that. The current SGB is definitely getting involved in website redesign.

And in the technical issue resolution proposal! Thanks. I’m very excited about the technical governance proposal, the first draft of which is already really great.