In my grant-writing explorations I find it useful to see how Stan/Bayesian modelling compares with other scientific software. A big comparison point has been deep learning packages, and I found what I think are surprising results. Fairly brief write-up at:
https://breckbaldwin.github.io/ScientificSoftwareImpactMetrics/DeepLearningAndBayesianSoftware.html
TL;DR We are doing way better than you might think once you throw out the computer science literature. Comments sought…
Not sure if this is the kind of comments sought, but I have to say I am a bit surprised by the low percentage of Bayesian in Physics and Astronomy, not only because it's a field where models built from first principles with prior information are quite common, but also considering how popular packages like emcee are. As of today, "emcee: the MCMC hammer" has 6039 citations according to Google Scholar versus 3982 for "Stan: a probabilistic programming language", and I expect most of the former to be astronomy papers.
That's great. I did a Scopus search on 'emcee' and got 4,229, with the drop-off at 2012 or so. Scopus is more conservative than Google Scholar.
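For anyone wanting to reproduce a count like that, here is a minimal sketch against Elsevier's Scopus Search API; the query string and API key are placeholders, not necessarily the exact search I ran:

```python
# Sketch: count Scopus records matching a query term. Requires an Elsevier
# API key (https://dev.elsevier.com/); both the key and the query string
# below are placeholders to substitute.
import requests

API_KEY = "YOUR_ELSEVIER_API_KEY"  # placeholder
query = 'ALL("emcee")'             # hypothetical query string

resp = requests.get(
    "https://api.elsevier.com/content/search/scopus",
    headers={"X-ELS-APIKey": API_KEY, "Accept": "application/json"},
    params={"query": query, "count": 1},
)
resp.raise_for_status()

# totalResults is the number of matching records, the analogue of the 4,229 above.
total = resp.json()["search-results"]["opensearch:totalResults"]
print(f"Scopus hits for {query}: {total}")
```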
I'd guess Stan can't compete with emcee; deep learning is at 4,803. Is emcee specialized for astro problems, and is it still active? Also, is it Bayesian? I assume there is MCMC that doesn't care about Bayesian stuff. Had a look; seems plenty Bayesian to me.
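To make the "is it Bayesian" point concrete: emcee samples whatever log-probability function you hand it, so it is Bayesian exactly when that function is a log-posterior. A minimal sketch with a toy Gaussian model (my own illustration, not from any of the counted papers):

```python
# Sketch: emcee samples any log-probability function; it becomes Bayesian
# when that function is log-prior + log-likelihood. Toy example: infer the
# mean of Gaussian data with known sigma = 1.
import numpy as np
import emcee

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=50)  # synthetic observations

def log_prob(theta):
    mu = theta[0]
    log_prior = -0.5 * (mu / 10.0) ** 2         # weak Normal(0, 10) prior
    log_like = -0.5 * np.sum((data - mu) ** 2)  # Gaussian likelihood
    return log_prior + log_like                 # log-posterior up to a constant

nwalkers, ndim = 8, 1
p0 = rng.normal(size=(nwalkers, ndim))          # scatter the initial walkers
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 2000)

samples = sampler.get_chain(discard=500, flat=True)  # drop burn-in, flatten walkers
print("posterior mean of mu:", samples.mean())
```

With a flat prior the same machinery samples the bare likelihood, which is why "MCMC" does not automatically mean "Bayesian".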
Verified that the table values were correct for Physics/Astro.
It is a big bump in raw counts. New graph:
New table row:
| | Bayesian | Deep Learning | totals | RStanArm | Keras | PyMC | RStan | PyTorch | brms | PyStan | emcee | Stan | TensorFlow | detail totals |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Physics and Astronomy | 4329/47% | 4803/53% | 9132 | 8/0% | 1928/19% | 324/3% | 23/0% | 906/9% | 31/0% | 31/0% | 3826/37% | 134/1% | 3062/30% | 10273 |
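Note the row uses two denominators: the Bayesian/Deep Learning percentages are over the totals column (9132), while the per-package percentages are over detail totals (10273). A quick arithmetic check, with the values copied from the row:

```python
# Quick check of the percentages in the table row (values copied verbatim).
bayes, dl = 4329, 4803
detail = {"RStanArm": 8, "Keras": 1928, "PyMC": 324, "RStan": 23,
          "PyTorch": 906, "brms": 31, "PyStan": 31, "emcee": 3826,
          "Stan": 134, "TensorFlow": 3062}

total = bayes + dl                   # 9132, the 'totals' column
detail_total = sum(detail.values())  # 10273, the 'detail totals' column

print(f"Bayesian share:   {bayes / total:.0%}")                        # -> 47%
print(f"emcee share:      {detail['emcee'] / detail_total:.0%}")       # -> 37%
print(f"TensorFlow share: {detail['TensorFlow'] / detail_total:.0%}")  # -> 30%
```

The detail sum exceeds the top-level total; my reading (unverified) is that papers citing more than one package appear once per package in the detail columns.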
This conversation has me thinking about growth rates now; deep learning looks like it is growing faster. The label covered the slope and I didn't think about it. Log plot below:
The cumulative counts are not a good comparison, or should start in 2018: Bayesians have the advantage just for being around longer. This is why I didn't include BUGS/JAGS, which would easily add another 4k.
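For anyone who wants to redo the growth-rate comparison, here is the kind of log plot I mean; the yearly counts below are made-up placeholders, not the real Scopus numbers:

```python
# Sketch: compare growth rates of cumulative citation counts on a log scale.
# The yearly counts are fake placeholders; substitute real per-year counts.
import numpy as np
import matplotlib.pyplot as plt

years = np.arange(2012, 2022)
bayes_per_year = np.array([50, 90, 150, 230, 330, 450, 600, 760, 930, 1100])  # fake
dl_per_year = np.array([5, 10, 30, 80, 200, 480, 900, 1500, 2300, 3300])      # fake

fig, ax = plt.subplots()
ax.semilogy(years, np.cumsum(bayes_per_year), label="Bayesian (cumulative)")
ax.semilogy(years, np.cumsum(dl_per_year), label="Deep learning (cumulative)")
ax.set_xlabel("year")
ax.set_ylabel("cumulative citation count (log scale)")
ax.legend()
plt.show()
```

On a log scale a straight line is a constant growth rate, so the steeper slope is the faster-growing literature regardless of who had the head start.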
That's a very interesting analysis, and nice graphs.
Hey @breckbaldwin, this paper is interesting ([1806.06850] Polynomial Regression As an Alternative to Neural Nets); it reinforces my intuition that neural nets are really good when you plug in convolutions or a transformer architecture (and of course, when your data is so unstructured that you need that kind of thing).
@storopoli, is that paper on the right track? If so, I'll have to start tracking polynomial regression; the package name is polyreg. Here you go for CRAN downloads on the RStudio mirror:
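If you'd rather pull the download numbers yourself, the RStudio mirror logs are exposed through the public cranlogs API; a minimal sketch, with an arbitrary date range:

```python
# Sketch: total CRAN downloads for polyreg from the RStudio mirror logs,
# via the public cranlogs API. The date range is an arbitrary example.
import requests

url = "https://cranlogs.r-pkg.org/downloads/total/2018-01-01:2021-12-31/polyreg"
resp = requests.get(url)
resp.raise_for_status()

# The response is a JSON list with one entry per requested package.
info = resp.json()[0]
print(f"{info['package']}: {info['downloads']} downloads "
      f"({info['start']} to {info['end']})")
```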
No, my apologies. It was a joke. 😅
I was making fun of deep learning and of how it is nonsense to compare deep learning frameworks with Bayesian frameworks. They are different tools for different purposes. It's OK to compare {dplyr} versus {data.table}, but {PyTorch} versus {Stan} is not a good comparison.
Ok, too slow on my end… ;)
B