Stan files for models in PosteriorDB

I want to take a look at the models in PosteriorDB (GitHub: stan-dev/posteriordb, a database with posteriors of interest for Bayesian inference). First, I tried to find a list of all available models. In R:

library("posteriordb")
my_pdb <- pdb_local(path = "~/Code/posteriordb/")
pos <- posterior_names(my_pdb)
head(pos)
# [1] "arK-arK"                         "arma-arma11"                    
# [3] "bball_drive_event_0-hmm_drive_0" "bball_drive_event_1-hmm_drive_1"
# [5] "bones_data-bones_model"          "butterfly-multi_occupancy"  

In the full list, I spotted “prostate-logistic_regression_rhs” and “ovarian-logistic_regression_rhs”, which I’m guessing are models with horseshoe priors. But I couldn’t find the specific Stan files under posteriordb/posterior_database/models/stan on the master branch of stan-dev/posteriordb on GitHub. Am I looking at the wrong repo?

Maybe easier to use posteriordb package functions?

"prostate-logistic_regression_rhs" |>
  posterior(my_pdb) |>
  stan_code_file_path()

Thanks Aki. The command returns the following:

"/var/folders/ty/36ws994x4l33ksszcdwjtrv40000gp/T//RtmpwsryKD/posteriordb_cache/models/stan/logistic_regression_rhs.stan"

But I couldn’t locate the var folder. That said, I did find the stan model locally.

posteriordb/posterior_database/models/stan/logistic_regression_rhs.stan

It must be your operating system confusing you. You don’t actually need to see the path at all: you can assign the file path string to a variable (with <- or ->) and use that variable to refer to the file. Or, if you want to edit the file, just keep piping to copy it into your working directory:

"prostate-logistic_regression_rhs" |>
  posterior(my_pdb) |>
  stan_code_file_path() |>
  file.copy(".")
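
For anyone puzzled by the /var/folders/... location: it looks like R's per-session temporary directory (what tempdir() returns), so it is easiest to work with it through a variable rather than browsing to it. Below is a minimal base-R sketch of that idea. It does not call posteriordb at all: the file it creates is a hypothetical stand-in for whatever stan_code_file_path() returns, the placeholder contents are not the real model, and the directory names are made up for illustration.

```r
# Hypothetical stand-in for the cached file that stan_code_file_path() would return.
cache_dir <- file.path(tempdir(), "posteriordb_cache_demo")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
stan_path <- file.path(cache_dir, "logistic_regression_rhs.stan")

# Placeholder contents only; this is NOT the actual posteriordb model code.
writeLines(c("data {", "  int<lower=0> N;", "}"), stan_path)

# Use the path through the variable, without ever navigating to it by hand.
code_lines <- readLines(stan_path)
cat(code_lines, sep = "\n")

# Copy the file somewhere convenient (a demo directory here; "." would be
# the working directory, as in the piped version above).
dest_dir <- file.path(tempdir(), "work_demo")
dir.create(dest_dir, showWarnings = FALSE)
copied <- file.copy(stan_path, dest_dir, overwrite = TRUE)
```

With the real package, stan_path would simply be the string returned by the pipe above.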

Great

So check this posterior json

And notice the model you should look at is

and data used there is

Happy to get feedback on the user interface and vignette so that we can make it easier to find model code and data without need to look at json.

I’ve found it easier to just grab the model/data pairs I need directly, either on GitHub or after cloning.

One thing I’d recommend is a naming convention where the data for a model has the same prefix as the model. As is, I find it a bit challenging to figure out which data goes with which model.

Good point. We have to be more careful with the prefixes. The reason models and data do not share a prefix is that a single model can be used with data from different sources and vice versa. In posteriordb it is likely that a single data set is used only with variations of one model, so we could use the same prefix for all common model variations and all data sets that are used with those variations. Also, the posterior object in the database knows both the model name and the data name, so there would be a benefit to using the database instead of looking directly at the model and data directories.

The posterior’s name (as listed, e.g., under posteriordb/posterior_database/posteriors on the master branch of stan-dev/posteriordb on GitHub) is just always {dataset}-{model} though, right? I find this pretty unambiguous.
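
If the {dataset}-{model} convention holds, the pairing can be recovered from the posterior names alone. Here is a rough base-R sketch under two assumptions that I have not verified against the whole database: the first hyphen always separates dataset from model, and the Stan file is named after the model (as with logistic_regression_rhs.stan above). The names are typed in by hand here; in practice they would come from posterior_names(my_pdb).

```r
# A few posterior names copied from earlier in this thread.
pos <- c(
  "arK-arK",
  "arma-arma11",
  "bball_drive_event_0-hmm_drive_0",
  "prostate-logistic_regression_rhs",
  "ovarian-logistic_regression_rhs"
)

# Assumption: the FIRST hyphen separates the dataset name from the model name.
parts <- data.frame(
  posterior = pos,
  dataset   = sub("-.*$", "", pos),      # everything before the first hyphen
  model     = sub("^[^-]*-", "", pos),   # everything after the first hyphen
  stringsAsFactors = FALSE
)

# Assumption: the Stan file is named after the model, relative to the repo root.
parts$stan_file <- file.path(
  "posterior_database", "models", "stan", paste0(parts$model, ".stan")
)

print(parts)

# Which posteriors share the regularized-horseshoe-looking model?
rhs_posteriors <- parts$posterior[parts$model == "logistic_regression_rhs"]
```

This only shows which names go together; it says nothing about whether the files exist, so the posterior object in the database remains the more reliable source.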

I think even something like this would be a nice addition to posteriordb.

Quick example made with ChatGPT (I think reference posteriors is still broken)

edit. Yes I think the color theme is horrific


Ping @mans_magnusson