Fake data for prior predictive checks

kendyteng · April 9, 2019, 8:27am

Hi all,

I am fairly new to Bayesian statistics and Stan, but I have learned so much already and it’s really exciting for me!

I have started generating fake data to check my priors. However, one question I keep running into is: how “fake” should my fake data be? I am currently building a Poisson model with an offset and two numeric predictors. It seems easier to assess the priors when I use the offset from the real data together with fake values for the predictors than when I fake all of them. However, I guess it may be bad practice not to use fake values for everything?

Can anyone help me with this? Thanks in advance!

torkar · April 9, 2019, 8:49am

Hi,

I recently created a small, deliberately limited example for my students, which you can find here. I guess the approach generalizes to your case. (It is heavily inspired by the book written by @richard_mcelreath, but any faults in the example are mine, of course…)

The source code for the example can be found here:

github.com

torkar/BDA_in_ESE/blob/master/toy_example.Rmd

---
title: "Example from Section 2 in the book Contemporary Empirical Methods in Software Engineering"
knit: (function(input_file, encoding) {
  out_dir <- 'docs';
  rmarkdown::render(input_file,
 encoding=encoding,
 output_file=file.path(dirname(input_file), out_dir, 'index.html'))})
author: "Richard Torkar, Carlo A. Furia, and Robert Feldt"
date: "4/7/2019"
output: html_document
bibliography: ./refs.bib
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Prior predictive analysis
We will make use of a number of `R` packages. Both `rethinking` and `brms` allow us to develop models which we then run using `Stan`. In this example we'll see both being used. Make sure that you have the following installed (including <http://www.mc-stan.org>):
```{r packages, message=FALSE}
# (truncated)
```

This file has been truncated.

I usually think of prior predictive analysis as simply checking that we do not allow too many absurd values (extreme is still ok, absurd is not).
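
As a minimal sketch of such a check for the kind of model described in the question (a Poisson regression with an offset and two numeric predictors), one could draw parameters from the priors, simulate counts, and see how often the simulated outcomes are absurd. The priors, the fake exposure values, the "count cannot exceed exposure" rule, and all variable names below are assumptions made for illustration; none of them come from the thread or the linked example.

```r
set.seed(1)

n_obs   <- 100   # number of fake observations (assumed)
n_draws <- 1000  # number of draws from the priors (assumed)

# Fake predictors, standardized so priors on the slopes are easier to reason about.
x1 <- rnorm(n_obs)
x2 <- rnorm(n_obs)

# Offset: fake exposures here. The observed exposures could be plugged in
# instead, since the offset is a known input rather than a modeled outcome.
exposure <- runif(n_obs, min = 50, max = 500)

# Hypothetical priors; replace with the ones you actually want to check.
alpha <- rnorm(n_draws, mean = -3, sd = 1)    # log baseline rate per unit exposure
beta1 <- rnorm(n_draws, mean = 0, sd = 0.5)
beta2 <- rnorm(n_draws, mean = 0, sd = 0.5)

# Linear predictor on the log scale: one row per prior draw, one column per observation.
exposure_mat <- matrix(exposure, nrow = n_draws, ncol = n_obs, byrow = TRUE)
log_mu <- outer(alpha, rep(1, n_obs)) + outer(beta1, x1) + outer(beta2, x2) +
  log(exposure_mat)

# Simulate one fake data set per prior draw.
y_sim <- matrix(rpois(n_draws * n_obs, lambda = exp(log_mu)),
                nrow = n_draws, ncol = n_obs)

# Share of simulated counts larger than the exposure, treated as "absurd" here
# under the assumption that a count can never exceed its exposure.
prop_absurd <- mean(y_sim > exposure_mat)

print(quantile(y_sim, probs = c(0.05, 0.5, 0.95, 0.999)))
print(prop_absurd)
```

If a noticeable share of the simulated counts is absurd, the priors can be tightened and the simulation rerun. Switching between a faked and an observed offset only changes the `exposure` line, so both variants are easy to compare.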
