Bayesian Hypothesis testing

razorlazor · November 24, 2019, 9:39pm

Hello. I’m new to the “Bayesian” world and wanted to ask how hypothesis testing is done properly since p-values are frowned upon amongst Bayesians. I have heard about the Bayes Factor but didn’t see a practical example yet.

Let’s say, I have the model:

y = alpha + beta * x

And I want to know if predictor x has a significant effect on y. How do I proceed from here?
Also should I do this with an R function or directly code it into my Stan file?

Here’s a minimal working example:

R part:

## Load Data

data("radon", package = "rstanarm")
data <- radon

# Separate variables
logradon <- data$log_radon
floorind <- data$floor
N <- length(logradon)

# Data list for Stan
cplist <- list(Y = logradon,
               X = floorind,
               N = N)

Stan part:

data {
  int<lower=1> N;
  vector[N] X;
  vector[N] Y;
}

parameters {
  real alpha;
  real beta;
  real<lower=0> sigma_y;
}

model {
  vector[N] mu_y;

  for (i in 1:N)
    mu_y[i] = alpha + beta * X[i];

  Y ~ normal(mu_y, sigma_y);
  sigma_y ~ cauchy(0, 2.5);
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
}

  //vector[N] log_lik;
  //for (i in 1:N) log_lik[i] = normal_lpdf(Y[i]| alpha + beta * X[i], sigma_y);

generated quantities {
  vector[N] y_pred;
  vector[N] indiv_squared_errors;
  real<lower=0> sum_of_squares;
  real<lower=0> root_mean_squared_error;

  for (i in 1:N) {
    y_pred[i] = alpha + beta * X[i];
    indiv_squared_errors[i] = (Y[i] - y_pred[i])^2;
  }
  sum_of_squares = sum(indiv_squared_errors);
  root_mean_squared_error = sqrt(sum_of_squares / N);
}

stemangiola · November 24, 2019, 9:45pm

You can check whether the credible interval of the posterior distribution of each parameter excludes 0.
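A minimal sketch of that check in base R. The draws here are simulated stand-ins (the mean and sd are made up); with a real fit you would use your own posterior draws instead, for example `rstan::extract(fit)$beta`, where `fit` is assumed to be the fitted stanfit object.

```r
# Sketch: does a central 95% credible interval for beta exclude 0?
# `beta_draws` is SIMULATED here and only stands in for real posterior draws.
set.seed(1)
beta_draws <- rnorm(4000, mean = -0.6, sd = 0.07)

# Central (equal-tailed) 95% interval: the 2.5% and 97.5% posterior quantiles.
ci <- quantile(beta_draws, probs = c(0.025, 0.975))

# TRUE when 0 lies outside the interval.
excludes_zero <- unname(ci[1] > 0 || ci[2] < 0)

# Posterior probability that beta is negative, often a more direct summary.
p_negative <- mean(beta_draws < 0)

print(ci)
print(excludes_zero)
print(p_negative)
```

The interval level (95%) is a convention, not something the model dictates, so it is worth reporting the interval itself rather than only the yes/no answer.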

razorlazor · November 24, 2019, 10:39pm

It seems like the values are far away from 0, but is that “sophisticated” enough to consider them significant variables?

stemangiola · November 24, 2019, 10:53pm

Yes. Since you are working directly with probabilities, that’s a pretty direct statement.

asael_am · November 25, 2019, 4:03am

By the way, in classical inference, testing the hypothesis \beta = 0 at level \alpha (with the usual two-sided test) is equivalent to checking whether 0 lies in the 1 - \alpha confidence interval of \beta.

So as @stemangiola recommends, checking credible intervals is a similar procedure.

Plus, with a credible interval (CI) you actually get the probability that the true \beta (not the estimated one) lies in the CI, given the data and the model.
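The equivalence between the classical test and the confidence interval can be checked on simulated data. This is only a sketch: the data-generating values (intercept 1, slope 0.3, n = 50) are arbitrary assumptions, and it uses the ordinary least-squares fit from `lm`, not the Stan model.

```r
# Sketch: a two-sided t-test of slope = 0 at the 5% level agrees with
# checking whether 0 lies outside the 95% confidence interval.
set.seed(42)
n <- 50
x <- rnorm(n)
y <- 1 + 0.3 * x + rnorm(n)   # simulated data, arbitrary true slope of 0.3

fit <- lm(y ~ x)

# p-value of the t-test for the slope.
p_value <- summary(fit)$coefficients["x", "Pr(>|t|)"]

# 95% confidence interval for the slope (a 1 x 2 matrix).
ci <- confint(fit, "x", level = 0.95)

reject_by_test <- p_value < 0.05
reject_by_interval <- ci[1, 1] > 0 || ci[1, 2] < 0

print(c(test = reject_by_test, interval = reject_by_interval))
```

Both decisions are built from the same t quantile, so they agree for any data set, not just this one.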

stemangiola · November 25, 2019, 4:13am

Can you clarify this point?

asael_am · November 25, 2019, 4:23am

If you have a sample X = (X1, X2, X3, …, Xn) and an unknown parameter beta, then the posterior is

P(beta | X), the probability distribution of beta once you have incorporated the sample information.

A 95% credible interval CI is by definition an interval satisfying

P(beta belongs to CI | X) = 0.95

So a 95% credible interval can be read as:
The probability that beta belongs to CI, given the data, is 0.95

But in classical inference a confidence interval IC is computed from the sample, IC = IC(X), so that

P(beta belongs to IC(X) | beta) = 0.95

That is, with beta fixed, 95% of repeated samples would produce an interval that covers beta. The interval is the random quantity, not beta.

So in classical inference the 0.95 describes how often the procedure covers the fixed beta across repeated samples, while in Bayesian inference it is the probability that beta actually belongs to this particular CI given the observed data.
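The repeated-sampling reading of a confidence interval can be illustrated with a small simulation. This is a sketch under assumed settings (true slope 0.5, n = 30, 2000 replications, all arbitrary), again using `lm` rather than the Stan model.

```r
# Sketch: frequentist coverage. beta is held fixed, the data (and therefore
# the interval) vary from one replication to the next.
set.seed(123)
true_beta <- 0.5   # fixed "true" slope, an arbitrary assumption
n <- 30
n_rep <- 2000

covers <- logical(n_rep)
for (r in seq_len(n_rep)) {
  x <- rnorm(n)
  y <- 1 + true_beta * x + rnorm(n)
  ci <- confint(lm(y ~ x), "x", level = 0.95)
  # Did this replication's interval contain the fixed true slope?
  covers[r] <- ci[1, 1] <= true_beta && true_beta <= ci[1, 2]
}

# Fraction of intervals that covered the true slope; should be close to 0.95.
coverage <- mean(covers)
print(coverage)
```

Any single one of these intervals either contains 0.5 or it does not. The 0.95 is a property of the procedure, which is the contrast with the credible interval.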

razorlazor · November 25, 2019, 12:05pm

Thanks for the good explanation!

razorlazor · November 25, 2019, 12:08pm

Though, I still have a question:

I’ve learned that frequentists regard beta as a fixed parameter and Bayesians regard beta as a random variable. So can I say that, with 95% probability, the random variable beta lies within a given range (the credible interval)? Is that a correct statement?

Ara_Winter · November 25, 2019, 3:59pm

That’s my understanding. So when I report out a β (depth to groundwater) it often looks like this:
50% uncertainty interval (UI)
lower 50% UI    upper 50% UI
-0.01           0.17

asael_am · November 25, 2019, 7:10pm

Well, yes.
CI95% = [1, 2]

means that the probability that beta is between 1 and 2, given the data used, is 0.95.

mattansb · November 28, 2019, 7:53am

You might be interested in bayestestR: it has many functions for Bayesian inference and lots of documentation.

Jean_Billie · November 28, 2019, 1:23pm

Suppose that a model has several variables,

y=\beta_0 + \beta_1x_1+...+\beta_n x_n.

How can I test the null hypothesis that \beta_1=...=\beta_n=0?

Is there any alternative method for the above null hypothesis?

Comparing the two models y=\beta_0 + 0x_1+...+0x_n and y=\beta_0 + \beta_1x_1+...+\beta_n x_n via the Bayes factor is one method. Is there any other, more basic method?

JimBob · November 30, 2019, 8:14pm

For accepting/rejecting whether a variable has an influence on another using the posterior density interval, you might also be interested to see the approach taken by Kruschke, here:

Kruschke, J. K. (2018). Rejecting or accepting parameter values in Bayesian estimation. Advances in Methods and Practices in Psychological Science, 1(2), 270-280.
https://journals.sagepub.com/doi/pdf/10.1177/2515245918771304?casa_token=4E9imp3fHIkAAAAA:hI7tgOhq0wrdOOQbo9XzK8zodtnY1nZktUBZAu80EhwIiDNDP3QaZ-UCsHqVVyEC1jGym76jUB92nw

Going slightly beyond just saying the credible values exclude zero, he suggests identifying a ‘region of practical equivalence’ (ROPE) around zero, within which there is essentially no real difference. For example, an effect only a very small distance away from 0 might not be considered a convincing rejection of “no effect”.
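A rough base-R sketch of that decision rule: compute the 95% highest density interval (HDI) from posterior draws and compare it with the ROPE. The ROPE limits of ±0.1 and the simulated draws are purely hypothetical, and the helper names `hdi` and `rope_decision` are made up for this example.

```r
# Shortest interval containing `mass` of the draws (empirical HDI).
hdi <- function(draws, mass = 0.95) {
  sorted <- sort(draws)
  n <- length(sorted)
  k <- ceiling(mass * n)   # number of draws the interval must contain
  # Width of every window of k consecutive sorted draws.
  widths <- sorted[k:n] - sorted[1:(n - k + 1)]
  i <- which.min(widths)
  c(lower = sorted[i], upper = sorted[i + k - 1])
}

# HDI entirely outside the ROPE: reject the null value.
# HDI entirely inside the ROPE: accept it for practical purposes.
# Otherwise: withhold a decision.
rope_decision <- function(draws, rope = c(-0.1, 0.1), mass = 0.95) {
  h <- hdi(draws, mass)
  if (h[["upper"]] < rope[1] || h[["lower"]] > rope[2]) {
    "reject null value"
  } else if (h[["lower"]] >= rope[1] && h[["upper"]] <= rope[2]) {
    "accept null value"
  } else {
    "undecided"
  }
}

# Simulated stand-ins for posterior draws of three different slopes.
set.seed(2)
far  <- rnorm(4000, mean = -0.6, sd = 0.07)
tiny <- rnorm(4000, mean = 0.01, sd = 0.02)
wide <- rnorm(4000, mean = 0.1,  sd = 0.2)

print(rope_decision(far))
print(rope_decision(tiny))
print(rope_decision(wide))

# Share of the posterior inside the ROPE, a softer summary than the decision.
print(mean(tiny > -0.1 & tiny < 0.1))
```

The ROPE limits have to come from domain knowledge about which effect sizes matter. The decision can flip if you change them, so it helps to report the limits alongside the result.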

martinmodrak · December 2, 2019, 10:20am

Hi, just a few more thoughts on this, as I’ve just written an answer on a similar topic.

The elephant in the room: All the approaches are conditional on the model(s) you use being 100% correct (including priors etc.). Since (except some parts of physics) your model is basically never even roughly correct, model comparison/selection/hypothesis testing - whatever you call it - can be very brittle. Various approaches fail in different contexts, but there is no general way to do this “best”. You need to be roughly correct about the “important” things, but what is “important” is not fixed. A huge question is what your actual final goal is.

Some of the options you have:

Determine a region of practical equivalence (expanding on what @JimBob wrote): Strictly speaking, P(\beta_1 = 0) = 0 for all continuous priors on \beta_1 (and we haven’t even started combining multiple parameters). And that makes sense: nature doesn’t like zeroes, and most things have small and/or highly variable effects, but believing the mean effect is exactly zero makes IMHO little sense. But you can use domain expertise to say that e.g. a difference of 0.5 is practically irrelevant. P(|\beta_1| < 0.5) and by extension P(|\beta_1| < 0.5 \cap |\beta_2| < 0.5 \cap ... \cap |\beta_n| < 0.5) can be computed directly from posterior samples (but the probability will shrink hugely as you add more variables). If you don’t want to set a strict threshold (which you IMHO shouldn’t), you can compute the probability for a range of thresholds. Or you can compute the posterior distribution of \max_i{|\beta_i|} or of \sum_i \beta_i^2 and just make decisions based on this.

Compare a simpler model to the full model: Separately fit a model with fewer parameters, omitting some of the effects. You can use the loo package to approximate a comparison of predictive performance via leave-one-out cross-validation. Alternatively, you could use Bayes factors to do that, but those can be problematic, as they are very sensitive to the priors you use in your model. There is some more interesting criticism by Danielle Navarro or Data Colada. (Disclaimer: I’ve never used Bayes factors myself.) You get the relative expected predictive performance (LOO) or the improvement in relative KL-divergence to the true process (BF) of each model. Do you care about those?

Think qualitatively: Danielle Navarro has a great essay about model selection and how purely mathematical approaches can fail us: Between the devil and the deep blue sea. Checking whether the models satisfy some qualitative properties can also be of interest.
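For the first option, the quantities mentioned can be computed from a matrix of posterior draws (rows = draws, columns = coefficients). A sketch in base R; the draws are simulated and the three coefficients are hypothetical, so with a real fit you would substitute your own draws matrix.

```r
# Simulated stand-in for posterior draws of beta_1..beta_3.
set.seed(3)
n_draws <- 4000
beta <- cbind(rnorm(n_draws, 0.1, 0.1),
              rnorm(n_draws, -0.2, 0.15),
              rnorm(n_draws, 0.9, 0.2))

threshold <- 0.5   # "practically irrelevant" effect size, from domain expertise

# P(|beta_i| < threshold) for each coefficient separately.
p_each <- colMeans(abs(beta) < threshold)

# Joint probability that ALL coefficients are below the threshold,
# evaluated draw by draw.
all_small <- apply(abs(beta) < threshold, 1, all)

# Posterior draws of max_i |beta_i| and of sum_i beta_i^2.
max_abs <- apply(abs(beta), 1, max)
sum_sq <- rowSums(beta^2)

# All |beta_i| < threshold exactly when max_i |beta_i| < threshold,
# so the joint probability can be read off max_abs.
p_joint <- mean(max_abs < threshold)

# The same joint probability over a range of thresholds.
thresholds <- seq(0.1, 1.5, by = 0.1)
p_by_threshold <- sapply(thresholds, function(thr) mean(max_abs < thr))

print(p_each)
print(p_joint)
print(round(p_by_threshold, 3))
print(quantile(sum_sq, c(0.025, 0.5, 0.975)))
```

Working draw by draw keeps any posterior correlation between the coefficients, which multiplying the per-coefficient probabilities together would ignore.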

Hope that makes sense and is at least a bit helpful.

avehtari · December 2, 2019, 3:01pm

Good summary by @martinmodrak. I just recorded (about an hour ago) a video discussing these topics https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=e998b5dd-bf8e-42da-9f7c-ab1700ca2702
You can skip to the first slide with title Hypothesis testing.
