# Brms: understanding the basics of predictions

Hi,

I am trying to understand the working of ‘predict’ function by manually calculating predictions and comparing them with those obtained by using the ‘brms predict’ function. However, even after setting the seed, the results differ. Unlike the predicted values, I was successful in getting the exact same fitted values.

A simple example is shown below

``````library(brms)
library(tidyverse)

set.seed(123)

# generate data
n = 100
x = sort(rnorm(n, 0, 1))
b0 = 10
b1 = 2
sd = 0.5

y = b0 + b1*x + rnorm(n = n, mean = 0, sd = sd)

dat <- data.frame(cbind(x,y))

# model fitting
bfit <- brm(bf(y ~ x), backend = "rstan", data = dat,
seed = 123, chains = 2, iter = 2000, cores = 2)

# set seed for reproducibility (predicted values)
set.seed(123)

#####################
# get fitted and predicted values by using brms functions
#####################

# fitted estimates
bfitted_values <- fitted(bfit)

# predicted estimates
bpredict_values <- predict(bfit)

#####################
# manually calculate fitted and predicted values
#####################
# parnames(bfit)
nsamples  <- nsamples(bfit)
psb       <- posterior_samples(bfit)

Xmat <- cbind(rep(1, nrow(dat)) , x)

fitted_values       <- matrix(0, nrow = nsamples , ncol = nrow(dat) )
predicted_values    <- matrix(0, nrow = nsamples , ncol = nrow(dat) )

for(j in 1:nsamples) {
eb0              <- psb[j,"b_Intercept"]   %>% unlist() %>% as.matrix()
eb1              <- psb[j,"b_x"]           %>% unlist() %>% as.matrix()
mu               <- Xmat[,1] %*%  eb0 + Xmat[,2] %*% eb1
sd <- psb[j,"sigma"]
fitted_values[j,]   <- mu
predicted_values[j,]   <- rnorm(nrow(dat) , mu, sd)
}

mfitted_values <-  apply(fitted_values, 2, mean, na.rm = TRUE)
mpredicted_values <-  apply(predicted_values, 2, mean, na.rm = TRUE)

.fitted <- cbind(bfitted_values[,1], mfitted_values) %>% data.frame()
colnames(.fitted ) <- c("brms", "manual")

.predicted <- cbind(bpredict_values[,1], mpredicted_values) %>% data.frame()
colnames(.predicted ) <- c("brms", "manual")

# compare fitted values
brms   manual
5.229839 5.229839
5.928913 5.928913
6.500177 6.500177
6.781684 6.781684
7.359954 7.359954
7.360638 7.360638

# compare predicted values
brms   manual
5.243650 5.222802
5.922818 5.925199
6.500232 6.502756
6.775731 6.775499
7.352880 7.362450
7.372583 7.344495

``````
• Operating System:
• brms Version: 2.15.0
1 Like

Hi,
I think this is to be expected - the `fitted` result (which calls `posterior_linpred`) will give you the values of the linear predictor, which is without any measurement noise. But for `predict` (which calls `posterior_predict`), the measurement noise is added, which is random and will not be the same even between two calls of `predict`.

Does that make sense?

Best of luck with your model!

Thanks @martinmodrak

I just wanted to make sure that what I am doing is the correct approach i.e., `predicted_values[j,] <- rnorm(nrow(dat) , mu, sd)`

Correct, the predicted values are not same between two calls but thought setting the seed, for example `set.seed(123)`, perhaps will take care of that. For the brms predict function, reproducibility of results can be achieved by setting the seed.

Yes, this should be correct.

If you wanted to get the same results as `predict`, you would need to:

a) set the same seed before calling `predict` and before running your own prediction code
b) generate the noise in exactly the same order as `brms` does.

Noting that `brms` iterates over observations first and then over samples we can actually reproduce this:

``````set.seed(123)
# Avoid taking summaries for stricter control
bfitted_values <- fitted(bfit, summary = FALSE)
bpredict_values <- predict(bfit, cores = 1, summary = FALSE)

for(j in 1:nsamples) {
eb0              <- psb[j,"b_Intercept"]   %>% unlist() %>% as.matrix()
eb1              <- psb[j,"b_x"]           %>% unlist() %>% as.matrix()
mu               <- Xmat[,1] %*%  eb0 + Xmat[,2] %*% eb1
fitted_values[j,]   <- mu
}

set.seed(123)
for(i in 1:n) {
predicted_values[,i]   <- rnorm(nsamples, fitted_values[,i], psb[,"sigma"])
}

# Should show all zeroes
unique(bfitted_values - fitted_values)
unique(bpredict_values - predicted_values)

``````

Dear @martinmodrak

Thank you very much for explaining it by showing an example. It’s Perfect!