Hi List,
I am running a phylogenetic regression and then using the predict() method from brms to estimate the response variable for species that have predictor measurements. I'll call them unknowns. I have phylogenetic information for the unknowns, and I would like to use the vcv matrix that includes them in the prediction step so that the phylogenetic autocorrelation is included in the prediction. This led me to prepare_predictions(newdata2...). However, I am not sure that newdata2 is doing what I think it is, or what I want it to do. Here is an example:
library("brms")
library("dplyr")
library("ape")
# read data from brms phylo tutorial
phylo <- ape::read.nexus("https://paul-buerkner.github.io/data/phylo.nex")
A <- ape::vcv.phylo(phylo, corr = TRUE)
data_simple <- read.table("https://paul-buerkner.github.io/data/data_simple.txt",
                          header = TRUE)
# remove first 5 species
data_mod <- data_simple %>% slice(-(1:5))
# keep first 5 species for prediction
data_pred <- data_simple %>% slice(1:5)
# quick model for demonstration
model_simple <- brm(
  phen ~ cofactor + (1 | gr(phylo, cov = A)),
  data = data_mod, data2 = list(A = A),
  chains = 2,
  iter = 1000
)
# Model runs properly even though there are species in A that are not in data_mod
# Now Predict Missing Data
# Make newdata2. This matrix A contains all of the species (1-200)
phylst <- list(A)
# list elements must be named
names(phylst) <- "A"
# Run the prediction with the newdata2 argument
p1 <- predict(model_simple,
              newdata = data_pred,
              newdata2 = phylst[1],
              allow_new_levels = TRUE)
# Run without the newdata2 argument
p2 <- predict(model_simple,
              newdata = data_pred,
              allow_new_levels = TRUE)
These two predictions, p1 with and p2 without newdata2, give the same results (with a bit of variation between runs). I am finding the same thing with my own data. Also, p1 runs just as fast as p2; I expected it to be a bit slower. Taken together, these make me think the additional information from the vcv matrix is not being used in the prediction, but I have no way to tell whether the data simply give the same result or the method is ignoring the matrix.
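One check I could imagine, as a rough sketch only: fix the RNG seed and compare predictions made with the real matrix against predictions made with a deliberately uninformative one. A_identity, p_phylo and p_ident are names made up for this illustration, and I am assuming that predict() takes its randomness from R's RNG, so that set.seed() makes the two calls comparable.

```r
library(brms)

# Assumes model_simple, data_pred and A from the example above already exist.

# Hypothetical control matrix: same species labels as A, but with no
# phylogenetic correlation at all (an identity matrix).
A_identity <- diag(nrow(A))
dimnames(A_identity) <- dimnames(A)

# Same seed before each call, so that any difference should come from the
# matrix rather than from random sampling (assumption, see above).
set.seed(123)
p_phylo <- predict(model_simple,
                   newdata = data_pred,
                   newdata2 = list(A = A),
                   allow_new_levels = TRUE)

set.seed(123)
p_ident <- predict(model_simple,
                   newdata = data_pred,
                   newdata2 = list(A = A_identity),
                   allow_new_levels = TRUE)

# Largest absolute difference across all summary columns and species.
max(abs(p_phylo - p_ident))
```

If the difference is essentially zero, that would suggest the matrix passed via newdata2 does not influence predictions for these new species; a clear difference would suggest it does. I am not certain this is a conclusive test.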
Am I using newdata2 correctly? I can't find any examples of its use anywhere! @paul.buerkner I am tagging you here because I think you may be the only one who knows the details of this.
Thanks!