Can you use a brms model to produce fitted predictor values?

Hello,

I am trying to create a conversion factor for zooplankton catch to convert between an accurate (hereafter referred to as actual abundance) and inaccurate (hereafter referred to as estimated abundance) sampling method. We know the estimated abundance is influenced by a number of covariates that partially determine the accuracy of this method. Ultimately, we want to produce a model that we can use to predict the actual abundance given the estimated abundance and the covariates.

However, the issue is that the thing we want to predict (actual abundance) should logically be a predictor in a model whose response variable is estimated abundance. This is because the covariates influence estimated abundance but should have no effect on actual abundance, and the covariates likely interact with actual abundance to determine the estimated abundance.

My model is:
estimated abundance ~ actual abundance * covariates

Estimated abundance is zero-inflated, so I am using a hurdle_lognormal distribution.
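For concreteness, a minimal sketch of how this model might be written in brms is shown below. The column names (`estimated`, `actual`, `turbidity`), the simulated data, and the helper `fit_estimated()` are all hypothetical stand-ins. Using `log(actual)` rather than `actual` is also an assumption, made because the lognormal part of the family works on the log scale of the response.

```r
library(brms)

# Hypothetical calibration data: all names and values are invented for illustration.
set.seed(1)
n <- 200
calib <- data.frame(
  actual    = rlnorm(n, meanlog = 3, sdlog = 1),  # accurate sampling method
  turbidity = rnorm(n)                            # stand-in for the covariates
)

# Simulate estimated abundance: lognormal around a covariate-dependent
# fraction of the actual abundance, with about 20% exact zeros.
mu <- log(calib$actual) - 0.5 + 0.3 * calib$turbidity
calib$estimated <- ifelse(runif(n) < 0.2, 0, rlnorm(n, mu, 0.5))

# estimated abundance ~ actual abundance * covariates, plus an
# intercept-only model for the hurdle (zero) probability `hu`.
f_est <- bf(estimated ~ log(actual) * turbidity, hu ~ 1)

# Wrapper so the (slow) Stan compilation and sampling only run on request.
fit_estimated <- function(data) {
  brm(f_est, data = data, family = hurdle_lognormal(), chains = 4, cores = 4)
}
```

Calling `fit <- fit_estimated(calib)` fits the model. Note that `posterior_predict(fit, newdata = ...)` on such a fit predicts estimated abundance from actual abundance, which is the opposite direction from the one needed here.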

Is there any way to use brms (or any other stan-related R package) to use this fitted model to predict actual abundance from a new dataset of my estimated abundances and covariates?

Thanks,
Sam

  • Operating System: Win 10
  • brms Version: 2.9.0

Hey,

I am afraid I am a little confused by your model. So you are saying that you want to use a variable as a predictor (actual abundance) for which we have no observations? This is going to be hard.

Alternatively, if you have an understanding of how much measurement error there is in the estimated abundance, you may incorporate this information via me() or mi() terms in brms.

Hi,

The dataset that will be used to fit the model has observations for actual abundance, estimated abundance, and all covariates. However, we have other datasets with observations of estimated abundance and the covariates, but no actual abundance. I want to fit the model on the first (complete) dataset and then use my model to predict the actual abundance in the other (incomplete) datasets.

Thanks,
Sam

I see. What we need then is a second model with actual abundance as the response variable. Otherwise, predicting it is not going to work. You could use the missing data framework of brms for this purpose via mi() terms. An intro can be found at https://cran.r-project.org/web/packages/brms/vignettes/brms_missings.html
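A rough sketch of the mi() approach, following the pattern in the linked vignette, might look like this. The data are simulated, and the column names (`estimated`, `logactual`, `turbidity`) and the helper `fit_joint()` are hypothetical. It assumes the complete and incomplete datasets are stacked into one data frame with actual abundance set to `NA` where it was not measured, and that actual abundance is strictly positive so it can be modelled on the log scale with a gaussian family.

```r
library(brms)

# Hypothetical stacked data: rows 1-100 play the role of the complete dataset,
# rows 101-150 the datasets where actual abundance was never measured.
set.seed(2)
n <- 150
dat <- data.frame(
  turbidity = rnorm(n),                     # stand-in for the covariates
  logactual = rnorm(n, mean = 3, sd = 1)    # log of actual abundance
)
mu <- dat$logactual - 0.5 + 0.3 * dat$turbidity
dat$estimated <- ifelse(runif(n) < 0.2, 0, rlnorm(n, mu, 0.5))
dat$logactual[101:150] <- NA                # unobserved actual abundance

# Two sub-models: estimated abundance given (possibly missing) actual abundance,
# and a simple intercept-only model for actual abundance itself.
f_joint <-
  bf(estimated ~ mi(logactual) * turbidity, family = hurdle_lognormal()) +
  bf(logactual | mi() ~ 1, family = gaussian()) +
  set_rescor(FALSE)

# Wrapper so the (slow) Stan compilation and sampling only run on request.
fit_joint <- function(data) {
  brm(f_joint, data = data, chains = 4, cores = 4)
}
```

After `fit <- fit_joint(dat)`, the missing values of `logactual` are estimated as parameters alongside the regression coefficients. I believe they show up under names along the lines of `Ymi_logactual[i]`, but it is worth checking the parameter names of the fitted object rather than relying on that. Exponentiating their posterior draws would give actual abundance on the original scale. The intercept-only model for `logactual` is a placeholder; predictors of actual abundance could be added there if any are known.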