Hello,
I am new to Bayesian modelling. I am an ecologist, and I am trying to model the visitation frequency of a bird species to a plant species. My dataset has several variables: frequency_f (response variable), elevation (numerical variable with discrete values), season (categorical variable with two levels), and flowers (numerical variable, also with discrete values). I want to fit a mixed-effects GAM with a smooth term for elevation using the brms R package. I decided on a smooth term for elevation after plotting the two variables against each other: visitation frequency seems to peak at mid elevations and then drops again. This is the model specification:
elevation_all_plants_mbrms <- brms::brm(frequency_f ~ s(elevation) + season + flowers + (1|plant_ID),
  data = hypericum_all, family = hurdle_gamma(link = "log"), chains = 5,
  iter = 10000, warmup = 3000, cores = 4)
I chose the hurdle_gamma family because my response variable is continuous, zero-inflated, and cannot take negative values. I want to see whether the visitation frequency of the birds is affected by elevation, season, and the number of flowers on each plant. As the random factor I chose the individual plant. After running the model with default priors, these are the results:
 Family: hurdle_gamma
  Links: mu = log; shape = identity; hu = identity
Formula: frequency_f ~ s(elevation) + season + flowers + (1 | plant_ID)
   Data: hypericum_all (Number of observations: 242)
  Draws: 5 chains, each with iter = 10000; warmup = 3000; thin = 1;
         total post-warmup draws = 35000

Smooth Terms:
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sds(selevation_1)     2.03      1.16     0.56     5.05 1.00     1111      420

Group-Level Effects:
~plant_ID (Number of levels: 242)
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.33      0.25     0.01     0.88 1.01      569      285

Population-Level Effects:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept        -2.33      0.26    -2.83    -1.84 1.01     1528     2375
seasonWetMdry     0.28      0.24    -0.17     0.75 1.00     2398     1209
flowers           0.00      0.01    -0.01     0.02 1.00    11113     4558
selevation_1      0.55      3.05    -6.11     6.13 1.00     4797     1892

Family Specific Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
shape     2.47      2.98     1.28     9.73 1.01      520      216
hu        0.54      0.03     0.48     0.60 1.00     7140    18740
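To help read the `hu`, `shape`, and `Intercept` rows of the summary above, here is a minimal base-R sketch of how a hurdle-gamma response can be simulated. It assumes the brms parameterisation, in which `hu` is the probability of a zero and the positive part is a gamma distribution with mean `mu` and the given `shape`. The point estimates are plugged in purely for illustration: the `Intercept` applies to the reference season with zero flowers, and the simulation ignores the smooth and the plant-level effects.

```r
# Sketch only: simulate a hurdle-gamma response from point estimates.
# Assumption: hu = P(y == 0); the positive part is Gamma with mean mu.
set.seed(42)

n     <- 100000
hu    <- 0.54          # posterior mean of hu from the summary
mu    <- exp(-2.33)    # Intercept is on the log scale (mu link = log)
shape <- 2.47          # posterior mean of shape from the summary

# Step 1: the hurdle decides whether an observation is exactly zero.
is_zero <- rbinom(n, size = 1, prob = hu) == 1

# Step 2: non-zero observations come from a gamma with mean mu,
# i.e. rate = shape / mu.
y <- ifelse(is_zero, 0, rgamma(n, shape = shape, rate = shape / mu))

prop_zero     <- mean(y == 0)     # should be close to hu
mean_positive <- mean(y[y > 0])   # should be close to mu
overall_mean  <- mean(y)          # should be close to (1 - hu) * mu

print(c(prop_zero = prop_zero,
        mean_positive = mean_positive,
        overall_mean = overall_mean))
```

Under this reading, the regression coefficients in the summary describe only the mean of the non-zero visits, while `hu` is a single constant zero probability unless it is given its own formula.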
If I understood correctly, the hurdle_gamma family first models zero vs. non-zero values and then models the non-zero values. Thus, I will get a probability of a non-zero value and then estimates of each effect for the non-zero values. Is this correct? Is this family an appropriate choice? None of the tested effects seems to play a role in the visitation frequency of the birds. However, I now want to run the model again with informative priors. Here is the code:
prior1<- c(set_prior("normal(0.1031292, 0.5)", class = "b", coef = "Intercept"),
set_prior("uniform(2270,3530)", class = "b", coef = "s(elevation)"),
set_prior("normal(0,1)", class = "b", coef = "season"),
set_prior("truncated_normal(0, 12.33058, 11.98781)", class = "b", coef = "flowers"),
set_prior("normal(0,1)", class = sigma))
I chose those priors based on some descriptive statistics of my data and my knowledge of the system. I reasoned that the Intercept of the model can't be negative and should be centered on the mean. For elevation I set a uniform probability over the range of elevations present in my dataset. For season, I set a large SD to allow for the effect of the two seasons. The number of flowers, again, can't be negative, so its prior should be centered on the mean with the SD of the actual data. Finally, the random effect has a large SD to account for each individual plant in the dataset.
I do not know if this makes sense. Can someone help me out?
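One thing worth checking about the reasoning above is the scale of the Intercept. Because the model uses a log link for `mu`, the Intercept lives on the log scale, where negative values are perfectly valid (the default-prior fit estimated it at -2.33). The base-R sketch below compares what two candidate priors imply on the response scale. It assumes that 0.1031292 is the sample mean of `frequency_f` and reuses the SD of 0.5 from the proposed prior; both choices are illustrative, not recommendations.

```r
# Sketch only: what a normal prior on a log-scale Intercept implies
# on the response scale. Assumes 0.1031292 is the sample mean.
set.seed(1)

raw_mean <- 0.1031292
prior_sd <- 0.5
n_draws  <- 10000

# Prior centred on the raw mean, as in the proposed prior.
draws_raw_centre <- rnorm(n_draws, mean = raw_mean, sd = prior_sd)

# Prior centred on the log of the raw mean.
draws_log_centre <- rnorm(n_draws, mean = log(raw_mean), sd = prior_sd)

# Back-transform through the inverse link to the response scale.
implied_raw <- exp(draws_raw_centre)
implied_log <- exp(draws_log_centre)

print(round(quantile(implied_raw, c(0.025, 0.5, 0.975)), 3))
print(round(quantile(implied_log, c(0.025, 0.5, 0.975)), 3))
```

The first prior concentrates on expected frequencies around 1, roughly ten times the observed mean, whereas the second is centred near 0.1. Since `exp()` is always positive, no truncation is needed to keep the implied mean non-negative.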
Also, when I try to run this in R, I get the following error:
> prior1<- c(set_prior("normal(0.1031292, 0.5)", class = "b", coef = "Intercept"),
+ set_prior("uniform(2270,3530)", class = "b", coef = "s(elevation)"),
+ set_prior("normal(0,1)", class = "b", coef = "season"),
+ set_prior("truncated_normal(0, 12.33058, 11.98781)", class = "b", coef = "flowers"),
+ set_prior("normal(0,1)", class = sigma))
Error: Processing arguments of 'set_prior' has failed:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
arguments imply differing number of rows: 1, 0
Can someone help me make it work?
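A possible starting point for debugging, offered as a sketch rather than a definitive fix: in the call above, `class = sigma` is unquoted, so R passes the `stats::sigma` function instead of a string, which is one likely cause of the `set_prior` failure. In addition, the hurdle_gamma summary lists `shape` and `hu` rather than `sigma`, the coefficient names have to match the ones brms generates (for example `seasonWetMdry`), and to my knowledge `truncated_normal` is not a Stan distribution name. `get_prior()` lists the class and coef combinations that are valid for a given model. The data frame `toy` below is a hypothetical stand-in for `hypericum_all`, and the prior values are placeholders chosen only so that the code runs.

```r
library(brms)

# Hypothetical stand-in for hypericum_all (simulated, not real data).
set.seed(1)
n <- 60
toy <- data.frame(
  elevation = round(runif(n, 2270, 3530)),
  season    = factor(sample(c("Dry", "Wet"), n, replace = TRUE)),
  flowers   = rpois(n, 12),
  plant_ID  = factor(seq_len(n))
)
toy$frequency_f <- ifelse(runif(n) < 0.5, 0,
                          rgamma(n, shape = 2, rate = 20))

form <- bf(frequency_f ~ s(elevation) + season + flowers + (1 | plant_ID))

# List every class/coef combination that accepts a prior in this model.
available <- get_prior(form, data = toy, family = hurdle_gamma())
print(available)

# Placeholder priors: every class is a quoted string that appears
# in the get_prior() table. The numbers are illustrative assumptions.
priors <- c(
  set_prior("normal(-2, 1)",     class = "Intercept"),  # log scale
  set_prior("normal(0, 1)",      class = "b"),          # all slopes
  set_prior("exponential(1)",    class = "sd"),         # plant_ID SD
  set_prior("exponential(1)",    class = "sds"),        # smooth wiggliness
  set_prior("gamma(0.01, 0.01)", class = "shape")       # gamma shape
)

# Check the priors against the model without fitting it.
checked <- validate_prior(priors, form, data = toy,
                          family = hurdle_gamma())
print(checked)
```

Once this runs, individual coefficients can be targeted by adding `coef = "..."` with a name copied from the `coef` column of the `get_prior()` output.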
I am running brms in:
R version 4.3.1 (2023-06-16 ucrt) -- "Beagle Scouts"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
Brms version:
packageVersion("brms")
[1] '2.19.0'