Hi Stan community,

I am new to this forum, so please bear with me. For some context, I would like to examine individual-level differences in how my study species shifts toward a more frugivorous diet. My response variable is “prop.fruit”, the proportion (by volume) of each fecal sample that is composed of fruit. This value can range between 0.00 and 1.00, and after plotting my data it appears to follow a zero-inflated beta distribution. I believe that fitting an HGAMM with the brms package will give me the flexibility I need to model the variables in my system.

My model is currently specified as the following:

    library(mgcv)
    library(brms)
    library(readxl)
    library(cmdstanr)

    data <- read_excel("~/Directory/data.xlsx", sheet = "data1")
    data$ind <- as.factor(data$ind)
    data$town <- as.factor(data$town)
    data$town_ind <- paste0(data$town, "_", data$ind)

    fit <- brm(
      bf(
        # mean of the beta part: one smooth of time per individual
        prop.fruit ~ s(time.index, by = ind, m = 1, bs = "tp") +
          s(ind, bs = "re") +
          (1 | town) +
          (1 | town_ind),
        # precision of the beta part
        phi ~ 1 + (1 | town) + (1 | town_ind) + bls + cas + oas + sps + srs,
        # zero-inflation probability
        zi ~ 1 + s(time.index) + (1 | town) + (1 | town_ind) + blp + cap + oap + spp + srp
      ),
      family = zero_inflated_beta(),
      data = data,
      chains = 4,
      iter = 2000,
      warmup = 1000,
      cores = 14,
      seed = 1234,
      backend = "cmdstanr"
    )

**time.index** = integer variable representing the sampling day (from 1 through 548).

**town** = the population of this species I am working with. Categorical random effect with three levels.

**ind** = categorical random effect of the specific individual I collected the data from.

**town_ind** = a concatenated category representing each given individual in its own “town.” None of the individuals moved from one town to another over the course of the study.

**bls:srs** = seed counts for 5 different focal plant species whose seeds were recovered from fecal samples. Each is an integer count of seeds recovered from the same fecal sample that prop.fruit was measured on. I wanted to use these data to model the precision parameter (phi) of the ZIB distribution.

**blp:srp** = the same 5 focal plant species as above, but these are records of ripe fruit phenology for each of the 12 calendar months in a year. For each plant species, I summed the presence (1) / absence (0) records of ripe fruit for a given calendar month across multiple years and divided that sum by the total number of observations made for that species in that month. This gives the proportion of observations that contained ripe fruit for each species in each calendar month.
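To make the phenology calculation concrete, here is a minimal base R sketch for a single plant species. The data frame `obs` and its columns `month` and `ripe` are hypothetical stand-ins, not the real field data; the point is only that the monthly proportion is the mean of the 0/1 presence records within each calendar month.

```r
# Hypothetical presence/absence records for one plant species:
# one row per phenology observation, pooled across years.
obs <- data.frame(
  month = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3),
  ripe  = c(1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0)  # 1 = ripe fruit present
)

# Proportion of observations with ripe fruit in each calendar month:
# (sum of presences) / (number of observations) = mean of the 0/1 column.
phen <- aggregate(ripe ~ month, data = obs, FUN = mean)
names(phen)[2] <- "prop_ripe"
print(phen)
```

Each fecal sample would then take the value for the calendar month in which it was collected, which is how a variable such as blp ends up with at most 12 distinct values.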

I want to note that this is my first experience with Bayesian analysis, and I’ve learned very quickly that this is a pretty complex problem, especially because I think it will be important to account for time lags in the blp:srp variables. I’ll save that for later, though.

For now, I wanted to ask how I can specify the prior for the zi portion of my model. I expect the presence of zeros to follow a Bernoulli distribution in which the probability of obtaining a prop.fruit value of 0 is about 75%. I can’t seem to find any guidance on how to specify a Bernoulli prior within brm().
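To make the question concrete, here is a minimal sketch of one way the 75% expectation could be expressed. It relies on the fact that brms models zi on the logit scale by default for `zero_inflated_beta()`, so the expectation would enter as a prior on the zi intercept rather than as a Bernoulli prior. The normal shape and the standard deviation of 0.5 are assumptions chosen only for illustration, and `zi_prior` is a hypothetical name.

```r
# Target: P(prop.fruit == 0) of about 0.75, expressed on the logit scale.
p_zero  <- 0.75
zi_mean <- qlogis(p_zero)  # log(0.75 / 0.25) = log(3)

# Prior check in base R: what does normal(zi_mean, 0.5) on the logit scale
# imply on the probability scale? The sd of 0.5 is an assumed value.
set.seed(1234)
draws <- plogis(rnorm(1e5, mean = zi_mean, sd = 0.5))
print(quantile(draws, c(0.025, 0.5, 0.975)))

# Corresponding brms prior object (only built if brms is installed).
if (requireNamespace("brms", quietly = TRUE)) {
  zi_prior <- brms::set_prior(
    sprintf("normal(%.2f, 0.5)", zi_mean),
    class = "Intercept", dpar = "zi"
  )
  print(zi_prior)
}
```

Such an object would be passed to `brm()` through its `prior` argument. Because brms centers population-level predictors when it defines the intercept, a prior like this describes the zero probability at average covariate values for a typical town and individual, not for every observation.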

Thank you for your help and for welcoming me to the Stan community in my first post.