Hello, fellas. I am an ornithologist trying to fit a model to the territory sizes of a bird species (call it A). When I mapped 20 territories of these birds, I noticed that some were defended by pairs and others by trios (i.e., two or three birds). Although my main goal was only to report the territory sizes, I have been thinking about running a ‘simple’ model in brms of the form:
library(brms)
model <- brm(territory_size ~ 1 + number_of_birds,
             data = df,
             family = gaussian(),
             iter = 40000, warmup = 20000, chains = 4, cores = 4)
However, I have prior information on the territory sizes of two closely related species (call them B and C), and I want to include it in the model. The issue is how to go about it. Other studies have reported average territory sizes of 13.3 km^2 and 42.5 km^2 for species B and a median territory size of 19.5 km^2 for species C. Based on this, I decided to center the prior on a value in between the reported ones, 21.0 km^2, with a standard deviation wide enough to allow a range of values, 8.0 km^2. The problem I have is how to include this in the previous model. So far my approach has been to run:
intercept_prior <- prior(normal(21000, 8000), class = "Intercept") # values in square meters
model <- brm(territory_size ~ 1 + number_of_birds,
             data = df,
             family = gaussian(),
             prior = intercept_prior, # without this argument brm() uses its default priors
             iter = 40000, warmup = 20000, chains = 4, cores = 4)
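The model I intend to fit can be written as follows. The symbols are my own notation, and one point is an assumption on my part: as far as I know, brms applies a prior of class "Intercept" to the intercept after centering the predictors, so $\alpha$ below is the expected territory size at the average number_of_birds rather than at zero birds.

```latex
\begin{aligned}
\text{territory\_size}_i &\sim \mathrm{Normal}(\mu_i, \sigma) \\
\mu_i &= \alpha + \beta \,\bigl(\text{number\_of\_birds}_i - \overline{\text{number\_of\_birds}}\bigr) \\
\alpha &\sim \mathrm{Normal}(21000,\ 8000)
\end{aligned}
```

Here $\beta$ and $\sigma$ are left at whatever defaults brms assigns, since I have not set priors for them.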
I ran prior predictive checks on this and they made sense. The model runs just fine. For practical purposes, this model leads me to the same conclusion I would reach with a flat prior: the difference in territory size between pairs and trios is close to zero.
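To show what the intercept prior implies by itself, here is a minimal base R sketch on the km^2 scale used in the text (mean 21, SD 8). It is not the full prior predictive check, only the marginal prior on the intercept, and the labels B_low, B_high and C_median are names I made up for the three reported values.

```r
# Intercept prior from the text, on the km^2 scale: normal(21, 8).
prior_mean <- 21
prior_sd <- 8

# Central 95% interval implied by the prior, rounded to one decimal.
prior_interval <- round(qnorm(c(0.025, 0.975), mean = prior_mean, sd = prior_sd), 1)

# Prior mass placed on impossible (negative) territory sizes.
prob_negative <- pnorm(0, mean = prior_mean, sd = prior_sd)

# Reported values for species B (two means) and species C (median), in km^2.
# The element names are hypothetical labels, not taken from the studies.
reported <- c(B_low = 13.3, B_high = 42.5, C_median = 19.5)

# TRUE where a reported value falls inside the central 95% prior interval.
inside <- reported >= prior_interval[1] & reported <= prior_interval[2]

print(prior_interval)
print(prob_negative)
print(inside)
```

With these values the interval runs from about 5.3 to 36.7 km^2, so the 42.5 km^2 figure for species B falls outside it, which is part of why I am unsure about my choice of standard deviation.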
My question to you is: Is this a correct way to include prior information in the model?
And a side question: would you suggest anything about how to choose the values for the prior? I know there is no such thing as ‘the correct prior’, but as of now choosing values seems to be a mix of domain knowledge and a hunch.
This is the data:
territories_areas_feb10.csv (435 Bytes)
Any help would be greatly appreciated. Any comment on a flagrant mistake in my model or reasoning is also very welcome.