I am working on Nephilingis cruentata, a spider with extreme sexual size dimorphism. I track the carapace width (CW) of individual spiders throughout their development, from shortly after hatching until adulthood. Since spider growth is discrete, I measure CW after each molting. Consequently, I have individual growth histories for 20 spiders per family across approximately 13 families. Each family is determined by the mother (dam).
In addition to carapace width and family, I also track the following variables:
Time of CW measurement - measure.date
Sex of the individual (determined partway through the process, so this information is not yet available for all individuals) - sex
Father (sire) of each individual (like the dam, it is the same for the entire family) - sire
Hatching date (hatch.date), from which I calculate the age of the spiders at each molting - age.measure
Developmental stage (juvenile, subadult, or adult) - status
Other parameters of lesser importance at this stage
I aim to create an MCMCglmm model of these growth curves, separated by sex, to analyze how growth patterns differ between sexes. Specifically, I want to determine if related spiders of each sex exhibit significantly similar growth patterns (e.g., growth speed, time between moltings, and changes in carapace width with age).
I have attached some plots to better illustrate the phenomenon, along with a small portion of my dataset to show how the data are organized. The data are in a long-format table where each carapace measurement has its own line, so there is more than one line for each individual spider. The spider identifier is called animal and is the first column in the dataset.
I would greatly appreciate any assistance with creating the MCMCglmm model for this task, or suggestions for alternative approaches that might be suitable. I am relatively new to coding and advanced statistical data analysis, so any help would be invaluable.
Welcome to the Stan forums! You may or may not find users with experience with MCMCglmm here. However, fitting your model with Stan will likely be quicker/more efficient and users here can assist with that.
If that interests you, I suggest taking a look at the different Stan interfaces and seeing which best suits your needs and programming experience: e.g. cmdstan or rstan, which have you write your own Stan code, or brms, which automatically generates Stan programs via a formula interface similar to other multilevel packages in R.
Thank you very much for your reply and suggestions. I am not strictly tied to MCMCglmm, so using the Stan interface, like brms, would work just as well, as long as it accomplishes the task.
I am definitely interested in exploring any possible solutions using those packages. Do you have any ideas on how to approach my specific data and address the problem? One issue is that my growth paths are not linear. Even if I log-transform both age and carapace width, or just one of these parameters, linearization isn’t really suitable for linear models.
I would suggest fitting a simple linear growth model first so that you can get a feel for how the Stan "verse" works w.r.t. model specification, extracting results/plotting, model checking, etc.
When you say nonlinear, what do you mean exactly? If the growth is still smooth, you can probably just specify a spline or some other additive model (which I think brms has syntax for). If the growth happens in jumps at a discrete number of time points, you might need to get more creative, particularly if the jump points are not known and/or vary between individuals (something like a compound Poisson process, perhaps with time-varying intensity, comes to mind).
Okay, I will try that today and see how it goes. If I encounter any issues, I will reach out to you and the community again. Hopefully, I can eventually achieve some useful results.
By nonlinear, I mean exactly what you mentioned at the end of your reply—the growth occurs in jumps, so linear modeling might not be ideal. However, to start, I think I can try using a linear model at least at the population level. From there, I can move towards more complex individual growth modeling. Does that sound like a reasonable approach?
Yeah, the general idea is that even though you are fitting a model you know isn’t correct you’ll be able to see where/why it fails and be able to expand it as necessary.
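As a rough illustration of that idea, here is a minimal Python sketch using only the standard library. The data are synthetic and purely hypothetical (a fixed proportional increase in CW at each of eight evenly spaced molts), not values from the spider dataset, and `fit_line` is a helper written for this example. It fits a straight line by ordinary least squares and prints the residuals, whose pattern shows where the linear model fails.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic, hypothetical growth history: one measurement per molt.
# Assumption: CW grows by a fixed factor at each molt, and molts are evenly spaced.
ages = [10.0 * k for k in range(8)]
widths = [0.5 * 1.3 ** k for k in range(8)]

slope, intercept = fit_line(ages, widths)
residuals = [w - (intercept + slope * a) for a, w in zip(ages, widths)]

for a, r in zip(ages, residuals):
    print(f"age {a:5.1f}  residual {r:+.3f}")
```

The residuals are positive at both ends and negative in the middle. That systematic curvature is the kind of signal that tells you which part of the model to extend next.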
With regard to the individual growth trajectories, provided that the moltings don’t occur at fixed times, the simplest model might be a compound Poisson process where the moltings occur randomly at times following a Poisson process with intensity parameter $\lambda$ and the amount of growth at each molting follows some positive distribution, say $\text{lognormal}(\mu, \sigma)$. These parameters could then be incorporated in the multilevel structure. Additionally, you could incorporate individual covariates and time. E.g. allowing $\lambda$ and/or $\mu$ to vary with time would allow for an increase/decrease in molting frequency/growth amounts as the subjects age, and including sex as a covariate would allow for different growth patterns for males/females.
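Written out for individual $i$ at age $t$, a minimal version of this model with constant intensity might look as follows. The symbols $N_i(t)$ (number of moltings up to age $t$), $G_{ik}$ (growth at the $k$-th molting) and $CW_i(0)$ (carapace width at hatching) are notation introduced here for illustration only.

$$
\begin{aligned}
N_i(t) &\sim \text{Poisson}(\lambda t) \\
G_{ik} &\sim \text{lognormal}(\mu, \sigma) \\
CW_i(t) &= CW_i(0) + \sum_{k=1}^{N_i(t)} G_{ik}
\end{aligned}
$$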
Fitting the model above is quite a bit more complicated than a simple multilevel linear growth model but is probably still quite doable.
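Before fitting anything, it can help to simulate from the compound Poisson process to see what kind of trajectories it produces. The following is a minimal Python sketch using only the standard library. The function name `simulate_growth` and every numeric value are hypothetical placeholders, not estimates from the spider data, and the intensity is held constant for simplicity.

```python
import math
import random

def simulate_growth(rate, mu, sigma, start_width, max_age, rng):
    """Simulate one individual's CW under a compound Poisson process.

    Returns a list of (age, width) pairs: the starting point at age 0,
    followed by one pair per simulated molting up to max_age.
    """
    age, width = 0.0, start_width
    history = [(age, width)]
    while True:
        # Waiting times between events of a homogeneous Poisson process
        # are exponential with the same rate.
        age += rng.expovariate(rate)
        if age > max_age:
            break
        # The growth increment at each molting is lognormal, so always positive.
        width += rng.lognormvariate(mu, sigma)
        history.append((age, width))
    return history

rng = random.Random(1)
# Illustrative placeholder values only (assumptions, not fitted quantities).
trajectory = simulate_growth(
    rate=0.05, mu=math.log(0.3), sigma=0.25,
    start_width=0.5, max_age=200.0, rng=rng,
)
for age, width in trajectory:
    print(f"age {age:6.1f}  CW {width:.2f}")
```

Letting `rate` or `mu` depend on age or sex inside the loop would give the time-varying and sex-specific variants described above.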
Thank you very much again. It’s really helpful to have someone who can provide useful tips and instructions. I truly appreciate it.
I have one more question. I now understand most of the tasks I need to complete to build a useful model. However, I am still struggling with setting informative priors. Based on the scatterplots I posted in my additional request for help, how can I set priors for the male and female populations?
I am honestly quite lost on this topic and don’t know where or how to start. Without appropriate priors, I can’t begin any modeling, so I would greatly appreciate some practical guidance on how to approach this with my particular data.
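One common way to approach this is a prior predictive check: draw parameters from candidate priors, simulate growth from the model, and see whether the simulated sizes fall in a range that looks plausible next to the scatterplots. The sketch below is a minimal Python illustration of that idea for the compound Poisson model discussed earlier. The prior distributions, their parameters, and the helper names `final_width` and `prior_predictive` are all hypothetical placeholders to be replaced with values that suit the actual data and units; in practice this would be done separately for males and females.

```python
import math
import random

def final_width(rate, mu, sigma, start_width, max_age, rng):
    """Simulate CW at max_age under a compound Poisson growth process."""
    age, width = 0.0, start_width
    while True:
        age += rng.expovariate(rate)        # waiting time to the next molting
        if age > max_age:
            return width
        width += rng.lognormvariate(mu, sigma)  # positive growth increment

def prior_predictive(n_draws, rng):
    """Draw parameters from candidate priors and simulate one final CW each."""
    widths = []
    for _ in range(n_draws):
        # Candidate priors: placeholders, not recommendations for this dataset.
        rate = rng.lognormvariate(math.log(0.05), 0.5)  # molting intensity
        mu = rng.gauss(math.log(0.3), 0.5)              # log-scale mean growth
        sigma = abs(rng.gauss(0.0, 0.5))                # half-normal spread
        widths.append(final_width(rate, mu, sigma, 0.5, 200.0, rng))
    return sorted(widths)

rng = random.Random(2)
draws = prior_predictive(1000, rng)
low, median, high = draws[25], draws[500], draws[975]
print(f"simulated final CW: 2.5% {low:.2f}, median {median:.2f}, 97.5% {high:.2f}")
```

If the simulated range is far wider or narrower than anything biologically reasonable for that sex, the priors can be tightened or loosened and the check repeated before any model is fitted to the data.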