Modeling C3 photosynthesis in BRMS


I am a plant ecologist and I am working with process based models. One of the aspects of process based models is that they are informed with parameters coming from empirical data. In my approach I am using brms hierarchical models to obtain a posterior distribution of each parameter and then inform the process based model by sampling from those distributions. However some parameters for the process based models are obtained by fitting non-linear equations. For instance, the parameters b and c from a weibull function to characterize plant’s xylem vulnerability to embolism:

K=K_{max}*e^-\left ( \frac{P}{b} \right )^{c}

# Fit the model in which the parameter(s) we are interested change as a function of elevation----
weibull_ML_formula = bf(formula = PLC ~ (100*(1-exp(-((-WP/b)^c)))),
                        b + c ~ 1  + (1|plot|elev_band),nl = TRUE)

# Prior Specification
weibull_ML_prior_pipo = c(set_prior(prior = "normal(3.8,0.3)",class = "b",nlpar = "b",lb = 2),#,lb = 1.8),#prior on PIPO b
                          set_prior(prior = "normal(2.5,0.2)",class = "b",nlpar = "c",lb = 1.2),#,lb = 1),
                          set_prior(prior = "normal(0,0.3)",class = "sd",coef ="Intercept",group = "elev_band",nlpar = "b"),
                          set_prior(prior = "normal(0,0.01)",class = "sd",coef ="Intercept",group = "elev_band",nlpar = "c"))#,lb = 1))

# Initial values
inits_l = list(inits = list(b_Intercept = 3.5, c_Intercept = 2,sd_elev_band__b_Intercept=0,sd_elev_band__c_Intercept=0.05),
               inits = list(b_Intercept = 3.5, c_Intercept = 2,sd_elev_band__b_Intercept=0,sd_elev_band__c_Intercept=0.05),
               inits = list(b_Intercept = 3.5, c_Intercept = 2,sd_elev_band__b_Intercept=0,sd_elev_band__c_Intercept=0.05),
               inits = list(b_Intercept = 3.5, c_Intercept = 2,sd_elev_band__b_Intercept=0,sd_elev_band__c_Intercept=0.05))

# Fit model
vc_fits_pipo <- brm(formula = weibull_ML_formula,
               data = vc_md[vc_md$SpeciesID %in% "PIPO",],family = gaussian(),
               prior = weibull_ML_prior_pipo,
               inits = inits_l,
               iter = 30000,
               warmup = 10000,
               thin = 3,
               cores = n_cores)

This model fits great, and the results are consistent with other approaches to fit the same function in a frequentist framework using non-linear squares. A more complicated non-linear model is the one describing photosynthesis in C3 plants. This model can be described by a set of non-linear equations used to predict carbon assimilation and then a function that minimizes the difference between observed and predicted carbon assimilation:

A_{n} = min\left ( A_{c}, A_{j} \right )-R_{d}

A_{c} = V_{cmax}\cdot \left [ \frac{C_{c}-\Gamma^{*}}{C_{c}+K_{c}\cdot \left ( 1+O/K_{o} \right )} \right ]-R_{d}

A_{j} = J\cdot \left [ \frac{C_{c}-\Gamma^{*}}{4C_{c}+8\Gamma^{*}} \right ]-R_{d}

C_{c} = C_{i}-\frac{A_{n}}{g_{m}}

Measured data:
A_{n}: net carbon assimilation rates
C_{i}: carbon dioxide concentration in the leaf

K_{c}: Michaelis constant of Rubisco for carbon 27.238
O: Partial pressure of oxygen at Rubisco 21 at sea level
K_{o}: inhibition constant of Rubisco for oxygen 16.582
g_{m}: leaf resistance to carbon diffusion, can be measured but is also set constant to 0.3
\Gamma^{*}: is the concentration at that oxygenation is twice the rate of carboxilation, in other words CO2 compensation point 3.743

Non-linear parameters to be estimated:
R_{d}: mitochondrial respiration
V_{cmax}: Rubisco’s maximum rate of carboxilation
J: electron transport rate

The one challenge is that the non-linear functions are fitted in two different areas of the curve between A_{n} and C_{i}. This transition point used to be arbitrarily picked ( Nowadays there is an R package that gives the option to automate this process (Plantecophys - An R Package for Analysing and Modelling Leaf Gas Exchange Data). I was wondering if fitting this kind of model would be possible at all using brms, as having the posterior distribution while accounting for the hierarchical structure of the data will be powerful. Otherwise I will just fit the curves and then model the non-linear parameters using a multilevel model. Some approaches have been developed in JAGS, but incorporating multiple levels is really challenging in JAGS. Attached you might find data on photosynthetic responses to Ci. The variables needed to fit the model are:
A_{n}: Photo
C_{i}: Ci_Pa
level 1: indiv
level 2: species

This is my first time posting a topic, apologize if I didn’t do it properly.


ACI_data_example.csv (50.7 KB)


I’m not a brms user, so someone who is can correct me if I say anything wrong, but brms is an interface for fitting linear models that mimics R’s lme4 package – although it allows some non-linear extensions it is not as general as explicit Stan language implementations.
The modern equivalent of BUGS or JAGS is Stan (as well as any other statistical modeling software that allows implementation of [nearly-]arbitrary models), so if you are trying to use it with Stan-based implementations the way to go is Stan language itself.

1 Like

@caesoma agreed, I’ve found the nonlinear syntax in brms useful for simple nonlinear models that are expressed in a single equation, a recent example for us was an allometric scaling model. This type of model with a system of nonlinear equations won’t be compatible with brms to my knowledge.

That being said @gevargu it might be feasible to use brms to obtain some example Stan code for components of this model, that you can then modify and build on. Nonlinear multilevel models like this are fairly accessible to write in Stan, the examples I’m more familiar with are population pharmacokinetic models.

1 Like

Disclaimer (again): I probably only used brms once or twice for some case study and out of curiosity, and lme4 and equivalents a few times, so don’t take my word for it as any absolute statement about what brms can or cannot do.

I agree that these should be some of the more abundant and well documented examples (with a dedicated library), but there should also be COVID-inspired epidemic SIR-like models by the dozen, although their implementations and modeling soundness probably varies since it attracted researchers with very different expertises and different reserach focus.

Sorry if I was unclear, I’ve used it quite a lot recently and I’m pretty confident that models like the OP example would be really tricky to make with brms if possible at all. I’d say definitely a better candidate to work directly in Stan.


@AWoodward and @caesoma , I think you both are right as this system of NL equations could be very tricky to fit given the structure of brms. I don’t know how to use STAN, only JAGS. But it is definitely part of my bucket list. Thanks for the insight.

1 Like