Non-linear modelling - Michaelis-Menten

prototaxites · May 27, 2021, 2:47pm

Hi all,

I’m quite new to Bayesian modelling, and have spent the past couple of months working through Statistical Rethinking. I’ve also read Bayesian Models by Hobbs and Hooten, which has really inspired me to go a bit beyond GLMMs and think a bit more about the processes that underlie the data generation process. I thought it might be fun to apply some of what I’ve learned so far to one of my own datasets, hopefully cementing some of what I’ve picked up, but I’m running into all sorts of fun new problems.

The dataset in question contains data on the mycorrhizal fungal species richness of pine seedlings. Seedlings from each of 12 genotypes (“families”) were grown in the field at each of 18 locations (“grids”), with seedlings from each family replicated 3 - 5 times at each location. Seedlings were collected after three months and the species richness of fungi was counted. Each fungal species occupies a single root tip, so we also counted the total number of root tips on each seedling (this varies quite a lot from 0 to ~500). The question we would like to answer is if pine genotypes differ in their fungal species richness.

When I worked on this data during my PhD, I modelled this using a GLMM, which gave very lacklustre results. Thinking about the data process a bit more, however, species richness is likely to saturate as the number of root tips increases, as a seedling samples fungal species from the available pool at any given location. I thought it might be fun to try and model this using a Michaelis-Menten equation, modelling V_{max} (the saturation point of the curve) as a function of genotype and location. However, I’m running into some issues fitting the model and I’m not quite sure where I’ve gone wrong.

The model I’m trying to implement is:

\begin{align} R_{ifg} &\sim Poisson(\lambda_{fg}) \\ \lambda_{fg} &= \frac{(V_{max} \times tips)}{k + tips} \\ V_{max} &= Family_f * Grid_g \\ k &\sim Lognormal(1, 1) \\ Family_f &\sim Lognormal(\mu_f, \sigma_f) \\ Grid_g &\sim Lognormal(\mu_g, \sigma_g) \\ \mu_f &\sim N(1, 1) \\ \mu_g &\sim N(1, 1) \\ \sigma_f &\sim Exponential(1)\\ \sigma_g &\sim Exponential(1)\\ \end{align}

I modelled V_{max} as the product of the Family and Grid parameters, with the idea being that the Grid parameter captures the maximum species richness at a location, but that different families may capture different proportions of this richness (if there are any differences). My understanding is that as long as both $V_{max} and k are positive, then \lambda is positive and thus I don’t need a log-link. To ensure that, I specified Lognormal distributions for Family, Grid and k, and set lower bounds for these values at 0.0001.

I have implemented this model as follows, using ulam in the rethinking package:

SeedMod <- ulam(
  alist(
    richness ~ dpois(lambda),
    lambda <- (vmax*tips) / (k + tips),
    vmax <- fam[family]*gr[grid],
    k ~ dlnorm(1, 1),
    fam[family] ~ dlnorm(fam_a, fam_sig),
    gr[grid] ~ dlnorm(grid_a, grid_sig),
    fam_a ~ dnorm(1, 1),
    fam_sig ~ dexp(1),
    grid_a ~ dnorm(1, 1),
    grid_sig ~ dexp(1)
  ), data = with(fungi_data, list(richness = Richness,
                                            family = as.integer(as.factor(Family)),
                                            grid = as.integer(as.factor(Grid)),
                                            tips = Tot_tips/max(Tot_tips))),
  cores = 4, chains = 4, log_lik = TRUE, cmdstan = TRUE,
  constraints = list(fam="lower=0.0001", gr="lower=0.0001", k = "lower=0.0001"),
)

Which produces the following Stan code:

data{
    int richness[673];
    vector[673] tips;
    int grid[673];
    int family[673];
}
parameters{
    real<lower=0.0001> k;
    vector<lower=0.0001>[12] fam;
    vector<lower=0.0001>[18] gr;
    real<lower=0> fam_a;
    real<lower=0> fam_sig;
    real<lower=0> grid_a;
    real<lower=0> grid_sig;
}
model{
    vector[673] lambda;
    vector[673] vmax;
    grid_sig ~ exponential( 1 );
    grid_a ~ lognormal( 1 , 1 );
    fam_sig ~ exponential( 1 );
    fam_a ~ lognormal( 1 , 1 );
    gr ~ lognormal( grid_a , grid_sig );
    fam ~ lognormal( fam_a , fam_sig );
    k ~ lognormal( 1 , 1 );
    for ( i in 1:673 ) {
        vmax[i] = fam[family[i]] * gr[grid[i]];
    }
    for ( i in 1:673 ) {
        lambda[i] = (vmax[i] * tips[i])/(k + tips[i]);
    }
    richness ~ poisson( lambda );
}
generated quantities{
    vector[673] log_lik;
    vector[673] lambda;
    vector[673] vmax;
    for ( i in 1:673 ) {
        vmax[i] = fam[family[i]] * gr[grid[i]];
    }
    for ( i in 1:673 ) {
        lambda[i] = (vmax[i] * tips[i])/(k + tips[i]);
    }
    for ( i in 1:673 ) log_lik[i] = poisson_lpmf( richness[i] | lambda[i] );
}

However, the model fails - although it compiles fine, it fails to find starting values for all chains, and just dies:

Chain 4 Rejecting initial value:
Chain 4   Gradient evaluated at the initial value is not finite.
Chain 4   Stan can't start sampling from this initial value.
Chain 4 Initialization between (-2, 2) failed after 100 attempts. 
Chain 4  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
Chain 4 Initialization failed.
Warning: Chain 4 finished unexpectedly!

Where have I gone wrong here? The model samples OK if I include a log-link (with about 1% of transitions divergent), but dies completely without it, which suggests to me that it’s trying out impossible values for \lambda. However, from my understanding, with the bounds I specified, this shouldn’t happen? Any insight anyone might have would be really invaluable!

jsocolar · May 27, 2021, 3:52pm

Can multiple fungus species occupy a single root tip, or is it the case that each tip will have either zero or one species?

prototaxites · May 27, 2021, 3:54pm

Technically, yes- you an get multiple fungi on a single root, but for the purposes of the model (and with the limitations of our identification process!) there is only one species per root tip. Root tips also don’t necessarily have to have any fungi on them at all.

jsocolar · May 27, 2021, 4:05pm

Are there any data points with 0 tips but nonzero richness?

prototaxites · May 27, 2021, 4:07pm

None!

jsocolar · May 27, 2021, 4:11pm

Try (if you haven’t already) removing points with zero tips and zero richness (they aren’t informative anyway, and I think they might be producing infinite gradients).

prototaxites · May 27, 2021, 4:20pm

That worked a treat - I get samples now! Still getting quite a few divergent transitions (may need to think about the parameterisation here), but at least it runs.

Can I ask how these data points would produce infinite gradients?

jsocolar · May 27, 2021, 5:07pm

I’m not entirely sure, but my guess is that some of the functions used in calculating the Poisson log probability mass function (lpmf) do weird things if lambda is zero. Note that in most applications, lambda will be declared with a lower-bound constraint of zero, and so it will never actually reach a literal zero during sampling.

Stan doesn’t “understand” that lambda[i] is a constant when tips[i] is zero; it still takes a gradient of the lpmf with respect to lambda. My guess is that even though the gradient of the Poisson lpmf for an observed zero with respect to lambda is finite at lambda = 0, the implementation that Stan uses isn’t designed to handle the case where lambda literally reaches zero.

I don’t really know C++ and so I can’t take the time now to really understand the code, but my best guess is that the multiply_log function here goes haywire or is discontinuous when given zero as an argument.

github.com

stan-dev/math/blob/92075708b1d1796eb82e3b284cd11e544433518e/stan/math/prim/prob/poisson_lpmf.hpp

#ifndef STAN_MATH_PRIM_PROB_POISSON_LPMF_HPP
#define STAN_MATH_PRIM_PROB_POISSON_LPMF_HPP

#include <stan/math/prim/meta.hpp>
#include <stan/math/prim/err.hpp>
#include <stan/math/prim/fun/as_column_vector_or_scalar.hpp>
#include <stan/math/prim/fun/as_array_or_scalar.hpp>
#include <stan/math/prim/fun/as_value_column_array_or_scalar.hpp>
#include <stan/math/prim/fun/constants.hpp>
#include <stan/math/prim/fun/is_inf.hpp>
#include <stan/math/prim/fun/lgamma.hpp>
#include <stan/math/prim/fun/max_size.hpp>
#include <stan/math/prim/fun/multiply_log.hpp>
#include <stan/math/prim/fun/promote_scalar.hpp>
#include <stan/math/prim/fun/scalar_seq_view.hpp>
#include <stan/math/prim/fun/size.hpp>
#include <stan/math/prim/fun/size_zero.hpp>
#include <stan/math/prim/fun/sum.hpp>
#include <stan/math/prim/fun/value_of.hpp>
#include <stan/math/prim/functor/operands_and_partials.hpp>

This file has been truncated. show original

Edit:
Here’s the c++ for multiply_log, which suggests that the above explanation is probably the correct one:

github.com

stan-dev/math/blob/92075708b1d1796eb82e3b284cd11e544433518e/stan/math/rev/fun/multiply_log.hpp

#ifndef STAN_MATH_REV_FUN_MULTIPLY_LOG_HPP
#define STAN_MATH_REV_FUN_MULTIPLY_LOG_HPP

#include <stan/math/rev/meta.hpp>
#include <stan/math/rev/core.hpp>
#include <stan/math/rev/fun/log.hpp>
#include <stan/math/rev/fun/elt_multiply.hpp>
#include <stan/math/rev/fun/multiply.hpp>
#include <stan/math/prim/fun/constants.hpp>
#include <stan/math/prim/fun/multiply_log.hpp>
#include <stan/math/prim/fun/is_any_nan.hpp>
#include <cmath>

namespace stan {
namespace math {

namespace internal {
class multiply_log_vv_vari : public op_vv_vari {
 public:
  multiply_log_vv_vari(vari* avi, vari* bvi)

This file has been truncated. show original

prototaxites · May 28, 2021, 9:12am

Thanks, that makes sense!

Thanks for all your help, with a bit more finagling I have a model that works… okay-ish! Certainly well enough for me to move on from this toy example. Confirms the expectation that there is no evidence for a genotype effect, so I’m happy.

nhuurre · May 28, 2021, 9:34am

Strange, because multiply_log() exists specifically to avoid problems with 0\times \log 0. You’re right though, the gradient divides by zero when it shouldn’t

github.com

stan-dev/math/blob/92075708b1d1796eb82e3b284cd11e544433518e/stan/math/rev/fun/multiply_log.hpp#L37-L42


      
          class multiply_log_dv_vari : public op_dv_vari {
           public:
            multiply_log_dv_vari(double a, vari* bvi)
                : op_dv_vari(multiply_log(a, bvi->val_), a, bvi) {}
            void chain() { bvi_->adj_ += adj_ * ad_ / bvi_->val_; }
          };

jsocolar · May 28, 2021, 3:30pm

Does this look like a bug, or just an unfortunate limitation of the implementation?

nhuurre · May 28, 2021, 3:35pm

It’s a bug. The gradient is uniformly zero when a is constant and zero.
We should just special-case a=0.0 like we do with a=1.0

github.com

stan-dev/math/blob/92075708b1d1796eb82e3b284cd11e544433518e/stan/math/rev/fun/multiply_log.hpp#L83-L86

    
      
          if (a == 1.0) {
            return log(b);
          }
          return var(new internal::multiply_log_dv_vari(a, b.vi_));

jsocolar · May 28, 2021, 3:37pm

I’ll open an issue

Edit:

jsocolar · May 28, 2021, 3:49pm

@prototaxites really appreciate your curiosity here! It led somewhere useful.

Topic		Replies	Views
Non-linear growth models: Addressing Parameter Underestimation in Limited Data with Nonlinear Models and Bayesian Hierarchical Approaches using STAN applied in Aquaculture Modeling rstan , fitting-issues , brms	0	364	May 18, 2024
Model comparison methods General ecology	3	5276	November 26, 2017
Who tells the truth: GLMM or Bayesian? Modeling	5	808	September 30, 2018
Blog posts on ecological modeling in Stan Modeling	6	252	February 11, 2025
Random effect in the model Modeling	2	376	August 29, 2020

Non-linear modelling - Michaelis-Menten

Related topics