Implementing phylogenetic logistic regression for binary response variable in Stan

dbuona · March 19, 2019, 1:07pm

Greetings,
I’m a rather new Stan user and I’m working on building a Bernoulli logistic regression model that corrects for phylogenetic correlation. For reference, I am basing this on the phylogenetic logistic regression models of Garland and Ives (2010) and the R function phyloglm() in the package phylolm.

My basic thought is to modify the Bernoulli example model code from github (reposted below) to included additional independent variables and a variance/covariance matrix to account fo phylogeny. My sense is that this would have to be a hierarchical model in which theta itself is modeled with the independent variables and the phylogenetic correlation matrix, but I am not sure how to begin implementing this.

data { 
  int<lower=0> N; 
  int<lower=0,upper=1> y[N];
} 
parameters {
  real<lower=0,upper=1> theta;
} 
model {
  theta ~ beta(1,1);
  for (n in 1:N) 
    y[n] ~ bernoulli(theta);
}

Further, Garland and Ives (2014) developed a Bayesian version of their phylogenetic GLM for MCMCglm, with the following formula:

01%20AM
The random variable u represents residual variation, and s is a random effect that captures hypothesized covariances in the data. The link to the book chapter that presents this equation in more detail can be found here.

I am wondering:

Has anyone done this kind of modeling before in Stan and would be willing to share some example code?

Does anyone have suggestions about how to implement the above mentioned formula from Garland and Ives in Stan?

Thanks!

Ara_Winter · March 19, 2019, 8:35pm

There is a phylogenetic regression example here:

gist.github.com

https://gist.github.com/rmaia/5936586

phylostan_Ktrees.R

require(rstan)
require(geiger)
require(MCMCglmm)

# load data
data(geospiza)
dat <- geospiza$geospiza.data

# create fake sample of trees
tr <- drop.tip(geospiza$geospiza.tree, 'olivacea')

This file has been truncated. show original

and I swear I’ve seen a few others. Let me dig around in my stan models folder.

dbuona · March 26, 2019, 6:55pm

Thanks for sending this along-- definitely a helpful example. What I am a bit stuck on how to modify such a model for a dataset with a binary response variable. If you, or anyone out there, has any ideas or examples in that realm I would greatly appreciate it.

Ara_Winter · March 26, 2019, 8:44pm

Your y (response) would be something like this:
int<lower=0,upper=1> y[N]

in the data section (totally the easy answer here ;) ).

I’d need to think about the model section. We need to roll in, maybe the bernoulli_logit ? I bet there is some elegant way to do this.

Ara_Winter · March 26, 2019, 8:56pm

So this discusses

the binomial models in phylo regressions including an example using MCMCglmm. It looks likes they recommend probit instead of logit.

Ara_Winter · March 27, 2019, 1:04am

Oppps you already have that same source! I’ll have some time tomorrow so I can try and fiddle with the binomial model

Topic		Replies	Views
Including Phylogenetic Information in a Bayesian Hierarchical Model for Predicting Species Trait Modeling cmdstan , rstan , brms	1	81	September 19, 2024
Phylogenetic regression in Stan: can this model be improved? Modeling	5	112	October 3, 2024
Phylogenetic signal for bernoulli families by brms brms fitting-issues , specification	13	1333	December 4, 2020
Multivariate logistic model and Cholesky correlation matrix Modeling rstan , techniques	3	216	December 3, 2024
Improving efficiency for dynamic coevolutionary Stan model Modeling	12	119	August 8, 2024

Implementing phylogenetic logistic regression for binary response variable in Stan

Related topics