Mixed effects von Mises models using a unit vector?

Dear Stan community,

BACKGROUND: I have been modelling animal behaviour with circular outcome data according to a von Mises distribution using brms. I have been using multilevel models, expressed using brms to implement this. One difficulty I have encountered is chains iterating around the circle, reaching angles like -7pi (-22) before converging on the same answer it could have found several rotations earlier. This is particularly problematic when mean angles are close to pi / - pi. I’ve managed to solve this problem for fixed-effects models using the unit vector to model circular distributions in rstan, as per Von_mises documentation suggestion , which did well for angles near pi (which brms performs poorly at), but otherwise gives the same answer as brms. However, I would like to construct more complex models and implement group-level effects.

For example,Mixed Effects Circular Statistics using BRMS.pdf (881.7 KB)

I (i) generated 2000 simulated observations in two groups, one with a circular mean close to 0 and one with a circular mean close to pi,
(ii) implemented a ‘mixed’ model in brms incorporating group level effects.

The brms model incorporating population and a very limited set of individual level effects failed to extract the correct mean angles from a dataset. In a variation of the model I used priors that restrict estimation to (-pi,pi), but that doesn’t solve the fundamental problem that angles are not appropriate for estimation in the linear domain.

QUERY: Would it be possible to estimate the mean angle as a unit vector in brms, then convert that back into an angle to calculate the lpdf?

  • Operating System: macOS High Sierra 10.13.6
  • brms Version: 2.9.0
1 Like

What you can do it creating a custom family (see ?custom_family) and use the code that Bob posted in the issue you linked to. That way you can use the unit vector formulation in brms without me having to implemented a native family for this special purpose in brms.

Thanks for the speedy reply.
Good idea, I’ll look into that and post up the implementation if I get it working.

TLDR

  1. I have working stan code, but would like to be sure that it is actually modelling: mu ~ 1 + (1 | animal) on a circle, as well as log(kappa) ~ 1 + (1 | animal).
  2. I would like to know how to make a more general family of von Mises models in brms where mu is estimated as a unit_vector, using custom_family, stanvars etc.

Details
I’ve now had some success with implementing the unit_vector transform as a link function for mixed-effects models using the von Mises distribution.


Unfortunately, I just haven’t been able to get my head around post processing the Stan model, or the appropriate syntax for adding this transform to a brms custom_family. I’ve tried reverse engineering brms generated code handwritten.uvMEmises1.stan (3.2 KB) in the hope that I can work out how to define the custom_family. The main problem seems to be that if I follow the original post faithfully, then this requires the introduction of a new parameter to be estimated within the main modelling loop, not just a new link function and likelihood function. Any advice would be much appreciated!

In the original post the lpdf was calculated for mu_vec as well as raw angles (should be the same I guess?), so I’ve added this into the optimisation loop:

  // likelihood including all constants
    if (!prior_only) {
      for (n in 1:N) {
        target += von_mises_lpdf(Y[n] | mu[n], kappa[n]);
        //add lpdf from the unit_vector estimate
        //I am not sure how Stan updates mu_vec
        target += von_mises_unitvector_lpdf(mu_vec | mu[n], kappa[n]);
      }
    }

Where von_mises_unitvector_lpdf is the unit_vector version of the default brms von_mises_real_lpdf (which I was able to insert as a stavar).
Using custom_families I can only insert this into the modelling block as a separate loop outside of the main modelling loop unit vector MEvonMises brms attempt.R (7.7 KB) , since stanvars inserts code between the parameter declaration and modelling block by default. This is particularly problematic when kappa is estimated using the log link (which is generally a good idea) since this link is applied below the unit_vector loop I can insert.

It would be great to have a brms method for modelling this for any number of animals and fixed effects. My colleagues and I work with this kind of dataset a lot and I would be a lot more comfortable using brms for fitting and post-processing.

I am somewhat confused. Isn’t von_mises_unitvector_lpdf supposed to replace von_mises_lpdf? Why are both present in the above model? I don’t yet fully understand the unit vector model, so it may make sense if you write down the model in mathematical notation and perhaps I can then give you a more helpful answer. In any case, for use with the custom_family feature, it would be ideal if everything would be happening in the single _lpdf function that is named after the family and whose code is provided in the function block via stanvars.

My apologies. You are right to be confused, I had misinterpreted the original post and the code above is not appropriate. von_mises_unitvector_lpdf should actually be used as a prior on mean angle outside of the if(!prior_only){ statement.

//e.g. prior that mu = 0°
target += von_mises_unitvector_lpdf(mu_vec| 0, 1);

von_mises_lpdf was actually still used in the original post, but with an estimate of mu that comes from:
real temp_Intercept = atan2(mu_vec[2], mu_vec[1]);
in the transformed parameters block.

The mu_vec replacement of temp_Intercept remains the one thing that I can’t get into a brms custom_family. Reading the rest of the original thread it seems there may be good reasons not to allow distributional parameters to be overwritten/replaced by transformed parameters, but I’m afraid this discussion goes over my head.

To keep this a brms question, I suppose my question should now be:

“Is it permissible (or wise) to make a brms custom_family in which parameters are estimated via a conversion in the transformed parameters block?”

, before asking whether it can be done.

I’m not sure I understand this well enough to write it in mathematical notation, but I can try to describe what I think is going on. If I’ve now understood correctly, the idea is to have two distributional parameters: a unit_vector[2], which replaces temp_Intercept, and an intercept for kappa (as in brms default von Mises). The unit_vector is then transformed to the temp_Intercept for mean angle in the transformed parameters block (as above), which is then used in the model block. The advantages of this transform are:

  1. Estimates of mean angle near one another on the circle (e.g. -3 & 3) have unit vector equivalents that are close in value on a linear scale (i.e. (-0.14, -0.99) & (0.14, -0.99) ), producing good convergence.
  2. Estimation of mean angle as a unit_vector avoids mean angle estimates that exit the -π–π interval

At least from I practical point of view I can confirm that doing this gives the right answers in simulation (much better than in my earlier post, provided something sensible is then done with the random effects on mean angle).

I’ll attach my current stan code for reference. handwritten.uvMEmises6.stan (6.4 KB)

Sorry for my late reply. I am still having a hard time understanding all the details of what you are trying to do, but my current best guess it that this is something best done in Stan directly. As a rule of thumb: If you canot do the transforms inside the custom family function (i.e. *_lpdf etc.) than you will probably have a hard time getting it in brms in some sort of natural way. This is then a good indication that you should be switiching to Stan itself.

@Foztarz did you finally manage to implement a working unit vector version of mixed von mises with brms? I would like to do the same.

@Satya_Arjunan Sorry for the delayed reply. I had a bit of a career break while waiting for the grant for this project, which I’m just getting back to now.

The short answer would be “yes, sort of”. As a quick and dirty approach to see if I could get this working at all, I used brms to write a model with the correct structure for the data, then edited the parameters and transformed parameters blocks to introduce the unit-vector-derived population mean angle. Up to this point I’m fairly confident I’m just copying the original post I referenced.

What I’m not so sure about is how to specify the random effects on individual mean angle. TBH, I wasn’t sure how to set up a unit_vector for each individual (if that would even be advisable), so instead I made a vector of “arcs” between the population mean and each individual mean, scaled as a fraction of the ±arc from the population mean containing all individual means. While this worked well for the dataset I was trying to describe, and the parameter recovery looked OK on some small simulations, I really can’t say anything more about the validity of this approach.

I’ve put the code here in case it’s of use to anyone:

3 Likes

Thanks @Foztarz. I ended up implementing a C++ version of unit vector von Mises lpdf myself. I am using it with rstan. Here is the code:

#include <stan/model/model_base.hpp>
#include <stan/math/prim/meta.hpp>
#include <stan/math/prim/scal/err/check_consistent_sizes.hpp>
#include <stan/math/prim/scal/err/check_finite.hpp>
#include <stan/math/prim/scal/err/check_positive_finite.hpp>
#include <stan/math/prim/scal/fun/log_modified_bessel_first_kind.hpp>
#include <stan/math/prim/scal/fun/modified_bessel_first_kind.hpp>
#include <stan/math/prim/scal/fun/constants.hpp>
#include <stan/math/prim/scal/fun/value_of.hpp>
#include <stan/math/prim/scal/fun/size_zero.hpp>
#include <cmath>

#include <boost/math/tools/promotion.hpp>

using namespace stan;

template <bool propto, typename T_y, typename T_mu, typename T_kappa>
typename boost::math::tools::promote_args<T_y, T_mu, T_kappa>::type
unit_vector_von_mises_lpdf(
         const Eigen::Matrix<T_y, Eigen::Dynamic, Eigen::Dynamic>& y,
         const Eigen::Matrix<T_mu, 1, Eigen::Dynamic>& mu,
         const Eigen::Matrix<T_kappa, Eigen::Dynamic, 1>& kappa,
         std::ostream* pstream__) { 
  if (y.rows() == 0) {
    return 0;
  }

  typedef typename stan::partials_return_type<
    Eigen::Matrix<T_y, Eigen::Dynamic, Eigen::Dynamic>,
    Eigen::Matrix<T_mu, 1, Eigen::Dynamic>,
    Eigen::Matrix<T_kappa, Eigen::Dynamic, 1>>::type T_partials_return;

  T_partials_return logp(0);

  const bool y_const = is_constant_all<
    Eigen::Matrix<T_y, Eigen::Dynamic, Eigen::Dynamic>>::value;
  const bool mu_const = is_constant_all<
    Eigen::Matrix<T_mu, 1, Eigen::Dynamic>>::value;
  const bool kappa_const = is_constant_all<
    Eigen::Matrix<T_kappa, Eigen::Dynamic, 1>>::value;

  operands_and_partials<
    Eigen::Matrix<T_y, Eigen::Dynamic, Eigen::Dynamic>,
    Eigen::Matrix<T_mu, 1, Eigen::Dynamic>,
    Eigen::Matrix<T_kappa, Eigen::Dynamic, 1>> ops_partials(y, mu, kappa);

  for (size_t i = 0; i < y.rows(); i++) {
    Eigen::Matrix<T_y, Eigen::Dynamic, 1> y_row(y.row(i));
    const T_partials_return kappa_dbl = value_of(kappa(i));
    const T_partials_return mu_dot_y = value_of((mu*y_row)(0));

    if (include_summand<propto>::value) {
      logp -= LOG_TWO_PI;
    }
    if (include_summand<propto, T_kappa>::value) {
      logp -= log_modified_bessel_first_kind(0, kappa_dbl);
    }
    if (include_summand<propto, T_y, T_mu, T_kappa>::value) {
      logp += kappa_dbl*mu_dot_y;
    }

    if (!y_const) {
      for (size_t j = 0; j < length(mu); j++) { 
        ops_partials.edge1_.partials_vec_[i][j] += kappa_dbl*value_of(mu(j));
      }
    }

    if (!mu_const) {
      for (size_t j = 0; j < length(y_row); j++) { 
        ops_partials.edge2_.partials_vec_[i][j] += kappa_dbl*value_of(y_row(j));
      }
    }

    if (!kappa_const) {
      const T_partials_return bessel1 = 
        modified_bessel_first_kind(-1, kappa_dbl);
      const T_partials_return bessel0 = 
        modified_bessel_first_kind(0, kappa_dbl);
      ops_partials.edge3_.partials_[i] += mu_dot_y - bessel1/bessel0;
    }
  }
  return ops_partials.build(logp);
}

template <bool propto, typename T_y, typename T_mu, typename T_kappa>
typename boost::math::tools::promote_args<T_y, T_mu, T_kappa>::type
unit_vector_von_mises_lpdf(
                           const Eigen::Matrix<T_y, 1, Eigen::Dynamic>& y,
                           const Eigen::Matrix<T_mu, 1, Eigen::Dynamic>& mu,
                           const T_kappa& kappa,
                           std::ostream* pstream__) {

  using T_partials_return = partials_return_t<
    Eigen::Matrix<T_y, 1, Eigen::Dynamic>,
    Eigen::Matrix<T_mu, 1, Eigen::Dynamic>,
    T_kappa>;

  if (size_zero(y, mu, kappa)) {
    return 0.0;
  }

  using std::log;

  T_partials_return logp = 0.0;

  if (!include_summand<propto, T_y, T_mu, T_kappa>::value) {
    return logp;
  }

  const bool y_const = is_constant_all<
    Eigen::Matrix<T_y, 1, Eigen::Dynamic>>::value;
  const bool mu_const = is_constant_all<
    Eigen::Matrix<T_mu, 1, Eigen::Dynamic>>::value;
  const bool kappa_const = is_constant_all<T_kappa>::value;

  operands_and_partials<
    Eigen::Matrix<T_y, 1, Eigen::Dynamic>,
    Eigen::Matrix<T_mu, 1, Eigen::Dynamic>,
    T_kappa> ops_partials(y, mu, kappa);

  Eigen::Matrix<T_y, Eigen::Dynamic, 1> y_row(y.transpose());
  const T_partials_return kappa_dbl = value_of(kappa);
  const T_partials_return mu_dot_y = value_of((mu*y_row)(0));

  if (include_summand<propto>::value) {
    logp -= LOG_TWO_PI;
  }
  if (include_summand<propto, T_kappa>::value) {
    logp -= log_modified_bessel_first_kind(0, kappa_dbl);
  }
  if (include_summand<propto, T_y, T_mu, T_kappa>::value) {
    logp += kappa_dbl*mu_dot_y;
  }

  if (!y_const) {
    for (size_t j = 0; j < length(y_row); j++) { 
      ops_partials.edge1_.partials_[j] += kappa_dbl*value_of(mu(j));
    }
  }
  if (!mu_const) {
    for (size_t j = 0; j < length(mu); j++) { 
      ops_partials.edge2_.partials_[j] += kappa_dbl*value_of(y_row(j));
    }
  }
  if (!kappa_const) {
    const T_partials_return bessel1 = modified_bessel_first_kind(-1, kappa_dbl);
    const T_partials_return bessel0 = modified_bessel_first_kind(0, kappa_dbl);
    ops_partials.edge3_.partials_[0] += mu_dot_y - bessel1/bessel0;
  }
  return ops_partials.build(logp);
}


template <typename T_y, typename T_mu, typename T_kappa>
inline typename boost::math::tools::promote_args<T_y, T_mu, T_kappa>::type
unit_vector_von_mises_lpdf(const Eigen::Matrix<T_y, 1, Eigen::Dynamic>& y,
             const Eigen::Matrix<T_mu, 1, Eigen::Dynamic>& mu,
             const T_kappa& kappa, std::ostream* pstream__) {
  return unit_vector_von_mises_lpdf<false>(y, mu, kappa, pstream__);
}


2 Likes