Problem implementing Zipf (power law) distribution -- external C++

maxbiostat · February 2, 2019, 7:12pm

I’m trying to implement a poor man’s version of the request in this issue. Here’s my stan code:

functions{
    real zeta(real s);
    real power_law_lpmf(int x, real alpha){
     return (-alpha * log(x) - log(zeta(alpha)) );
    }
    /*TODO: implement rng for prior and posterior predictive checks*/
}
data{
  int<lower=0> K; // number of unique values  
  int values[K];
  int<lower=0> frequencies[K]; 
}
parameters{
  real <lower=1> alpha;
}
model{
  for (k in 1:K) target += frequencies[k] * power_law_lpmf(values[k] | alpha);
}

And here is my R code:

plaw.model <- stan_model(file = "stan/discrete_power_law.stan", allow_undefined = TRUE,
                         boost_lib = "/usr/include/boost/",
                         includes = "/usr/include/boost/math/special_functions/zeta.hpp")

with output

Error in compileCode(f, code, language = language, verbose = verbose) : 
  Compilation ERROR, function(s)/method(s) not created! In file included from /home/max/R/x86_64-pc-linux-gnu-library/3.5/RcppEigen/include/Eigen/Core:392:0,
                 from /home/max/R/x86_64-pc-linux-gnu-library/3.5/RcppEigen/include/Eigen/Dense:1,
                 from /home/max/R/x86_64-pc-linux-gnu-library/3.5/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:4,
                 from /home/max/R/x86_64-pc-linux-gnu-library/3.5/StanHeaders/include/stan/math/rev/mat/fun/Eigen_NumTraits.hpp:4,
                 from /home/max/R/x86_64-pc-linux-gnu-library/3.5/StanHeaders/include/stan/math/rev/core/matrix_vari.hpp:4,
                 from /home/max/R/x86_64-pc-linux-gnu-library/3.5/StanHeaders/include/stan/math/rev/core.hpp:14,
                 from /home/max/R/x86_64-pc-linux-gnu-library/3.5/StanHeaders/include/stan/math/rev/mat.hpp:4,
                 from /home/max/R/x86_64-pc-linux-gnu-library/3.5/StanHeaders/include/stan/math.hpp:4,
                 from /home/max
In addition: Warning message:
In system(cmd, intern = !verbose) :
  running command '/usr/local/lib/R/bin/R CMD SHLIB file2c8d1ddbd8bf.cpp 2> file2c8d1ddbd8bf.cpp.err.txt' had status 1
Error in sink(type = "output") : invalid connection

I strongly suspect this might be related to this bug, but how do I solve it?

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=pt_BR.UTF-8       LC_NUMERIC=C               LC_TIME=pt_BR.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=pt_BR.UTF-8    LC_MESSAGES=en_GB.UTF-8    LC_PAPER=pt_BR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=pt_BR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rstan_2.18.2         StanHeaders_2.18.0-1 ggplot2_3.1.0        poweRlaw_0.70.2     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         pillar_1.3.1       compiler_3.5.1     plyr_1.8.4         bindr_0.1.1        prettyunits_1.0.2 
 [7] tools_3.5.1        pkgbuild_1.0.2     tibble_1.4.2       gtable_0.2.0       pkgconfig_2.0.2    rlang_0.3.0.1     
[13] cli_1.0.1          rstudioapi_0.8     yaml_2.2.0         parallel_3.5.1     loo_2.0.0          bindrcpp_0.2.2    
[19] gridExtra_2.3      withr_2.1.2        dplyr_0.7.8        stats4_3.5.1       grid_3.5.1         tidyselect_0.2.5  
[25] glue_1.3.0         inline_0.3.15      R6_2.3.0           processx_3.2.1     VGAM_1.0-6         purrr_0.2.5       
[31] callr_3.1.0        magrittr_1.5       matrixStats_0.54.0 scales_1.0.0       ps_1.2.1           splines_3.5.1     
[37] assertthat_0.2.0   colorspace_1.3-2   lazyeval_0.2.1     munsell_0.5.0      crayon_1.3.4

bgoodri · February 2, 2019, 7:32pm

Call stan_model with verbose = TRUE to see the real error message.

maxbiostat · February 2, 2019, 7:45pm

That returns a gigantic stream of output. What should I grep for?

$ grep error debuglog.txt 
  53 :         throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***");
 164 :             throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***");
 182 :             throw std::runtime_error("variable alpha missing");
 191 :             throw std::runtime_error(std::string("Error transforming variable alpha: ") + e.what());
 255 :             throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***");
 334 :             throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***");
file36aa2927b1c5.cpp:73:1: error: expected unqualified-id before ‘/’ token
file36aa2927b1c5.cpp:400:57: error: ‘model36aa26ddf5ed_discrete_power_law’ in namespace ‘model36aa26ddf5ed_discrete_power_law_namespace’ does not name a type
file36aa2927b1c5.cpp:407:80: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:407:80: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:408:41: error: template argument 1 is invalid
file36aa2927b1c5.cpp:408:43: error: template argument 1 is invalid
file36aa2927b1c5.cpp:410:22: error: expected primary-expression before ‘,’ token
file36aa2927b1c5.cpp:410:28: error: expected primary-expression before ‘,’ token
file36aa2927b1c5.cpp:410:34: error: expected primary-expression before ‘>’ token
file36aa2927b1c5.cpp:410:36: error: expected primary-expression before ‘)’ token
file36aa2927b1c5.cpp:413:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:413:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:413:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:415:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:415:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:415:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:417:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:417:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:417:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:419:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:419:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:419:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:421:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:421:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:421:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:423:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:423:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:423:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:425:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:425:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:425:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:427:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:427:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:427:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:429:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:429:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:429:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:431:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:431:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:431:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:433:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:433:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:433:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:435:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:435:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:435:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:437:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:437:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:437:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:439:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:439:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:439:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:441:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:441:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:441:141: error: template argument 1 is invalid
file36aa2927b1c5.cpp:443:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:443:78: error: ‘model36aa26ddf5ed_discrete_power_law’ is not a member of ‘model36aa26ddf5ed_discrete_power_law_namespace’
file36aa2927b1c5.cpp:443:141: error: template argument 1 is invalid
ERROR(s) during compilation: source code errors or compiler configuration errors!
 53:         throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***");
164:             throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***");
182:             throw std::runtime_error("variable alpha missing");
191:             throw std::runtime_error(std::string("Error transforming variable alpha: ") + e.what());
255:             throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***");
334:             throw std::runtime_error("*** IF YOU SEE THIS, PLEASE REPORT A BUG ***");

The stuff immediately above ERROR(s) says

/home/max/R/x86_64-pc-linux-gnu-library/3.5/RcppEigen/include/Eigen/src/Core/CoreEvaluators.h:960:8: warning: ignoring attributes on template argument ‘Eigen::internal::packet_traits<double>::type {aka __vector(2) double}’ [-Wignored-attributes]
   enum {
        ^
make: *** [file36aa2927b1c5.o] Error 1

Full log here.

bgoodri · February 2, 2019, 9:03pm

You can’t just wrap Boost functions because of the hidden std::ostream* pstream__ argument and also because Boost functions tend not to work with stan::math::var arguments. So, you need a

#include <boost/math/special_functions/zeta.hpp>

inline
stan::math::var zeta(const stan::math::var& s, std::ostream* pstream__) {
  double s_ = value_of(s);
  double f = boost::math::zeta(s_);
  double d = 0;
  double k = 2;
  double term = 1;
  // http://functions.wolfram.com/ZetaFunctionsandPolylogarithms/Zeta/20/01/01/0001/MainEq1.L.gif
  while (term > 1e-16) {
    term = log(k) / pow(k, s_);
    d -= term;
    k++;
  }
  return var(new precomp_v_vari(f, s.vi_, d));
}

inline
double zeta(const double& s, std::ostream* pstream__) {
  return boost::math::zeta(s);
}

that does the derivative instead of using autodiff. And then call it with

plaw.model <- stan_model(file = "discrete_power_law.stan",  allow_undefined = TRUE, 
  includes = paste0('\n#include "', file.path(getwd(), 'zeta.hpp'), '"\n'), verbose = TRUE)

except I get a linker error that I haven’t been able to figure out.

maxbiostat · February 2, 2019, 9:31pm

Thanks for this. When you say you get an error, is it visible from the output? Because when I run it on my machine I get the same dyn.load() error (can’t obviously load a binary that hasn’t been made) but the grepping for error in the large stream from verbose = TRUE does not return any relevant hits.

maxbiostat · February 4, 2019, 1:06pm

@bgoodri, sorry to bother, but is there anything else I can do/try here?

Since the discrete stuff was going nowhere, I moved on to implementing a continuous function with some bells and whistles.
Said bells and whistles rely on boost’s exp_sinh quadrature.

I got a similar error (or lack thereof):

cplaw.diff.model <- stan_model(file = "stan/continuous_power_law_diff.stan", allow_undefined = TRUE,
                               verbose = TRUE,
                               includes =  paste0('\n#include "', 
                                                  file.path(getwd(), 'stan/diff_function_lconstant.hpp'), '"\n')
)
...
/home/max/R/x86_64-pc-linux-gnu-library/3.5/StanHeaders/include/src/stan/mcmc/hmc/hamiltonians/dense_e_metric.hpp:24:60:   required from ‘double stan::mcmc::dense_e_metric<Model, BaseRNG>::T(stan::mcmc::dense_e_point&) [with Model = model533674b696f9_continuous_power_law_diff_namespace::model533674b696f9_continuous_power_law_diff; BaseRNG = boost::random::additive_combine_engine<boost::random::linear_congruential_engine<unsigned int, 40014, 0, 2147483563>, boost::random::linear_congruential_engine<unsigned int, 40692, 0, 2147483399> >]’
file5336289e086.cpp:586:1:   required from here
/home/max/R/x86_64-pc-linux-gnu-library/3.5/RcppEigen/include/Eigen/src/Core/DenseCoeffsBase.h:55:30: warning: ignoring attributes on template argument ‘Eigen::internal::packet_traits<double>::type {aka __vector(2) double}’ [-Wignored-attributes]
Error in dyn.load(libLFile) : 
  unable to load shared object '/tmp/Rtmpx3dE1l/file5336289e086.so':
  /tmp/Rtmpx3dE1l/file5336289e086.so: undefined symbol: _ZN53model533674b696f9_continuous_power_law_diff_namespace23diff_function_lconstantIN4stan4math3varES3_S3_dEEN5boost4math5tools12promote_argsIT_T0_T1_T2_ffE4typeERKS8_RKS9_RKSA_RKSB_PSo

with the code in diff_function_lconstant.hpp being

#include <boost/math/quadrature/exp_sinh.hpp>

double diff_function_lconstant(const double& alpha, const double& z,
                                   const double& w, const double& m,  std::ostream* pstream__) {
    double inf = std::numeric_limits<double>::infinity();
    boost::math::quadrature::exp_sinh<double> integrator;
    auto f = [&](double x) {
      return std::pow(x, -alpha)  *  std::pow(z, x-1) * std::pow(w, std::pow(x-1, 2)); 
    };
    double termination = std::pow(std::numeric_limits<double>::epsilon(), 0.5);
    double error;
    double L1;
    size_t levels;
    double Q = integrator.integrate(f, m, inf, termination, &error, &L1, &levels);
    return std::log(Q);
}

I can confirm the code works because I’ve been able to play round with it using Rcpp and the Source pane in Rstudio.
I’ll omit the Stan code as I don’t think it’s relevant here.

Thanks

bgoodri · February 4, 2019, 2:14pm

This means that it cannot link a function with the signature
boost::math::tools::promote_args<stan::math::var, stan::math::var, stan::math::var, double, float, float>::type model533674b696f9_continuous_power_law_diff_namespace::diff_function_lconstant<stan::math::var, stan::math::var, stan::math::var, double>(stan::math::var const&, stan::math::var const&, stan::math::var const&, double const&, std::basic_ostream<char, std::char_traits<char> >*)
because you have only implemented the diff_function_lconstant when everything is a double. You need a template specialization when some or all of the arguments besides pstream__ is a var, in which case the returned value is also a var and you have to deal with the derivatives. The easiest way to do that is to be calling integrate_1d in the Stan Math library.

maxbiostat · February 4, 2019, 2:22pm

Any examples I can follow? I’m not fluent in C++ and the concept of template specialisation goes over my head. But with a handy example, monkey could see and do.

I’m assuming this double/var problem is also at the heart of the zeta example above. Is that right?

bgoodri · February 4, 2019, 3:42pm

In the vignette, I had to implement both versions of the sinc function:

github.com

stan-dev/rstan/blob/develop/rstan/rstan/vignettes/sinc.hpp

double
sinc(const double& x, std::ostream* pstream__) {
  return x != 0.0 ? sin(x) / x : 1.0;
}

var
sinc(const var& x, std::ostream* pstream__) {
  double x_ = x.val();
  double f = x_ != 0.0 ? sin(x_) / x_ : 1.0;
  double dfdx_ = x_ != 0.0 ? (cos(x_) - sin(x_)) / x_ : 0.0;
  return var(new precomp_v_vari(f, x.vi_, dfdx_));
}

If you call stan_model with verbose = TRUE, it will tell you the signature that you need for diff_function_lconstant, which will look something like

template<typename T0__, typename T1__, typename T2__, typename T3__>
boost::promote_args(...)
diff_function_lconstant(const T0__&, const T1__&, const T2__&, const T3__&,
                                    std::ostream* pstream__);

Basically, if any of alpha, z, w, or m are unknown then the result needs to be a var and you need to handle its derivatives with respect to the unknowns either analytically or with Stan’s reverse mode autodiff mechanism.

maxbiostat · February 4, 2019, 3:50pm

Thanks a lot, Ben. This is very helpful. I’ll give a shot. I don’t want to write the derivatives by hand because since my original function is an integral, I’d have to differentiate under the integral sign and hope that this new integral is available in closed-form. Otherwise, computing each partial derivative would entail running a separate quadrature, at which point I’m not sure it’d not be better to simply let Stan handle stuff via autodiff.

I struggle to get this sort of information from the stream of output in Rstudio. To me it just appears as a mangled mess. Do you have any tips for how to dig out this sort of info?

seantalts · February 4, 2019, 3:58pm

@maxbiostat thanks for taking a crack at this! It’s doable to use RStan as Ben does, but
a bunch of us that work on just the Math library will typically try to start writing unit tests directly in the Math library. When we need to check integration, we’ll set up and use CmdStan, a much thinner wrapper with fewer layers of abstraction to muddle through when diagnosing errors. This wiki page should go fairly in-depth into this process: https://github.com/stan-dev/stan/wiki/Contributing-New-Functions-to-Stan, and it’s probably worth also checking this little wiki page out too: https://github.com/stan-dev/math/wiki/Introduction-to-Stan-Math-for-New-Developers.

maxbiostat · February 4, 2019, 4:05pm

Thanks, Sean. Just to be clear, I wasn’t planning on fixing the issue linked in the OP, but I can try and do it if somebody would take my hand. As I said, not fluent in C++. At all. But I love Stan and would love to contribute more than just pesky bug reports.

@bgoodri, I know we should wait to fall from the bridge only when we get to it, but let me ask this now anyways: would the code you posted for the derivatives of the zeta function actually be accurate? If you try to compute the function that way, it won’t give you the right answer, since for many values of s it’ll converge much too slowly. Now, for the derivative I don’t know, but suspect that’d also not be very accurate. Have you seen other implementations that do this?

The second example Ben and I discussed would not be worth adding to the math library, as it’s a very niche thing.

bgoodri · February 4, 2019, 5:28pm

I haven’t tested how accurate the derivative of the zeta function would be but log(k) increases slower than does k^s so the infinite sum should converge fairly quickly unless s is very close to 1.

maxbiostat · February 4, 2019, 8:55pm

I don’t think that intuition is particularly accurate. I could try and make a mathematical argument (and I might in the future if you want) but for now I just coded up in R:

zeta_deriv <- function(s){
  ans <- 0
  k <- 2
  term <- 1
    while (term > 1e-16) {
      term <- log(k) / (k^s);
      ans <- ans - term;
      k <- k + 1;
      cat("k = ", k, " d = ", ans, "\n")
    }
  cat("needed ", k, " iterations \n")
  return(ans)
}
zeta_deriv(2)

I had to stop this out of boredom, but here is the “tail”

k =  1347315  d =  -0.937537 
k =  1347316  d =  -0.937537 
k =  1347317  d =  -0.937537

and the “right” result would be -0.9375482543….

For the Zipf distribution in particular, we’d be looking at values for s of at most 10, and some reasonably close to 1 or 2. deriv_zeta(4) takes 17686 iterations, for instance.

Do you agree this might not be a good way to code up the derivative? Or am I not getting something?

bgoodri · February 4, 2019, 10:10pm

Yeah, that might be bad. You might try writing a log_zeta function as -\sum_{p \in \mbox{primes}}\ln \left(1 - p^{-s}\right) and evaluate it over some huge array of prime numbers.

maxbiostat · February 5, 2019, 8:07pm

OK, still just probing instead of actually working on the issue. If I said I had a way of evaluating the derivative of \zeta that involves an exp_sinh quadrature call [and a polygamma call, but I’m assuming that’s fine], would you say “just let Stan do the autodiff” [however that may be implemented] or is it viable?

Here’s the maths in case you’re interested:
First, we’ll need the integral representation of the zeta function, obtained via Laplace transform:

\zeta(s) = \frac{1}{\Gamma(s)}\int_{0}^\infty \frac{x^{s-1}}{e^x -1}dx,

and hence

\zeta^\prime(s) = \frac{1}{\Gamma(s)}\int_0^\infty \frac{x^{s-1}\left(\log(x) - \psi_0(s) \right)}{e^x -1}dx,

where \psi_0(s) is the polygamma function.
Accompanying R code:

zeta_int <- function(s){
  func <- function(x){
    x^(s-1)/(exp(x) - 1)
  }
  func <- Vectorize(func)
  return(integrate(func, 0, Inf)$value/gamma(s))
}
zeta_deriv <- function(s){
  k <- digamma(s)
  func <- function(x){
    (x^(s-1) *(log(x) - k))/(exp(x) -1)
  }
  func <- Vectorize(func)
  ans <- integrate(func, 0, Inf)$value/gamma(s)
  return(ans)
}
####
# Testing
ss <- 3/2
zeta_int(ss)
VGAM::zeta(ss)
zeta_deriv_alt(ss)
VGAM::zeta(ss, deriv = 1)

bgoodri · February 5, 2019, 10:59pm

Yeah, I might start with “let integrate_1d autodiff that”, but if you look at the code @bbbales2 wrote, what it does is first integrate the function in doubles and then integrate the function where one parameter is a var and everything else is a double, and then does that for each of the parameters. So, if you were to translate that R code to C++, you could do it somewhat more efficiently by integrating both functions in doubles with exp_sinh quadrature.

maxbiostat · February 6, 2019, 2:38am

Right. Here’s the latest iteration of zeta.hpp, for inclusion using the Rstan mechanism:

#include <boost/math/special_functions/zeta.hpp>
#include <boost/math/special_functions/digamma.hpp>
#include <boost/math/special_functions/gamma.hpp>
#include <boost/math/quadrature/exp_sinh.hpp>
using boost::math::isfinite;

inline double zeta(const double& s, std::ostream* pstream__){
  return boost::math::zeta(s);
}

inline var zeta(const var& s_var, std::ostream* pstream__) {
  double s = s_var.val();
  double f = zeta(s, pstream__);
  boost::math::quadrature::exp_sinh<double> integrator;
  auto int_rep_zeta_prime = [&](double x) {
    double ans = std::pow(x, s-1) *( std::log(x) - boost::math::digamma(s))/(std::exp(x) -1);
    if(!isfinite(ans)) return 0.0; // weird patching to get quadrature working
    return ans;
  };
  double tolerance = std::sqrt(std::numeric_limits<double>::epsilon());
  double error = 0.0;
  double L1 = 0.0;
  size_t levels;
  double deriv = integrator.integrate(int_rep_zeta_prime, tolerance, &error, &L1, &levels)/boost::math::tgamma(s);
  return var(new precomp_v_vari(f, s_var.vi_, deriv));
}

I don’t actually use quadrature to evaluate zeta because there’s no need (unless the implementation of zeta in boost is jinxed). Gives me that (templating?) error:

Error in dyn.load(libLFile) : 
  unable to load shared object '/tmp/RtmpUaG4TC/file4c872949fc9c.so':
  /tmp/RtmpUaG4TC/file4c872949fc9c.so: undefined symbol: _ZN45model4c87906d8bc_discrete_power_law_namespace4zetaIN4stan4math3varEEEN5boost4math5tools12promote_argsIT_fffffE4typeERKS8_PSo
Calls: stan_model ... cxxfunctionplus -> <Anonymous> -> cxxfunction -> dyn.load
Execution halted

The actual output with verbose = TRUE does not contain any errors that I could find.
@bbbales2, @Bob_Carpenter @seantalts what am I missing?

EDIT: The derivative was missing a \frac{1}{\Gamma(s)} term and the integral representation needed patching to work with exp_sinh. Shouldn’t affect any of the problems discussed.

seantalts · February 6, 2019, 3:15am

The people you tagged probably can’t help you with RStan. Try CmdStan as mentioned above - it’s an easier environment to debug these things in.

bbbales2 · February 6, 2019, 3:20am

I’m really not sure. Since it’s failing with linkage, my two wild guesses would be:

Get rid of the inline keywords on both functions? inline affects linkage, but I think we want inline here? So I’m not sure
Do things with templating and see if the linking changes

template<typename T>
inline double zeta(const T& s, std::ostream* pstream__){
  return boost::math::zeta(s);
}

template<>
inline var zeta(const var& s_var, std::ostream* pstream__) {
  ...
}

I think it’s saying zeta isn’t defined.

Topic		Replies	Views
Compilation error in one dimensional integration in Rstan RStan	25	1590	May 7, 2019
Call GNU Scientific Library from stan Modeling	25	2938	May 24, 2018
Any simple working example of external C++ lpdf function? RStan example-models	19	1433	September 23, 2020
Vignette for interfacing with external C++ code via rstan Developers features	9	1570	December 8, 2016
Warnings when compiling CmdStan code on linux/arm64 CmdStan installation	1	43	November 20, 2024

Problem implementing Zipf (power law) distribution -- external C++

Related topics