Vectorizing integrate_1d

I find myself computing an integral (using integrate_1d) of the same function with several different upper limits. I would like to speed this process up.

To be concrete, here is a code snippet:

for (i in 1:n_trial) {
  log_integ[i] = log(integrate_1d(integrand, lb, t_stim[i], {mu_antic, sig_antic}, {0.0}, {0}));
}

Here, lb and t_stim are data, and mu_antic and sig_antic are scalar parameters. In case it matters, the integrand function is a simple wrapper around normal_cdf().

Intuitively, it seems like I ought to be able to save on computation somehow, but I’m not familiar enough with the iterative quadrature algorithm to know for sure.

Initially, I was thinking I could sort t_stim and do something like trapezoid integration using the points of t_stim plus any additional points to ensure some maximum distance between evaluation points. But that feels very unsophisticated compared to what integrate_1d is doing.
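To make the trapezoid idea concrete, here is a quick numerical sketch in Python (not Stan) of what I had in mind. The integrand, `mu`, `sigma`, and the grid spacing are all illustrative stand-ins, and `scipy.integrate.quad` plays the role of `integrate_1d` for comparison:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# Illustrative stand-in for the integrand: a normal CDF with made-up parameters.
mu, sigma = 1.0, 0.5
f = lambda t: norm.cdf(t, mu, sigma)

lb = 0.0
t_stim = np.array([2.3, 0.7, 1.5, 3.1])  # upper limits, unsorted

# Build a sorted grid from the t_stim points plus extra points so that no two
# evaluation points are farther apart than max_step.
max_step = 0.05
grid = np.unique(np.concatenate([np.arange(lb, t_stim.max(), max_step), t_stim]))

# One cumulative trapezoid pass gives the integral from lb up to every grid point.
fg = f(grid)
cum = np.concatenate([[0.0], np.cumsum(np.diff(grid) * (fg[1:] + fg[:-1]) / 2)])

# Read off the integral at each original upper limit.
approx = cum[np.searchsorted(grid, t_stim)]
exact = np.array([quad(f, lb, b)[0] for b in t_stim])
print(np.max(np.abs(approx - exact)))  # small for a fine enough grid
```

All n integrals come out of a single pass over the sorted grid, at the cost of trapezoid-level accuracy rather than adaptive quadrature.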

Alternatively, I suppose I could keep using the built-in integrator but, instead of the current approach, sort t_stim and compute the integral only between successive pairs of points, accumulating the results. This would mean calling integrate_1d just as many times, but each call might be faster because it covers a smaller range of x values (so a lower-resolution approximation would still be good).
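The subinterval idea is easy to sanity-check numerically. A Python sketch (with `scipy.integrate.quad` standing in for `integrate_1d`, and illustrative values for `mu` and `sigma`) showing that the cumulative sum of subinterval integrals matches the direct per-limit integrals:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# Illustrative stand-in for the Stan integrand.
mu, sigma = 1.0, 0.5
f = lambda t: norm.cdf(t, mu, sigma)

lb = 0.0
t_stim = np.array([2.3, 0.7, 1.5, 3.1])

# Method 1: integrate from lb to each upper limit independently.
direct = np.array([quad(f, lb, b)[0] for b in t_stim])

# Method 2: sort, integrate each subinterval once, then take a running sum.
order = np.argsort(t_stim)
edges = np.concatenate([[lb], t_stim[order]])
increments = [quad(f, a, b)[0] for a, b in zip(edges[:-1], edges[1:])]
sorted_vals = np.cumsum(increments)

# Undo the sort to match the original ordering of t_stim.
pairwise = np.empty_like(direct)
pairwise[order] = sorted_vals

print(np.max(np.abs(pairwise - direct)))  # agreement to quadrature tolerance
```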

Update: I did a quick test of this sorting approach, and it cut runtime in half. So that’s a pretty good win. I’d still be happy to know if anyone has suggestions for other approaches to try.

The code I used to test the relative speed:
functions {
    real integrand(real x, real xc, array[] real theta, array[] real x_r, array[] int x_i) {
        return normal_cdf(x | theta[1], theta[2]);
    }
}
data {
    int<lower=0> N;
    vector<lower=0>[N] x;
    real mu;
    real<lower=0> sigma;
}
parameters {}
model {}
generated quantities {
    vector[N] method_one;
    vector[N] method_two;
    
    // Approach 1, the straightforward way.
    profile ("Method 1") {
        for (i in 1:N) {
            method_one[i] = log(integrate_1d(integrand, 0, x[i], {mu, sigma}, {0.0}, {0}));
        }
    }
    
    // Approach 2, sort then increment
    profile ("Method 2") {
        array[N] int x_idx = sort_indices_asc(x);
        vector[N] sx = x[x_idx];
        vector[N] sorted_m2;
        for (i in 1:N) {
            real lb;
            real ub = sx[i];
            real increment;
            real prev_val;
            if (i==1) {
                lb = 0;
                prev_val = 0;
            } else {
                lb = sx[i-1];
                prev_val = sorted_m2[i-1];
            }
            increment = integrate_1d(integrand, lb, ub, {mu, sigma}, {0.0}, {0});
            sorted_m2[i] = prev_val + increment;
        }
        method_two[x_idx] = log(sorted_m2);
    }
}

That’s a good trick. Another approach that I think might work (but I’ve only thought about it for roughly 30 seconds, so I hope it’s not missing something basic) is that you could reformulate this as an ODE, i.e. if you have:

y_i = \int_a^{b_i} f(t) \mathrm{d}t

Then you can define g such that:

\frac{\mathrm{d} g}{\mathrm{d} t} = f(t) \\ g(a) = 0

Unless I am mistaken, you then have y_i = g(b_i). And you could use Stan’s ODE solver to solve g for all b_i in one call, which may actually reuse some computation. But the ODE solver tends to be more expensive than the integrator, so whether this will be faster than your approach is anyone’s guess…
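The equivalence is quick to check numerically. A Python sketch (with `scipy.integrate.solve_ivp` standing in for Stan’s ODE solver and `quad` for `integrate_1d`; the integrand and parameters are illustrative): one solver call with the sorted b_i as output times recovers all the y_i = g(b_i):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad, solve_ivp

# Illustrative integrand and parameters.
mu, sigma = 1.0, 0.5
f = lambda t: norm.cdf(t, mu, sigma)

a = 0.0
b = np.array([2.3, 0.7, 1.5, 3.1])  # the upper limits b_i

# Solve dg/dt = f(t), g(a) = 0, requesting output at all sorted b_i in one call.
b_sorted = np.sort(b)
sol = solve_ivp(lambda t, y: [f(t)], (a, b_sorted[-1]), [0.0],
                t_eval=b_sorted, rtol=1e-10, atol=1e-12)
y_ode = sol.y[0]

# Compare with per-limit quadrature.
y_quad = np.array([quad(f, a, bi)[0] for bi in b_sorted])
print(np.max(np.abs(y_ode - y_quad)))  # both recover y_i = g(b_i)
```

In Stan this would correspond to a single `ode_rk45` call with the sorted upper limits as the output times, though as noted the ODE solver’s overhead may eat the savings.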

Best of luck with your model!
