Speeding up a spherical harmonic model

DouglasBoubert · February 23, 2021, 11:11am

I am attempting to optimise a model which uses spherical harmonics to parameterise the success probability of binomials on the sphere. Transforming from the alm parameters to the logit-probability x involves two steps which are computationally expensive (the bracketed lines in the transformed parameters block). This operation will be the building block of a much more complicated model, so I’d like to check that I haven’t missed a trick. Can anyone suggest a different way of calculating this expression which is more Stan-friendly?

data {
    int<lower=0> J;        // number of pixels
    int<lower=0> N;        // number of HEALPix isolatitude rings
    int<lower=0> H;        // number of harmonics
    int<lower=0> L;        // 2 * max l of hamonics + 1
    matrix[N,H] lambda;    // spherical harmonics decomposed
    matrix[J,L] azimuth;   // spherical harmonics decomposed
    int indices[J];        // map N->J
    int lower[L];          // map H->L
    int upper[L];          // map H->L
    int k[J];
    int n[J];
}
parameters {
    vector[H] alm;
}
transformed parameters {
    vector[J] x;
    {
        matrix[N,L] F;
        for (i in 1:L) {
            F[:,i] = lambda[:,lower[i]:upper[i]] * alm[lower[i]:upper[i]];
        }
        x = rows_dot_product(F[indices],azimuth);
    }
}
model {
    alm ~ std_normal();
    k ~ binomial_logit(n, x);
}

bbbales2 · February 24, 2021, 12:44pm

matrix[N,L] F;
for (i in 1:L) {
  F[:,i] = lambda[:,lower[i]:upper[i]] * alm[lower[i]:upper[i]];
}

The only thing that sticks out to me is would it be possible to zero-pad the lambda matrix and build a matrix out of alm so that you can do:

matrix[N, L] F = lambda * alm_matrix; // alm_matrix would be an HxL matrix

Matrix multiplication is more efficient than matrix vector multiplication. The gains to big matrix operations are big enough that usually you can afford some zero padding even if multiplying by zero seems like kindof a waste.

I can see how writing it like you have it is more efficient in terms of operations though

mike-lawrence · February 24, 2021, 5:56pm

As someone that doesn’t have much experience with the foundational compute side of things, things like this are super surprising and neat.

DouglasBoubert · February 24, 2021, 6:48pm

Thanks for taking a look! I tried something along the lines you suggested, swapping these lines:

for (i in 1:L) {
    F[:,i] = lambda[:,lower[i]:upper[i]] * alm[lower[i]:upper[i]];
}

for these lines:

matrix[H,L] Alm = rep_matrix(0.0, H, L);
for (i in 1:L) {
    Alm[lower[i]:upper[i],i] = alm[lower[i]:upper[i]];
}
F = lambda * Alm;

It turned out that my original code was the faster of the two, but that wasn’t obvious to me until I tried it!

Your original suggestion was to pad lambda, but I couldn’t find a way to make it work without turning it into a three dimensional object. Have I missed something?

bbbales2 · February 25, 2021, 1:47am

Nah nah, the second thing is what I meant. Hmm, if that isn’t any faster then I think I’m out of ideas. What you have looks good to me.

By the way, there is new profiling stuff in cmdstan 2.26: Profiling Stan programs with CmdStanR • cmdstanr . It can be handy for things like this so you know what are the expensive parts of the code.

DouglasBoubert · February 25, 2021, 8:37am

Amazing! Profiling has revealed that the line:

x = rows_dot_product(F[indices],azimuth);

takes six times longer than the lines I was trying to optimise.

rok_cesnovar · February 25, 2021, 9:00am

Nice!

Things like this are why we were so eager to add profiling.

DouglasBoubert · February 25, 2021, 9:02am

I really appreciate it. I can’t imagine how difficult that must have been to implement!

mike-lawrence · February 25, 2021, 2:06pm

Take a look at the rows in the rows dot product call to see if there’s any redundancy there. If so, see here on computing on just the unique rows then indexing into the result.

Topic		Replies	Views
Speed of matrix multiplication versus multiplication loops Modeling rstan , techniques , performance	4	1678	November 8, 2022
Improving the efficiency of *_dot_product for trinary-and-redundant matrices Modeling	38	1177	May 3, 2023
Looking for suggetions to make codes more efficient General rstan	4	184	April 28, 2024
Elementwise multiplications with row sums appear to be slow Modeling fitting-issues , specification	32	1171	August 23, 2021
How to vectorize this code? Modeling	3	475	September 3, 2020

Speeding up a spherical harmonic model

Related topics