How to vectorize when dealing with vectors of different lengths while partially pooling?

I’ve really been enjoying Stan since making the jump from JAGS. The R packages are now really mature, and the Python ones are catching up quickly (thanks, ArviZ team!).

One thing that is still giving me issues is vectorizing code, particularly when I have vectors of different lengths. For example, in the model below I am trying to perform a simple univariate regression where I have data for x and y from two groups. So I have x, y, and x_group vectors that are as long as the number of samples in the model (call that N), and there are J possible values for x_group (2 in this example, when the groups are 1 and 2). If I want my alpha and beta parameters to differ by group, and I also want to partially pool those parameters, I might write something in loop form like this:

    part_code = """
    data {
        int<lower=1> J; // number of groups, 2 in this example
        int<lower=1> N; // number of samples, 20 in this example
        vector[N] y; // response variable
        vector[N] x; // predictor variable
        int x_group[N]; // the group ids for each sample
    }
    parameters {
        vector[J] beta;
        vector[J] alpha;
        real<lower=0> sigma;
        real mu_alpha;
        real mu_beta;
        real<lower=0> sigma_alpha;
        real<lower=0> sigma_beta;
    }
    transformed parameters {
        //placeholder but blank for now
    }
    model {
        vector[N] mu;
        
        //prior on sigma (error)
        sigma ~ exponential(1);
        
        //priors on alpha and beta
        mu_alpha ~ normal(0, 10);
        sigma_alpha ~ exponential(1);
        
        mu_beta ~ normal(0, 10);
        sigma_beta ~ exponential(1);
        
        //group level priors for alpha and beta
        for (j in 1:J) {
            alpha[j] ~ normal(mu_alpha, sigma_alpha);
            beta[j] ~ normal(mu_beta, sigma_beta);
        }
        
        //likelihood
        for (i in 1:N) {
            mu[i] = beta[x_group[i]] * x[i] + alpha[x_group[i]];
            y[i] ~ normal(mu[i], sigma);
        }
    }
    generated quantities {
        vector[N] log_lik;
        vector[N] y_hat;
        
        for (i in 1:N) {
            log_lik[i] = normal_lpdf(y[i] | beta[x_group[i]] * x[i] + alpha[x_group[i]], sigma);
            y_hat[i] = normal_rng(beta[x_group[i]] * x[i] + alpha[x_group[i]], sigma);
        }

    }
    """

Now I know it must be possible to pull the sampling of alpha and beta out of that loop over J, and also to pull the likelihood out of its loop over N. However, when I try to do that, particularly for the likelihood, I get errors saying that beta[x_group] * x won’t work because I can’t use the operator '*' between 'ill-defined' and x. I am fairly sure this is because beta is length J while x is length N. How would I go about vectorizing this? Note: I was able to vectorize when fitting only a group-level intercept, so I suspect the '+' operator is less fussy about this issue.
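For reference, the vectorized likelihood I tried looked something like this:

mu = beta[x_group] * x + alpha[x_group]; // the compiler rejects this '*'
y ~ normal(mu, sigma);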

This is a bit of a contrived example, but I built it to better help me understand going from loops in JAGS to vectorized code in Stan. If anyone can help me I’d really appreciate it.


The elementwise operators are probably what you want. Check out .* and ./: 6.5 Arithmetic and Matrix Operations on Expressions | Stan Reference Manual
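Note that beta[x_group] is itself a length-N vector (indexing by an array of integers expands over that array), so an elementwise product with x lines up. A minimal sketch of how .* behaves on equal-length vectors:

vector[3] a = [1, 2, 3]';
vector[3] b = [4, 5, 6]';
vector[3] c = a .* b; // elementwise product: c == [4, 10, 18]'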

That’s really helpful, thank you. I noticed the docs say:

These provide a shorthand to replace loops, but are not intrinsically more efficient than a version programmed with elementwise calculations and assignments in a loop.

Is there a better solution to make use of vectorization here or is this optimal?

For random effects, what you’re doing is good.
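You can also drop the loop over J entirely, since sampling statements vectorize over a vector on the left-hand side; a minimal equivalent for the group-level priors:

alpha ~ normal(mu_alpha, sigma_alpha);
beta ~ normal(mu_beta, sigma_beta);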

For the likelihood, either this:

for (i in 1:N) {
  mu[i] = beta[x_group[i]] * x[i] + alpha[x_group[i]];
}
y ~ normal(mu, sigma);

or this:

mu = beta[x_group] .* x + alpha[x_group];
y ~ normal(mu, sigma);

should be about the same for now. We’re working on some things that should make the second one faster, but it’s a work in progress.

Ok great. Appreciate the help and all the work you folks are doing on development!

Another thing: if you have a linear (fixed-effects) term in your model, you should probably use normal_id_glm: https://mc-stan.org/docs/2_24/functions-reference/normal-id-glm.html

So if you have something like:

matrix[N, K] X;
vector[K] gamma;
...
mu = X * gamma + beta[x_group] .* x + alpha[x_group];
y ~ normal(mu, sigma);

Do:

mu = beta[x_group] .* x + alpha[x_group];
y ~ normal_id_glm(X, mu, gamma, sigma);

instead; it’s faster.

Interesting. So say I had a random-intercept-only model, would it go something like this:

mu = x + alpha[x_group];
y ~ normal_id_glm(X, mu, beta, sigma);

I guess what I’m asking is: should the result of all the ‘random’ parts of the model go in the mu argument of normal_id_glm?

In the documentation’s notation that argument is called alpha, but yeah, the second argument is for the random effects.

But the tricks here are in the matrix * vector part of the model. If you don’t have that part, I don’t think the GLM thing will be faster.

Of course, the second argument is alpha, sorry.

So the strategy is to pop all the fixed parts of the model into that third argument to make use of the vectorization?

I’m learning a lot, thank you!

Yeah, to take advantage of the GLM function you have to reduce the linear part of your model to matrix * vector + vector. The second vector holds the random effects.
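So a random-intercept-only version would be something like this (a sketch, assuming X holds the fixed-effect predictors, with x as one of its columns, and gamma their coefficients):

mu = alpha[x_group]; // the 'random' part: group-level intercepts
y ~ normal_id_glm(X, mu, gamma, sigma);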