Is it more efficient to reparametrize to a beta if I'm estimating a 2-dimensional simplexe?

Hi all,

I’m using 2-dimensional simplexes to model monotonic predictors (as described in this paper by @paul.buerkner). I described the whole model in a previous question. It’s pretty big and efficiency is a concern. My idea is that since a 2-d simplexe can be re-expressed as \delta = [\alpha, 1 - \alpha] for some \alpha \in \{0, 1\}, then it might be simpler to only declare \alpha as the parameter of interest and \delta as an intermediate parameter.

Basically, a simplified version of my code looks like this :

parameters {
    vector[K] beta;
    array[K] simplex[2] delta;  
}
model {
    \\ independent uniform priors for all 2-d simplexes
    for (k in 1:K) {    
        zeta[k] ~ dirichlet(rep_vector(1, 2));
    }
   
    \\ Linear predictor
    phi = mo(delta, X) * beta;
    
    \\ Rest of the model...
}

Would something like this be more efficient :

parameters {
    vector[K] beta;
    vector<lower = 0, upper = 1>[K] alpha;  
}

model {
    \\ independent uniform priors for all 2-d simplexes
    array[K] vector[2] delta;
    for (k in 1:K) {    
        alpha[k] ~ beta(rep_vector(1, 2));
        delta[k] = [alpha[k], 1 - alpha[k]]; 
    }

    \\ Linear predictor
    vector[N] phi = mo(delta, X) * beta;
    
    \\ Rest of the model...
}

Note. The mo() function applies the monotonic transform to each column of X. It could probably also be optimize but I want to keep this post short.

They’re pretty much identical if you look at how a 2-dimensional simplex gets implemented.

  1. You don’t need to write down dirichlet([1, ..., 1]) or beta(1, 1) explicitly because they’re uniform and don’t add anything other than a constant to the log density.

  2. Stan’s underlying simplex implementation (see the transforms chapter of the reference manual), does exactly what you’ve coded—give delta[1] a uniform(0, 1) distribution and then set delta[2] = 1 - delta[1].

Also, you can just try both and see if there’s a speed difference you can detect (you can’t define beta the way you did—it needs two real-valued arguments, not a vector argument). It’s often hard to figure out what the C++ optimizer is going to do and sometimes things that look like the same code can have different performance. I don’t expect that here though.

1 Like