Inverse Function or 1/(*)

What is the difference between writing 1/(a parameter) to define another parameter and using the inverse function?


```stan
data {
  int<lower=2> K; // number of strata

  vector<lower=0>[K] n_e; // sample sizes
  vector[K] y_bar_e;      // sample means
  vector<lower=0>[K] S_e; // sample variances

  // pre-specified constants
  real k_0;
  real b_01;
  real b_02;
  real b_03;
  real b_04;
  real<lower=0> v_0;
}
parameters {
  // level 1
  vector[K] theta_ek;
  real<lower=0.001> sigma2_e;

  // level 2
  vector[K] mu_k;
  vector<lower=0>[K] tau2_k_i;

  // level 3
  real mu;
  real<lower=0> phi2_i;
  real<lower=0> eta2_i;
}
model {
  // priors
  eta2_i ~ gamma(b_03, b_04);
  phi2_i ~ gamma(b_01, b_02);
  mu_k ~ normal(2, sqrt(inv(phi2_i)));
  tau2_k_i ~ gamma(v_0 * 0.5, v_0 * 0.5 * inv(eta2_i));

  theta_ek ~ normal(mu_k, sqrt(inv(tau2_k_i)));
  sigma2_e ~ inv_gamma(0.01, 0.01);

  // likelihood
  for (i in 1:K) {
    // sample means
    target += normal_lpdf(y_bar_e[i] | theta_ek[i], sqrt(sigma2_e * inv(n_e[i])));
    // sample variances
    target += gamma_lpdf(S_e[i] | n_e[i] * 0.5 - 0.5, (n_e[i] - 1) * inv(2.0 * sigma2_e));
  }
}
```


If I remember correctly, inv(x) might be implemented to be a bit more numerically stable than 1/x, but otherwise I think it's the same.
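For example, these two spellings of the same prior should give identical results (a minimal sketch with made-up variable names, not the model from the question):

```stan
parameters {
  real<lower=0> phi2_i;
  real mu_k;
}
model {
  phi2_i ~ gamma(1, 1);
  mu_k ~ normal(0, sqrt(inv(phi2_i)));   // using inv()
  // equivalent: mu_k ~ normal(0, sqrt(1 / phi2_i));
}
```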


Thank you Jonah! That makes sense


I dug into this a little, as I was interested to know how it might work. As far as I can see, though, the inv function just calculates 1.0 / x. Am I looking in the right places?

This is the “prim” version, which is called directly only if inv() is used with data/transformed data.

With parameters, this is called for inv(): math/inv.hpp at 92075708b1d1796eb82e3b284cd11e544433518e · stan-dev/math · GitHub

And with / this is called: math/operator_division.hpp at 92075708b1d1796eb82e3b284cd11e544433518e · stan-dev/math · GitHub

I think there is a slight difference in the gradient there; I would have to write it out to make sure.
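Writing it out for y = 1/x (a plain calculus check, not a claim about the exact code in either file):

```
inv(x):          dy/dx = -1 / x^2
1 / x  (via /):  dy/dx = d/dx (a / x) with a = 1  =  -a / x^2  =  -1 / x^2
```

So the two paths should propagate the same gradient value; any practical difference would come from how each implementation caches intermediates, not from the derivative itself.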


Is it safe to say that inv is faster with parameters because the gradient is written out by hand? Though it seems really marginal, as autodiff would only need maybe a couple more operations.

Edit: Ah yes, the division operator also has the gradient written out. Anyway, one can always profile the two to check.
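For profiling, Stan's profile blocks (available since 2.26) can time the two formulations side by side; a sketch with an arbitrary parameter and arbitrary profile names:

```stan
parameters {
  real<lower=0> x;
}
model {
  real y1;
  real y2;
  x ~ gamma(2, 2);
  profile("inv") {
    for (n in 1:1000)
      y1 = inv(x);
  }
  profile("division") {
    for (n in 1:1000)
      y2 = 1 / x;
  }
}
```

After sampling, the per-block forward and reverse pass times are reported in the profiling output (e.g. via CmdStan's profile file), so the difference, if any, shows up directly.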