Unable to integrate function obtained from algebraic solver

Hi there, I’m not being able to use Stan to fit a model where I have an integral of a function which has to be solved numerically, using Stan’s algebraic equation solver, that depends on the data, parameters and derived parameters.

I’ve thought about transforming the problem into a simpler form to show it here, but I’m afraid I miss some possibly relevant technical details, so I’ll show here the problem in it’s full form, and hopefully I’m able to make things clear.


The Model

Starting from the beginning my model has 3 parameters
\theta = \{h, \Omega_m,\Omega_r\},

and one derived parameter
\lambda = \frac{1}{2} + W_0 \left( \frac{\Omega_m + \Omega_r}{2e^{1/2}} \right).

My model tries to replicate something called the gravitational wave luminosity distance, given by d_{GW}(z), where z is the independent variable, which we call redshift.
The model states that
d_{GW}(z) = \sqrt{\frac{1-\lambda}{1-\lambda/E^2(z)}} e^{\frac{\lambda}{2} \frac{E^2(z) - 1}{E^2(z)}} \frac{3(1+z)}{h} \int_0^z \frac{1}{E(x)} dx,

where E(z) is a function which has to be solved numerically from the following equation
( E^2(z) - 2 \lambda) e^{\lambda/E^2(z)} = \Omega_m (1 + z)^3 + \Omega_r (1 + z)^4.

Observations then provide me with the redshift z for each event, it’s gravitational wave luminosity distance d_{GW}(z) and an error associated with that measurement \sigma(z).


Implementation in Stan

The data block is quite simple, and is given by

data {
  int N1;
  array[N1] real redshift;
  array[N1] real luminosity_distance;
  array[N1] real error;
}

The parameter block is quite simple as well, as I simply have to declare each of the three parameters, with the corresponding constrains

parameters {
  real<lower=0> h;
  real<lower=0,upper=1> Omega_m;
  real<lower=0,upper=1> Omega_r;
}

Assuming a Gaussian likelihood and wide Gaussian priors we have the following model block, where dGW is d_{GW}(z) given by the equation shown before

model {
  h ~ normal(0.7, 10);
  Omega_m ~ normal(0.284, 10);

  luminosity_distance ~ normal(dGW, error);
}

My problems now start in the transformed parameters block. My sequence is as follows:

  1. Compute the value of \lambda directly from \theta;
  2. Compute the value of E(z) at each redshift given by the dataset (in the code redshift[i]);
  3. Integrate 1/E(z) from 0 to redshift[i];
  4. Compute the final result for d_{GW}(z).

The following code is meant to represent that, however, it’s not fully functional

transformed parameters {
  array[N1] real dGW;  // luminosity distance for gravitational waves
  real correction;     // correction from modification to gravity
  real dL;             // background luminosity distance

  real E;       // value of E(z) evaluated at z = redshift[i]
  real lambda;  // quantity derived from Omega_m and Omega_r

  for (i in 1:N1) {
    lambda = 0.5 + lambert_w0( (Omega_m + Omega_r)/(2*exp(1/2)) );
    E = Efunc(redshift[i], theta, lambda, x_r, x_i);

    correction = ((1-lambda)/(1-lambda/E^2))^0.5 * exp(lambda/2 * (E^2 - 1)/E^2);
    dL = (1.0+redshift[i]) * (3/h) * integrate_1d(integrand, 0, redshift[i], {h, Omega_m, Omega_r}, x_r, x_i);
    dGW[i] = correction * dL
  }
}

The problems

1 - The algebraic solver - 10.1 Algebraic equation solver - takes vectors as inputs and outputs whereas the integrator - 10.4 1D integrator - uses arrays.
Obviously, I can convert one to the other, it doesn’t look good, and is quite confusing, but it works.

2 - I cannot communicate the variables redshift[i] and lambda to the algebraic system using the algebraic solver.
If I define, as specified in 10.1 Algebraic equation solver, the algebraic system which I would like to solve as Esystem, given by

// auxiliary function to compute E(z) with algebraic solver
vector Esystem(vector y, vector theta, data array[] real x_r, array[] int x_i) {
  real E = y[1];
  real Omega_m = theta[2];
  real Omega_r = theta[3];
  vector[1] output;

  // z = (?) - how do i communicate z?!
  lambda = 0.5 + lambert_w0( (Omega_m + Omega_r)/(2*exp(1/2)) );  // how do i communicate lambda to avoid computing this every time?!

  output[1] = (E^2 - 2*lambda)*exp(lambda/E^2) - Omega_m*(1+z)^3 - Omega_r*(1+z)^4
  return output
}

Which I would then call with another function

real Efunc(z, theta, lambda, x_r, x_i) {
  vector[1] y_guess;
  y_guess = 1;
  
  // annoying... algebra solver requires vector whereas integrator requires array and i declared it as an array
  vector[3] thetavec;
  thetavec[1] = theta[1];
  thetavec[2] = theta[2];
  thetavec[3] = theta[3];
  
  // how do i communicate z and lambda?!
  return algebra_solver_newton(Esystem, y_guess, thetavec, x_r, x_i)[1];
}

You can see that there is no way for me to communicate both redshift[i], which represents a given piece of data, nor lambda, a derived parameter.

3 - I cannot communicate lambda to the function integrand.
The function integrand is defined as follows:

real integrand(real x, real xc, real[] theta, real[] x_r, int[] x_i) {
  // (?) how do i get lambda over here?
  // (?) do i have to define this function? couldn't i just call the integrator in the function below?
  return (Efunc(x, theta, lambda, x_r, x_i))^(-1);
}

Although I could recompute its value, much like in the problem before, this would be extremely inefficient as it only has to be computed on every leapfrog step, not every function call.

4 - Is it possible to avoid defining the previous function and calling (Efunc(x, theta, lambda, x_r, x_i))^(-1) directly in the integrate_1d function in the transformed parameters?

5 - Regarding good practices, when should I be using a vector and when should I be using an array? As stated in problem 1, sometimes vectors are used, sometimes arrays are used. I know I can make an array of vectors, but specifically, for \theta, should I be using a vector or an array? And for my dataset?

6 - How it be smarter to define y(z) = E^2(z) and solve the algebraic system for y(z) instead of E(z) and then in the end compute its square root? Because I imagine that the algorithm will compute the square root at each step to see if it matches zero, something which is very expensive and avoidable in this case.


Any other tip, regarding optimization, good practices, etc. is more than welcome!
Thank you very much for your time, this post turned longer than I expected, and if you have any question, feel free to ask!

2 Likes

@andrjohns @betanalpha

1 Like

Are you sure you have all the signs right? That definition implies

\left(\lambda-\frac{1}{2}\right)e^{\lambda}e^{-\frac{1}{2}}=\frac{1}{2}\left(\Omega_m + \Omega_r \right) e^{-\frac{1}{2}}

but at z=0 the equation for E(z) simplifies to

\left(\frac{E^2(0)}{2} - \lambda\right) e^{\lambda/E^2(0)} = \frac{1}{2}\left(\Omega_m + \Omega_r\right)

which is just one sign change away from E(0) = 1. Of course I don’t know the relevant theory, that just seems a more natural initial condition for the integral.


The latest version of Stan has something called differential-algebraic equation solver which not only purports to solve both the algebraic relation and the differential equation simultaneously but also has simpler interface than the old algebraic solver.

I have not used the DAE solver before but let’s try and transform your problem into the appropriate form.
The redshift z acts as the “time” for the integral and you need to keep track of two states as function of time: S_1(z) = E(z) (or some function of it), and the integral of interest

S_2(z) = \int_0^z \frac{1}{E(z^\prime)}\mathrm{d}z^\prime

The Stan documentation does not really explain what the “residual function” does but from the IDAS guide I gather it defines an algebraic relation with the states and their derivatives in the same manner as algebra_solver does without the derivatives.
The two relations defined by the residual function are (1) E(z) obeys the algebraic equation and (2) the derivative of the second state is 1/E(z).

vector residual(real z, vector state, vector deriv, real lambda, real Omega_m, real Omega_r) {
  real E = state[1];
  real lhs = (E^2 - 2*lambda)*exp(lambda*E^-2);
  real rhs = Omega_m*(1+z)^3 + Omega_r*(1+z)^4;
  vector[2] res;
  res[1] = rhs - lhs; // E satisfies the algebraic relation
  res[2] = 1/E - deriv[2]; // derivative of S[2] equals 1/E
  return res;
}

Then you need the initial conditions at z=0. I already commented on the initial condition for E(z); if the signs are correct and E(0)\neq 1 then it must be obtained by solving the equation with algebra_solver. The initial value of S_2(z) is of course zero, and it’s initial derivative is 1/E(0). The initial derivative for E(z) can be found by differentiating the equation, which, if I did it right, gives

\left(2E-2\lambda E\left(z\right)^{-1}-\lambda^{2}E\left(z\right)^{-3}\right)e^{\lambda E^{-2}}\frac{\mathrm{d}E\left(z\right)}{\mathrm{d}z}=3\Omega_{m}\left(1+z\right)^{2}+4\Omega_{r}\left(1+z\right)^{3}

or at solving at z=0

\left[\frac{\mathrm{d}E\left(z\right)}{\mathrm{d}z}\right]_{z=0}=\frac{3\Omega_{m}+4\Omega_{r}}{\left(2E-2\lambda E\left(0\right)^{-1}-\lambda^{2}E\left(0\right)^{-3}\right)e^{\lambda E^{-2}}}
transformed parameters {
  array[N1] real dGW;  // luminosity distance for gravitational waves
  real lambda;  // quantity derived from Omega_m and Omega_r
  lambda = 0.5 + lambert_w0( (Omega_m + Omega_r)/(2*exp(0.5)) );
  {
    real E_0 = 1.0; // or something you get out of algebra_solver_newton(...)
    real deriv_0 = (3*Omega_r + 4*Omega_m)/((2*E_0 - 2*lambda/E_0 - lambda^2*E_0^3)*exp(lambda*E_0^-2));
    array[N1] vector[2] S = dae(residual, [E_0, 0.0]', [deriv_0, 1/E_0]', 0.0, redshift, lambda, Omega_m, Omega_r);
    for (i in 1:N1) {
      real E = S[i,1];
      real correction = ((1-lambda)/(1-lambda/E^2))^0.5 * exp(lambda/2 * (E^2 - 1)/E^2);
      dGW[i] = correction * (1.0+redshift[i]) * (3/h) * S[i,2];
    }
  }
}

Caveat lector, I didn’t test any of this code.

5 Likes

First of all I would like to thank you for your reply, and sorry for the delayed response.


Regarding the first equation, you are absolutely correct, there is a minus sign missing, \lambda shoud read instead
\lambda = \frac{1}{2} + W_0\left( - \frac{\Omega_m + \Omega_r}{2e^{1/2}} \right)

And E(0) = 1 is in fact my initial condition, and the derivative at z = 0 is given by
\left[ \frac{dE(z)}{dz} \right] \bigg\rvert_{z=0} = \frac{1}{2e^{1/2}} \frac{3\Omega_m + 4\Omega_r}{1-\lambda+2\lambda^2}

Which is slightly different from yours, as I think you made a small mistake in your derivation.


Regarding the differential-agebraic equation solver, I’m still trying to grasp the mathematics behind it, and I’m going to take my time in understanding what it’s really doing in the background.
Computing the derivative only of the integral and leaving everything else seemingly unchanged seems like some sort of black magic, so I’ll definitely look at the documentation page you have sent me.

When it comes to the code, it compiled!
I think I made a few changes to it for it to compile, and I’ll share it in a pastebin such that it can serve as a future reference - pastebin link.

However, it’s hitting a initialization issue. This is very likely an issue on my end, since for a very tiny dataset (with one observation only) it runs.
Either I’ve made a mistake deducing my equations (which is unlikely as I’ve made a different analysis in Python), my model as some issues (might be possible) or I’ve made a mistake writing the equations in the Stan code (might also be possible).
The error output is unfortunately very poor, so I’m not quite sure where it is crashing nor why.

That leaves me with just a few of questions:

  1. What are the ' after the arrays provided as arguments to the dae function? Are those to represent vectors?
  2. Why is res[2] = 1/E - deriv[2]? I imagine that I’ll be able to understand why after properly understanding the theory, but I will leave it here anyways.
  3. Is there a way to increase the verbosity or maybe compile the code with extra runtime warnings, as you can in C/C++, for example to detect out of bounds access, divisions by zero, etc.?

Just a curious fact, PyStan (PyStan v.3.4.0, and stanc --version shows v.2.29.2), which I have been using so far, works as expected, meaning that it has given me expected results in all the models I have successfully implemented so far (I don’t consider this one successful yet!).
RStan (v.2.21.5), on the other hand, refuses to compile all of my models.

Here’s a MWE of the code I’m using to run PyStan:

import numpy as np
import pandas
import stan

data = "data-fqgw/LISA-9.csv"
header = pandas.read_csv(data, comment="#", nrows=0).columns.tolist()
columns = pandas.read_csv(data, comment="#")
dic = {}
dic["N1"] = len(columns[header[0]])
for var in header:
    dic[var] = np.array(columns[var])

with open("model/LCDM.stan", "r") as file:
    program = file.read()
posterior = stan.build(program, data=dic)

fit = posterior.sample(num_chains=1, num_samples=500, num_warmup=100, init=[{"h": 0.68, "Omega_m": 0.353}], save_warmup=True)

Here’s the code I’m using to call RStan:

# import stan
library("rstan")

# set stan options
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

# take relevant datasets
data = read.csv("data-fqgw/LISA-9.csv", header=TRUE, comment="#", stringsAsFactors=FALSE)

# model file
file = "model/LCDM.stan"

# compute the fit
fit = stan(file=file, data=data)

And here’s the output of RStan, error included:

Loading required package: StanHeaders
Loading required package: ggplot2
rstan (Version 2.21.5, GitRev: 2e1f913d3ca3)
For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
 error in 'modeldd198837df8b_LCDM' at line 55, column 109
  -------------------------------------------------
    53:   real dL[N1];
    54:   for (i in 1:N1) {
    55:     dL[i] = (1.0 + redshift[i]) * (2.9979/h) * integrate_1d(integrand, 0, redshift[i], {h, Omega_m}, x_r, x_i);
                                                                                                                    ^
    56:   }
  -------------------------------------------------

PARSER EXPECTED: ","
Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model 'LCDM' due to the above error.
Calls: stan -> stan_model -> stanc
Execution halted

And here’s the model LCDM.stan mentioned before (but the same holds true for the other 2-3 models that I have):

functions{
  real Einverse(real z, real Omega_m) {
    return ( Omega_m*(1.0+z)^3 + (1.0-Omega_m) )^(-0.5);
  }

  real integrand(real x, real xc, real[] theta, real[] x_r, int[] x_i) {
    real Omega_m = theta[2];
    return Einverse(x, Omega_m);
  }
}

data {
  int N1;
  real redshift[N1];
  real luminosity_distance[N1];
  real error[N1];
}

transformed data {
  real x_r[0];
  int x_i[0];
}

parameters {
  real<lower=0> h;
  real<lower=0, upper=1> Omega_m;
}

transformed parameters {
  real dL[N1];
  for (i in 1:N1) {
    dL[i] = (1.0 + redshift[i]) * (2.9979/h) * integrate_1d(integrand, 0, redshift[i], {h, Omega_m}, x_r, x_i);
  }
}

model {
  h ~ normal(0.7, 10);
  Omega_m ~ normal(0.284, 10);

  luminosity_distance ~ normal(dL, error);
}

generated quantities {
}

And here’s the dataset LISA-9.csvthat I’m using:

redshift,luminosity_distance,error
0.2038754949680942,0.9973479941207404,0.011403328052523299
5.675310681142508,58.96651922521995,7.159404028477212
3.030345541949446,24.14217228572887,2.2627470559681755
2.9860050334331976,21.73306547023548,2.2056285915204703
1.6778857316070677,13.157249685177666,0.5156167032004003
2.953005790569175,24.94812035525647,2.163676184123924
4.78525963782338,47.42467612412258,5.191650570689001
2.9489266093242246,27.552776829211496,2.1585232894046125
1.6708384892008887,12.724806457146624,0.5107135517816915
0.4396641734861283,2.391337791340394,0.02511570986661158
6.157698655324268,78.23711970018232,8.358615565263154
3.9109078896947467,31.941566552107346,3.5717163888271806
2.0418751397599966,16.19299585677514,1.1955161147369384
1.702040724920014,12.456628330105044,0.5326028404656467
1.546993370585049,10.804560247714772,0.42843970323623926

Am I doing something wrong in RStan? I’m not planning to use it, at all, but this error looks very strange to me.

2 Likes

Goes to show I should just learn to use a computer algebra system instead of trying to work these out by hand. That said, surely that’s e^\lambda instead of e^{1/2}.

It’s the transpose operator. The square brackets [] denote row vectors but DAE expects column vectors. (Unlike R, Stan makes a distinction between arrays and different vectors.) The transpose swaps the rows and columns of a matrix, or in this case, a vector.

The solver finds a solution for which res = 0 so and deriv is the derivative of the state so that line implements \frac{1}{E}-\frac{\mathrm{d}S_2}{\mathrm{d}z} = 0 which is residual form of \frac{\mathrm{d}S_2}{\mathrm{d}z} = \frac{1}{E}, that is, S_2 = \int \frac{1}{E}\mathrm{d}z.

You may have other questions (at least, I do), like why not res[2] = deriv[2] - 1/E or what happens if you swap res[1] and res[2] or why doesn’t the function use deriv[1]. As I said, I have not used the DAE solver before but I’d guess the exact form of the residual function does not change the result (as long as the zero of the residual is the same) but does affect the time it takes to find the solution. In particular, the constraint on E also implies a constraint on the derivative of E so presumably the solver will figure it out somehow.

I don’t think so. You can add print() statements to your model to see intermediate values but that’s about it.

The DAE solver is only available in Stan 2.29 and I think even integrate_1d is more recent than 2.21. Some technical problems have prevented updating RStan on CRAN for a quite a while now so you’d need to install from a different source to get an up-to-date RStan.

I forgot to mention the DAE solver (like ODE solvers) requires the times to be in increasing order. So you need to sort the data by redshift. (or run the solver in a loop for each data point)

1 Like

Yep you’re right, third time is the charm.

I suspected so, thanks!

Okay, I’m going to have a field day properly understanding why this works.
As I said, currently it’s some sort of black magic!

That’s a shame, considering that the compiler already has built in flags for similar features, it would be cool if you could pass them directly.
Probably easier said than done, but it would be cool nonetheless!

Ah, good to know, I was trying to figure out the underlying version of Stan that RStan uses, but I failed, so I assumed it would be up to date, but I guess not. Thanks again.

Yup, that was it, it executed properly and the results seem to make sense!
Will leave it running for longer to ensure that it’s in fact recovering the values which I used to generate the data, but so far the regions are all quite tight around the expected result.