Gaussian quadrature inside Stan?

Hi everybody,

My question is: how can I do Gaussian quadrature?

I am fitting an IRT-like model.

Since I have over 10,000 subjects, I need to marginalize the likelihood

so that the number of subjects is not a problem.

And the person parameters are also nuisance parameters…

I hear it’s doable with the ODE solver,

but I am new to Stan and I wonder if somebody can help me with it…

What I need to do: with th \sim N(0, 1), compute \int f(x) p(x)\, dx, where p(x) is the normal density.

Thanks in advance

Gaussian quadrature is not exposed in Stan but the integrate_1d function does tanh-sinh quadrature which is better for a wider class of one-dimensional integrals.

You can also precalculate the nodes and weights in the software of your choice, and then provide them as data.
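For example, here is a minimal sketch of that precalculation in Python (assuming NumPy; the 15-point rule is arbitrary), including the rescaling from the Gauss-Hermite weight function \exp(-x^2) to the standard normal density:

import numpy as np

# The Gauss-Hermite rule targets the weight exp(-x^2); substituting
# x = sqrt(2) * t turns it into an integral against the N(0, 1) pdf,
# which rescales the nodes by sqrt(2) and the weights by 1/sqrt(pi).
def std_normal_quadrature(n):
    t, w = np.polynomial.hermite.hermgauss(n)
    return np.sqrt(2.0) * t, w / np.sqrt(np.pi)

nodes, weights = std_normal_quadrature(15)
print(weights.sum())             # ~1.0: the weights integrate the density
print((nodes**2 * weights).sum())  # ~1.0: E[X^2] = 1 for X ~ N(0, 1)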


Well, the problem is that the integrand is also a function of other parameters… I am wondering whether integrate_1d can handle such functions.


Sorry, I don’t understand. Given nodes and weights, Gaussian quadrature is just a weighted sum (or a dot product). When integrating

g(\lambda) = \int_{-\infty}^{\infty} f(x, \lambda) p(x) dx \approx \sum_i f(x_i, \lambda) w_i

where p(x) is the density of N(0, 1), you are free to calculate f in any arbitrary way at the fixed x_i.
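As a quick numerical check of this approximation, here is a Python sketch (assuming NumPy and SciPy; lam = 0.5 is just a hypothetical stand-in for \lambda). The nodes and weights stay fixed no matter what value \lambda takes:

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Gauss-Hermite nodes/weights rescaled for the N(0, 1) density (see above).
t, w = np.polynomial.hermite.hermgauss(15)
nodes, weights = np.sqrt(2.0) * t, w / np.sqrt(np.pi)

lam = 0.5  # stands in for the model parameter lambda
f = lambda x: 1.0 / (1.0 + np.exp(-1.7 * (x + lam)))

approx = np.dot(f(nodes), weights)  # the weighted sum g(lambda) above
exact, _ = quad(lambda x: f(x) * norm.pdf(x), -np.inf, np.inf)
print(approx, exact)  # should agree to several decimal places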

Yes, in your formula f has another variable \lambda, and in my case it is not data but a model parameter, so I have to do the integration by hand to incorporate the fact that the integral depends on model parameters…


I don’t see why \lambda cannot be a parameter.

Can you provide some Stan code (without the integral calculation part, of course) to make the discussion more concrete?


I have yet to translate the JAGS code to Stan, but in case you understand JAGS,

my initial model goes like this:

model {
  for (n in 1:N) {
    for (i in 1:I) {
      # w = 1 if skill k is required by item i but not mastered by person n
      for (k in 1:K) { w[n, i, k] <- (1 - alpha[n, k]) * Q[i, k] }
      # IRT term (logistic with the usual 1.7 scaling)
      IRT[n, i] <- 1/(1 + exp(-1.7*(th[n] + easy[i])))
      # RUM success probability: baseline * skill penalties * IRT term
      p[n, i] <- pai_star[i] * prod(pow(r_star[i, 1:K], w[n, i, 1:K])) * IRT[n, i]
      Y[n, i] ~ dbern(p[n, i])
    }
    # skill profile alpha is determined by the latent class c[n]
    for (k in 1:K) { alpha[n, k] <- all.patterns[c[n], k] }
    c[n] ~ dcat(pai[1:C])
  }

  pai[1:C] ~ ddirch(delta[1:C])

  for (n in 1:N) {
    th[n] ~ dnorm(0, 1)
  }
  for (i in 1:I) {
    pai_star[i] ~ dbeta(a.pai_star, b.pai_star)
    easy[i] ~ dunif(-3, 3)
    for (k in 1:K) {
      r_star[i, k] <- xr_star[i, k] * Q[i, k]
      xr_star[i, k] ~ dbeta(a.xr_star, b.xr_star)
    }
  }
}

This is called the “reparameterised unified model” (RUM), a.k.a. the Fusion model, if you are interested. And I want to marginalise the likelihood over th so that having so many subjects is not a problem.

By “marginalization” I mean what people do for frequentist mixed-effects models: rather than estimating each individual group effect, they estimate only its variance, given that the individual effects are normally distributed.

So for my model,

IRT[n, i] <- 1/(1+exp(-1.7*(th[n] + easy[i])))

should be replaced by

\int_{-\infty}^{\infty} \frac{1}{1+\exp(-1.7 \times (\theta + easy[i]))}\, \phi(\theta)\, d\theta

where \phi(\theta) is the standard normal density.

Actually, the model above also has individual alpha[n, k] (where n is the individual index), but I also need to marginalise that, as in (https://journals.sagepub.com/doi/10.1177/0146621617707511).

Most of the time, delta[1:C] is just a vector of ones, and the rows of all.patterns are the vectors in \{0,1\}^K.

PS. The distribution of \theta can be parameterised as \theta \sim \mathcal{N}(0, \sigma_\theta^2), and the logistic function can be approximated by the normal CDF.
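For reference, this is the classic role of the 1.7 scaling constant:

\frac{1}{1+e^{-1.7\theta}} \approx \Phi(\theta), \qquad \max_\theta \left| \frac{1}{1+e^{-1.7\theta}} - \Phi(\theta) \right| < 0.01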

I don’t know much JAGS but I think I can translate your model to Stan.
First, let’s just write the type declarations and create some fake Stan code that closely resembles the original JAGS.

data {
  int<lower=1> N;
  int<lower=1> I;
  int<lower=1> K;
  int<lower=1> C;
  vector<lower=0>[C] delta;
  real<lower=0> a_pai_star;
  real<lower=0> b_pai_star;
  real<lower=0> a_xr_star;
  real<lower=0> b_xr_star;
  int<lower=0,upper=1> Q[I,K];
  int<lower=0,upper=1> all_patterns[C,K];
  int<lower=0,upper=1> Y[N,I];
} parameters {
  simplex[C] pai;
  vector[N] th;
  real<lower=0,upper=1> pai_star[I];
  real<lower=-3,upper=3> easy[I];
  real<lower=0,upper=1> xr_star[I, K];
  int<lower=1,upper=C> c[N];
} transformed parameters {
  real r_star[I,K] = xr_star .* Q;
  int alpha[N,K] = all_patterns[c];
  int w[N,I,K] = (1 .- alpha) .* Q;
  real IRT[N,I];
  real p[N,I];
  for (n in 1:N) for (i in 1:I) {
    IRT[n,i] = inv_logit(1.7*(th[n] + easy[i]));
    p[n,i] = pai_star[i] * IRT[n,i];
    for (k in 1:K)
      p[n,i] *= pow(r_star[i,k],w[n,i,k]);
  }
} model {
  c ~ categorical(pai);
  pai ~ dirichlet(delta);
  th ~ std_normal();
  pai_star ~ beta(a_pai_star, b_pai_star);
  xr_star ~ beta(a_xr_star, b_xr_star);
  easy ~ uniform(-3, 3);
  Y ~ bernoulli(p);
}

This will not compile because Stan does not allow discrete parameters, and there are some vectorization mistakes, but it should be straightforward to compare with the JAGS code. The next step is to merge the model and transformed parameters blocks into a single model block and move all variables into the innermost loop possible.

model {
  for (n in 1:N) {
    for (i in 1:I) {
      real r_star[K] = xr_star[i] .* Q[i];
      int alpha[K] = all_patterns[c[n]];
      int w[K] = (1 .- alpha) .* Q[i];
      real IRT = inv_logit(1.7*(th[n] + easy[i]));
      real p = pai_star[i] * IRT;
      for (k in 1:K)
        p *= pow(r_star[k], w[k]);
      Y[n,i] ~ bernoulli(p);
    }
    c[n] ~ categorical(pai);
  }
  pai ~ dirichlet(delta);
  th ~ std_normal();
  pai_star ~ beta(a_pai_star, b_pai_star);
  xr_star ~ beta(a_xr_star, b_xr_star);
  easy ~ uniform(-3, 3);
}

Then simplify the code and eliminate some variables. When Q=0 you have pow(0,0), which I assume evaluates to 1 in JAGS. Also, here I replace the sampling statements with target += in preparation for the last step.

model {
  for (n in 1:N) {
    for (i in 1:I) {
      real IRT = inv_logit(1.7*(th[n] + easy[i]));
      real p = pai_star[i] * IRT;
      for (k in 1:K)
        if (all_patterns[c[n],k] == 0 && Q[i,k] == 1)
          p *= xr_star[i,k];
      target += bernoulli_lpmf(Y[n,i]|p);
    }
    target += categorical_lpmf(c[n]|pai);
  }
  pai ~ dirichlet(delta);
  th ~ std_normal();
  pai_star ~ beta(a_pai_star, b_pai_star);
  xr_star ~ beta(a_xr_star, b_xr_star);
  easy ~ uniform(-3, 3);
}

As I mentioned, Stan cannot sample discrete parameters, therefore we must marginalize out c. For each subject, that means summing over the C latent classes:
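p(Y_{n,1:I}) = \sum_{c=1}^{C} \pi_c \prod_{i=1}^{I} p(Y_{n,i} \mid c)

On the log scale this is a log_sum_exp of the per-class log joints, which is what the code below accumulates. The marginalized model is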

data {
  int<lower=1> N;
  int<lower=1> I;
  int<lower=1> K;
  int<lower=1> C;
  vector<lower=0>[C] delta;
  real<lower=0> a_pai_star;
  real<lower=0> b_pai_star;
  real<lower=0> a_xr_star;
  real<lower=0> b_xr_star;
  int<lower=0,upper=1> Q[I,K];
  int<lower=0,upper=1> all_patterns[C,K];
  int<lower=0,upper=1> Y[N,I];
} parameters {
  simplex[C] pai;
  vector[N] th;
  real<lower=0,upper=1> pai_star[I];
  real<lower=-3,upper=3> easy[I];
  real<lower=0,upper=1> xr_star[I,K];
} model {
  for (n in 1:N) {
    real lp1[C];  // per-class log joint probability for subject n
    for (c in 1:C) {
      real lp2[I+1];
      for (i in 1:I) {
        real IRT = inv_logit(1.7*(th[n] + easy[i]));
        real p = pai_star[i] * IRT;
        for (k in 1:K) if (all_patterns[c,k] == 0 && Q[i,k] == 1)
          p *= xr_star[i,k];
        lp2[i] = bernoulli_lpmf(Y[n,i]|p);  // item likelihood given class c
      }
      lp2[I+1] = categorical_lpmf(c|pai);   // prior mass of class c
      lp1[c] = sum(lp2);
    }
    target += log_sum_exp(lp1);  // sum over classes on the log scale
  }
  pai ~ dirichlet(delta);
  th ~ std_normal();
  pai_star ~ beta(a_pai_star, b_pai_star);
  for (i in 1:I)
    xr_star[i] ~ beta(a_xr_star, b_xr_star);
  easy ~ uniform(-3, 3);
}

The simplification removed alpha from the output, but I don’t think you can marginalize th.


I may be missing something, but I am assuming that you want th not as a parameter, but as the variable of integration in the expression you quoted,

multiplied by the standard normal density in th.

Then, as I said, you want to pass in th_weights and th_nodes as data for Gaussian quadrature (note that the standard normal pdf does not exactly coincide with the Gauss-Hermite weight function; there is a scale adjustment), and simply evaluate

dot_product(1/(1+exp(-1.7*(th_nodes + easy[i]))), th_weights)

which I think should vectorize.

Wow, I think this would work.

I could not see it that way, thank you!


Very minor, but you could also simplify that further to:

dot_product(inv_logit(1.7*(th_nodes + easy[i])), th_weights)

since the inv_logit function is also built to handle extreme values better than hand-coding the math.
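For example, in Python (assuming SciPy, whose expit is the analogue of inv_logit):

import numpy as np
from scipy.special import expit  # numerically stable inverse logit

x = -500.0
print(1.0 / (1.0 + np.exp(-x)))  # exp(500) overflows: warning, returns 0.0
print(expit(x))                  # ~7.1e-218: underflows gracefully instead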
