Coding a custom density with discontinuities

Steven_Xu · April 13, 2019, 3:43am

Hi guys. I am fairly new to Stan, so this might be a really elementary question. Is it possible to write a custom density function that contains discontinuities and use it to model a likelihood? I have searched the forum and did not find any relevant questions.

For my problem I have checked to ensure my custom likelihood is grammatically correct in the sense that it can correctly calculate the density values of given data points when using fix parameters. Now I am trying to see if I assign some parameters to be unknown and feed my density the simulated data, it can give me back the correct values of the parameters. However, when I try to run RStan it gives me the error of

Chain 1:   Rejecting initial value:
Chain 1:   Gradient evaluated at the initial value is not finite.
Chain 1:   Stan can't start sampling from this initial value.

I know this is probably due to the reason that my density contains discontinuities, so do I have to write my own analytical gradient function? I can paste my code if necessary. Thank you in advance for your help!

sakrejda · April 13, 2019, 2:32pm

What’s the source of discontinuities? The answer is problem dependent. By default the integrator won’t play nice with step functions, could be ok with discontinuities in the gradient sometimes, but solutions are problem specific.

Steven_Xu · April 13, 2019, 8:06pm

I am doing simultaneous quantile regression by modelling the conditional quantile function piece-wise over L intervals separated by breakpoints 0<\kappa_1<\kappa_2<...<\kappa_L =1, i.e. a fixed basis expansion that takes the form

q(\tau|\mathbf{x}_i) = \sum_{l=1}^{L}\left\{\alpha_{0l}+\alpha_{1l}x_i\right\}B_{l}(\tau)

where B_{l}(\tau) is a function of the quantile function of a base density. This leads to my density on y being

f_Y(y|\mathbf{x})=\sum_{l=1}^L\frac{I\big[q(\kappa_{l-1}|\mathbf{x})<y\le q(\kappa_{l}|\mathbf{x})\big]}{\alpha_{0l}+\alpha_{1l}x}f_0\left(\frac{y-I(l>1)\left[q(\kappa_{l-1}|\mathbf{x})-(\alpha_{0l}+\alpha_{1l}x)q_0(\kappa_{l-1})\right]}{\alpha_{0l}+\alpha_{1l}x}\right)

functions{
    matrix thresh(matrix alpha_star, real eps){
      int M = cols(alpha_star);
      int N = rows(alpha_star);
      matrix[N,M] alpha_temp;
      for(l in 1:M){
        real WC = alpha_star[1,l]-sum(fabs(alpha_star[2:N,l]));
        if(WC < eps){
          alpha_temp[1,l] = eps;
          alpha_temp[2:N,l] = rep_vector(0,N-1);
        }else{
          alpha_temp[,l] = alpha_star[,l];
        }
      }
      return alpha_temp;
    }
    int which_less(row_vector x, real a){
      int loc = 1;
      while((loc<=num_elements(x))&&(x[loc]<a)){
        loc += 1;
      }
      return loc-1;
    }
    real bqrnn_lpdf(real y, row_vector u_vec, matrix alpha_star, matrix basis_m, real q_base){
        real den;
        real lden;
        real u_a;
        matrix[rows(alpha_star),cols(alpha_star)] alpha;
        row_vector[cols(basis_m)] q_model;
        alpha = thresh(alpha_star,0.1);
        q_model = (u_vec*alpha)*basis_m;
        if(y<=q_model[1]){
          u_a = u_vec*alpha[,1];
          den = 1/u_a*exp(normal_lpdf((y/u_a)|0,1)); 
        }else{
          int idx = which_less(q_model,y);
          u_a = u_vec*alpha[,idx+1];
          den = 1/u_a*exp(normal_lpdf(((y-q_model[idx]+u_a*q_base)/u_a)|0,1));
        }
        lden = log(den);
        print(lden);
        return lden;
    }
}

The thresh function is a constraint function to ensure the monotonicity of quantile function. u_vec represents \begin{pmatrix}1 & x_i\end{pmatrix}.

sakrejda · April 13, 2019, 8:46pm

Are the alphas and x's parameters here or just the alphas?

Steven_Xu · April 13, 2019, 8:49pm

Just the alphas.

sakrejda · April 13, 2019, 9:58pm

Given that the alphas are only creating discontinuities by changing the membership of a given y in a given l I’m pretty sure this can be written as a mixture (in terms of Pr[y \in l = 1, ..., L] rather than in terms of indicator functions. In that case it’s not discontinuous in \alpha and this should be fine. Have you gone down that road yet?

Steven_Xu · April 13, 2019, 10:30pm

I haven’t. I will give it a try. Thanks for the advice!

Steven_Xu · April 14, 2019, 3:07am

If I am understanding correctly, my goal is to write f_Y(y|\mathbf{x}) as \sum_{l=1}^L\text{Pr}(q(\kappa_{l-1}|\mathbf{x})<y\le q(\kappa_l|\mathbf{x}))f_l(y|\mathbf{x}) ? The weights in my problem are actually just \kappa_1,\kappa_2-\kappa_1,\cdots, 1-\kappa_{l-1}. But is there a way to build direct connection between f_l(y|\mathbf{x}) and the original pdf?

sakrejda · April 14, 2019, 6:57pm

From your equation I see that alpha contributes to q so I’m not sure how kappa alone can play that role.

Steven_Xu · April 14, 2019, 7:20pm

Integrating the above density over y gives the CDF

F_Y(y|\mathbf{x})=\begin{cases} F_0\left(\frac{y}{\alpha_{01}+\alpha_{11}x}\right), \ q(\kappa_0|\mathbf{x})<y\le q(\kappa_1|\mathbf{x})\\\\ F_0\left(\frac{y-q(\kappa_{l-1}|\mathbf{x})+(\alpha_{0l}+\alpha_{1l}x)q_0(\kappa_{l-1})}{\alpha_{0l}+\alpha_{1l}}x\right), \ q(\kappa_{l-1}|\mathbf{x})<y\le q(\kappa_l|\mathbf{x}) \end{cases}

I should mention that

\begin{align*} B_{1}(\tau) &= \begin{cases} q_0(\tau), \ \ \ \tau\le \kappa_1 \\ q_0(\kappa_1), \ \tau>\kappa_1 \end{cases} \ \ \ and \\\\ B_{l}(\tau)&= \begin{cases} 0, \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \tau\le\kappa_{l-1} \\ q_0(\tau)-q_0(\kappa_{l-1}), \ \ \kappa_{l-1}<\tau\le\kappa_{l} \\ q_0(\kappa_l)-q_0(\kappa_{l-1}), \ \tau>\kappa_l \end{cases} \end{align*}

therefore Pr (y\le q(\kappa_l|\mathbf{x})) for example is F_0\left(\frac{[\alpha_{01}+\alpha_{11}x]q_0(\kappa_1)}{\alpha_{01}+\alpha_{11}x}\right)=F_0(q_0(\kappa_1))=\kappa_1.

Topic		Replies	Views
Discontinuous likelihood and initialization failure Modeling rstan , initialization	4	597	July 7, 2022
Obtain gradient of function (e.g. multi_student_t_lpdf) within Stan? Estimating approximate density functions Modeling	1	366	January 23, 2020
Differentiating conditionals and while loops Algorithms	13	4687	August 10, 2018
Yet another case of (Rejecting initial value: Gradient evaluated at the initial value is not finite.) Modeling	3	515	July 18, 2022
Fitting a known piecewise linear function with unknown offset Modeling	21	2245	November 23, 2020

Coding a custom density with discontinuities

Related topics