Generating random values from a user-defined distribution

Hello, everyone!

I am a fresher about Stan, especially Rstan. Recently, I have been thinking about a question which puzzled me a lot:

Suppose X\sim f(x) where f(x) is the probability density function of r.v. X, which does not have the usual distributions expressions, such as Normal, Gamma, but a complex expression. And now I want to sample some random values from this distribution. How can I finish this job?

In Rstan, I set the stan code like this:

parameters {
  real x;
}
model {
  x ~ f(x);
}

But it doesn’t work. Can anyone help me?
I am really appreciate for your kind help!!

In the function declaration call it f_lpdf to use the ~ syntax with custom distributions.

1 Like

It is much obliged for your reply. And according to your answer, i rewrite some codes with sampling from a user defined distribution. Is this right?

code='
  functions {
     real uf_lpdf(real y) {
         return 1/3*exp(-1/3*y);
     }
  }
  parameters {
     real<lower=0> k;
  }
  model {
    k ~ uf_lpdf( );
  }'
fit <- stan_model(model_code=code)

Thanks a lot!

If I recall correctly, you don’t need the _lpdf in the model block. Your Stan model would look like:

functions {
  real uf_lpdf(real y) {
    return ((1/3) * exp(-(1/3) * y));
  }
}

data {
}

parameters {
  real <lower = 0> k;
} 

model {
  k ~ uf();
}
1 Like

Thanks, and I changed my Stan code based on your idea. However, some new problems arose~

You see that Y \sim exp(\lambda) where \lambda=3 in my Stan code.

### Trunk A
functions {
    real uf_lpdf(real y) {
    return (1/3 * exp(-(1/3) * y));
  }
}

So equivalently, results with above code should be the same with the following code:

##Trunk B
functions {
    real uf_lpdf(real y) {
    return exponential(y | (1/3));
  }
}

But, the code with Trunk A has some warnings:

1: There were 5150 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
2: Examine the pairs() plot to diagnose sampling problems

and the outcome is unbelievable:

       mean se_mean   sd          2.5%           25%
k    9.075119e+307     NaN  Inf 5.540203e+306 4.726369e+307
lp__  7.088200e+02    0.03 0.94  7.063000e+02  7.084500e+02

while the outcome with Trunk B is normal:


      mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
k     0.34    0.01 0.34  0.01  0.09  0.23  0.48  1.25  1332 1.00
lp__ -1.59    0.03 0.82 -3.85 -1.80 -1.28 -1.06 -1.00   767 1.01

Do you know why the outcomes are so different?

This is the probability density function, not the log probability density function. The unnormalised log probability density function of the exponential distribution with rate parameter \lambda = 3 is:

functions { 
  real uf_lpdf(real y) { 
  return (-(1/3) * y); 
  }
}
2 Likes

I got it!
Thanks a lot!!