To familiarize myself with Stan and Bayesian inference, I'm trying to fit a triangle distribution to normally distributed data and infer the a and b parameters (just a toy project):
Triangle(y | a, b) = 2/(b-a) * (1 - |y - (a+b)/(b-a)|)
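As a side note, this log-density is only finite where the absolute-value term stays below 1. A minimal Python transcription of the formula as coded (the function name and test points are mine, not from the thread) makes that visible:

```python
import math

# Minimal transcription of the density as written in the Stan model below.
def triangle_lpdf(x, a, b):
    z = abs(x - (a + b) / (b - a))
    if z >= 1:
        # log1m(z) is undefined for z >= 1: the density is zero or negative
        # here, which is where Stan's sampler starts rejecting proposals
        return float("-inf")
    return math.log(2 / (b - a)) + math.log1p(-z)

print(triangle_lpdf(0.0, -2.0, 2.0))  # log(0.5): finite, x is in the support
print(triangle_lpdf(1.0, -2.0, 2.0))  # -inf: |1 - 0| = 1 falls outside it
```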
STAN CODE
functions {
  real triangle_lpdf(real x, real a, real b) {
    return log(2 / (b - a)) + log1m(fabs(x - (a + b) / (b - a)));
  }
}
data {
  int N;
  real x[N];
}
parameters {
  real a;
  real b;
}
model {
  a ~ normal(-2, 0.5);
  b ~ normal(2, 0.5);
  for (n in 1:N)
    //target += log(2/(b-a)) + log1m(fabs(x[n]-(a+b)/(b-a)));
    target += triangle_lpdf(x[n] | a, b);
}
R CODE
library(rstan)
data = list(x = rnorm(100, 0, 0.3), N = 100)
fit = stan(file = "TestTriangle.stan", data = data, chains = 1)
RESULT
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
a 0.0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 28 1.04
b 0.0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 14 1.08
lp__ 7807.7 6.25 19.67 7766.87 7814.33 7814.33 7814.33 7814.33 10 1.12
while I would expect something like:
-1<a<0
0<b<1
I understand this is more of a didactic question; nevertheless, I'd like some advice on why my assumptions are not correct.
Thank you
Having a non-working model is a great exercise for understanding Stan. I suggest plotting the traces of a and b next to see if something obvious went wrong. The low n_eff (effective sample size; in a perfect sampler this would be close to the number of iterations) is a hint that something did go wrong.
@sakrejda thank you for the suggestion.
I suppose the problem is that the following condition must hold:
fabs(x - (a+b)/(b-a)) <= 1 (Equ.1)
for each x.
The problem is that it's not clear to me how I can constrain parameters a and b in order to satisfy (Equ.1).
I know that I can write statements like:
parameters {
  real<lower=1> b;
  real<upper=b> a;
}
But this is not sufficient.
How can I express (Equ.1)?
Think about max(x) and min(x).
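A short Python sketch spelling out that hint (the helper and data are made up, not from the thread): since x - (a+b)/(b-a) is increasing in x, (Equ.1) holds for every data point exactly when it holds at the two extremes min(x) and max(x):

```python
# Hypothetical helper: (Equ.1) needs checking only at the data extremes,
# because x - (a+b)/(b-a) is monotone increasing in x.
def support_ok(xs, a, b):
    c = (a + b) / (b - a)
    return abs(min(xs) - c) <= 1 and abs(max(xs) - c) <= 1

xs = [-0.8, -0.1, 0.3, 0.7]       # made-up data on the same scale
print(support_ok(xs, -2.0, 2.0))  # True: c = 0, both extremes within 1
print(support_ok(xs, 0.9, 2.0))   # False: c is about 2.64, min(xs) is too far
```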
@bgoodri Thank you!
I set:
parameters {
  real<upper=min(x)> a;
  real<lower=max(x)> b;
}
and I'm getting what I expected (still some rejected samples, but far fewer).
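Those remaining rejections are consistent with the math: the declared bounds a < min(x) and b > max(x) shrink but don't eliminate the region where (Equ.1) fails. A quick Python check with made-up values (not actual posterior draws):

```python
# Stand-in data on roughly the same scale as rnorm(100, 0, 0.3).
xs = [-0.8, -0.3, 0.0, 0.4, 0.9]

def violates(x, a, b):
    # argument of log1m in the model; >= 1 means the log density is undefined
    return abs(x - (a + b) / (b - a)) >= 1

a1, b1 = -0.9, 1.0   # a < min(xs), b > max(xs), roughly symmetric
a2, b2 = -0.85, 3.0  # also respects the declared bounds, but very asymmetric

print(any(violates(x, a1, b1) for x in xs))  # False: every point in support
print(any(violates(x, a2, b2) for x in xs))  # True: the shift pushes min(xs) out
```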
The triangle density does have some problems with tails being too light to sample correctly, but I don't recall whether it's supposed to throw divergences. If you compare expected to observed tail probabilities, they should be off (but that requires many chains, each run for many iterations, so it's usually not straightforward to measure). Just mentioning it since you're starting out: the model has a known issue.
Right. We never actually use a triangle distribution for anything other than illustrative purposes. We much prefer softer tails and continuously differentiable densities—they tend to work better both statistically and computationally once you get outside the central trunk of the distribution.
Bob, Sakrejda,
thank you for the clarification.
I knew there were some problems with the triangle distribution, but I'm just playing with Stan in order to familiarize myself with it. In this respect, I find that trying out "exotic" things helps me better understand what I'm dealing with.
So far the community has been very helpful.
Thank you!