Hi,
In my toy example there are two processes that have a gamma distribution (with the same scale 1), and in every trial the one with the largest outcome is “selected”. In R, the process would look like this:
N <- 1000
alpha1 <-4
alpha2 <-3
y1 <- rgamma(N,alpha1)
y2 <- rgamma(N,alpha2)
And my data would look like this:
data <- list(N=N, y = pmax(y1, y2))
I want to estimate alpha1
and alpha2
.
My logic was the following, the proportion of time that y1 was selected is:
mean((y1/y2) >1 )
# 0.642
But this can also be estimated analytically with the Beta prime distribution:
library("extraDistr")
pbetapr(1,alpha2,alpha1)
# 0.65625
So I’m trying to use the cdf of the beta prime distribution (betapr_cdf in my model
) to estimate the mixing proportions, and my idea is that when gamma1 is selected then gamma2 is “censored”, it had a smaller value and then I use the cdf of the gamma, and the other way around when gamma1 is selected.
I’m writing because obviously, it didn’t work!
(I’m starting with super strong priors, but it doesn’t work anyways).
functions {
real betapr_cdf(real x, real alpha, real beta){
return inc_beta(alpha, beta,x/(1+x));
}
}
data {
int N;
real<lower=0> y[N];
}
parameters {
real<lower=0> alpha1;
real<lower=0> alpha2;
}
transformed parameters{
// Probability of selecting P1:
real <lower= 0, upper=1> P1 = betapr_cdf(1,alpha2, alpha1);
}
model {
alpha1~ normal(4,.1);
alpha2 ~ normal(3,.1);
for (n in 1:N){
target += log_mix(P1, gamma_lpdf(y[n]|alpha1,1) + gamma_lcdf(y[n]|alpha2,1),
gamma_lpdf(y[n]|alpha2,1) + gamma_lcdf(y[n]|alpha1,1));
}
}
The problem is that it doesn’t even start:
out <- stan("rate.stan", data=data, chains =1,iter=500)
Chain 1: Rejecting initial value:
Chain 1: Gradient evaluated at the initial value is not finite.
Chain 1: Stan can't start sampling from this initial value.
Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
error occurred during calling the sampler; sampling not done
And when I print different parts of the model I noticed that P1
is stuck at 0
in the initialization, but it shouldn’t be because alpha1 and alpha2 have sensible values in the same iteration!
So I’m stuck!
(I’ve already checked that betapr_cdf
is working fine…)
Thanks!