Hi.
I’m trying to fit a mixture of two log-logistic distributions to
a small dataset. The data are in terms of a minimum and maximum
possible interval for each record, so for each set of parameters
there is a probability of the actual time period falling within
that interval.
I have some R, stan code and data is included below.
Example outputs from a run:
Inference for Stan model: mixll.
10 chains, each with iter=50000; warmup=25000; thin=50;
post-warmup draws per chain=500, total post-warmup draws=5000.
mean se_mean sd 2.5% 97.5% n_eff Rhat
dist_shape_o[1] 0.34 0.00 0.05 0.25 0.45 4410 1
dist_shape_o[2] 0.56 0.00 0.21 0.16 0.74 3318 1
dist_scale_o[1] 0.97 0.00 0.29 0.73 1.71 3711 1
dist_scale_o[2] 3.67 0.00 0.11 3.29 3.73 1759 1
dist_shape[1] 3.09 0.04 1.84 2.22 4.02 1914 1
dist_shape[2] 2.26 0.02 1.41 1.35 6.20 3713 1
dist_scale[1] 19.40 0.11 6.92 14.51 38.80 3808 1
dist_scale[2] 275.60 0.58 24.68 187.62 292.28 1797 1
theta 0.36 0.00 0.10 0.18 0.58 3105 1
lp__ -109.57 0.06 3.48 -116.52 -103.32 3220 1
I use a long warmup period and only keep every 50th sample, but
my effective number of samples seems relatively low to me.
There is a (possibly unnecessary) level of complexity in there
as I use the mean and variance of a set of parameters from
another published analysis as priors. In this previous analysis
the parameterisation of the log-logistic was different and the
data were stored as weeks, rather than days. I transform the
parameters to account for this. Not being completely confident
about the applicability of these priors, I also used the
t distribution rather than the Normal, giving more weight to
the tails.
Does anyone have any suggestions for improving my analysis,
for example better priors or some alternate parameterisation which
might reduce the correlations between the parameters.
Or is the behaviour purely a result of my inadequate dataset?
Sorry for the vague query, I wasn’t sure where to focus it.
Thanks,
Ron.
Code:
library(rstan)
stan_code <- '
/*
Mixture of two log-logistic distributions fitted
to the available interval data
*/
functions {
real llogistic_lcdf( real x, real scale, real shape ){
return -log1p_exp(lmultiply(-shape,x/scale));
}
}
data {
int<lower=1> N; // number of data points
real<lower=1> x_lower[N]; // interval observations: lower
real x_upper[N]; // & upper
}
parameters {
real<lower=0,upper=1> theta;
/* Parameters on scale of previous analysis */
vector<lower=0>[2] dist_scale_o;
vector<lower=0>[2] dist_shape_o;
}
transformed parameters {
vector[2] dist_scale;
vector[2] dist_shape;
vector[2] lp_parts[N];
vector[2] log_theta;
/* Convert from FAdist::pllog parameterisation with data in weeks
to current parameterisation and data in days */
dist_scale = exp(dist_scale_o)*7;
dist_shape = 1.0 ./ dist_shape_o;
log_theta[1] = log(theta);
log_theta[2] = log1m(theta);
for( i in 1:N ){
for( j in 1:2 ){
lp_parts[i,j] = log_theta[j] +
log_diff_exp( llogistic_lcdf( x_upper[i] | dist_scale[j], dist_shape[j] ),
llogistic_lcdf( x_lower[i] | dist_scale[j], dist_shape[j] ) );
}
}
}
model {
theta ~ beta(3,3);
/* Mean and s.d. of results of old analyses as priors,
student_t used to make it fat-tailed to make them less strong */
dist_scale_o[1] ~ student_t(1,0.78,0.0175); // df,mu,sigma
dist_scale_o[2] ~ student_t(1,3.7,0.0065);
dist_shape_o[1] ~ student_t(1,0.33,0.0157);
dist_shape_o[2] ~ student_t(1,0.73,0.0026);
for( i in 1:N) target += log_sum_exp(lp_parts[i]);
}
'
sm <- stan_model(model_code=stan_code,model_name='mixll')
n_warm_up <- 25000
n_samples <- 5000
n_thin <- 50
n_chains <- 10
n_samples_per_chain <- as.integer(n_samples/n_chains)
dl <- structure(list(N = 43L, x_lower = c(1, 1, 2, 1, 1, 1, 4, 6, 5,
14, 12, 16, 21, 24, 20, 7, 25, 37, 52, 57, 66, 59, 74, 102, 28,
91, 186, 197, 197, 202, 232, 243, 239, 245, 263, 273, 280, 301,
308, 312, 345, 482, 517), x_upper = c(134, 41, 170, 1121, 164,
18, 39, 48, 61, 28, 26, 34, 35, 32, 48, 511, 60, 289, 108, 141,
94, 171, 116, 158, 393, 231, 326, 365, 225, 286, 260, 271, 575,
329, 291, 393, 308, 315, 476, 340, 359, 762, 531)), .Names = c("N",
"x_lower", "x_upper"))
il <- list( theta=0.5, dist_scale_o=c(0.78,3.7), dist_shape_o=c(0.33,0.73) )
iln <- rep(list(il),n_chains)
smo <- sampling( sm, data=dl, init=iln, chains=n_chains,
pars=c("dist_shape_o","dist_scale_o",
"dist_shape","dist_scale","theta"),
warmup=n_warm_up, iter=n_warm_up+(n_samples_per_chain*n_thin),
thin=n_thin, cores=n_chains,
control = list(max_treedepth = 15) )