I’m trying to fit a hierarchical model with a Categorical likelihood and some hierarchical Dirichlet priors. Across quite a number of models, my experience has been that vectorising the Dirichlet statements makes sampling much faster, even if it sometimes makes the code uglier.
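Schematically, this is the kind of rewrite I mean (a minimal sketch with placeholder names, not my actual model):

```stan
data {
  int<lower=1> K;                     // number of Dirichlet statements
  int<lower=2> V;                     // dimension of each simplex
  array[K] vector<lower=0>[V] alpha;  // prior counts
}
parameters {
  array[K] simplex[V] theta;
}
model {
  // non-vectorised: one statement per group
  // for (k in 1:K)
  //   theta[k] ~ dirichlet(alpha[k]);

  // vectorised: a single statement over the whole array
  theta ~ dirichlet(alpha);
}
```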
I’ve now encountered a case where sampling is around 10–20% slower with the vectorised version than with the non-vectorised one. I’ve tried to make sense of this but couldn’t; it seems unlikely to me that the overhead of filling the arrays for the vectorised statement could be that significant.
I might be overlooking something here, so I’ll upload the two model versions.
By the way, I would really love to see the Categorical distribution vectorised as well, but the related issue seems inactive. Right now I have to rely on a for loop, since the right-hand side differs for every call.
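For concreteness, this is the loop I mean (again a sketch with placeholder names): categorical is vectorised over the outcomes, but not over a per-observation probability vector, so each observation needs its own statement.

```stan
data {
  int<lower=1> N;                        // observations
  int<lower=1> K;                        // groups
  int<lower=2> V;                        // categories
  array[N] int<lower=1, upper=V> y;      // outcomes
  array[N] int<lower=1, upper=K> group;  // group index per observation
}
parameters {
  array[K] simplex[V] phi;
}
model {
  // the probability vector differs per observation, so this
  // cannot be collapsed into a single vectorised statement
  for (n in 1:N)
    y[n] ~ categorical(phi[group[n]]);
}
```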
I also wanted to ask whether somebody sees a smarter way, starting from the non-vectorised version, to reduce the two-dimensional arrays theta and phi (appending one to the other) so that I can feed them into two, ideally one, Dirichlet statement(s) without manually recreating the lhs/rhs arrays.
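What I mean by manually recreating the arrays is roughly the following (a sketch under the assumption that theta and phi are simplexes of the same dimension; all names are placeholders):

```stan
data {
  int<lower=1> K1;
  int<lower=1> K2;
  int<lower=2> V;                            // shared simplex dimension
  array[K1] vector<lower=0>[V] alpha_theta;  // prior counts for theta
  array[K2] vector<lower=0>[V] alpha_phi;    // prior counts for phi
}
parameters {
  array[K1] simplex[V] theta;
  array[K2] simplex[V] phi;
}
model {
  // copy everything into fresh arrays just so that a single
  // dirichlet statement can cover both theta and phi
  array[K1 + K2] vector[V] lhs;
  array[K1 + K2] vector[V] rhs;
  for (k in 1:K1) {
    lhs[k] = theta[k];
    rhs[k] = alpha_theta[k];
  }
  for (k in 1:K2) {
    lhs[K1 + k] = phi[k];
    rhs[K1 + k] = alpha_phi[k];
  }
  lhs ~ dirichlet(rhs);
}
```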
Both versions are fit with the same call:

```r
model$sample(
  data = stan_data,
  chains = 4,
  iter_warmup = 1000,
  iter_sampling = 2000,
  refresh = 50,
  parallel_chains = 4,
  save_warmup = TRUE,  # was `1`; cmdstanr expects a logical here
  seed = 1458L
)
```
This runs with 4 parallel chains and on my machine takes around 640 seconds for the vectorised version vs. 560 seconds for the non-vectorised one.
vectorised.stan (1.5 KB)
non-vectorised.stan (1.0 KB)
baseline_stan_data.rdata (696 Bytes)
Please note: I had to rename the .rds file to .rdata because of an upload restriction.