The Stan manual has a section on speeding up the estimation for zero-inflated model. Basically you would break up the data into the zeros and non-zeros, and vectorizing the corresponding likelihood calculations. Give that a try. Best of luck

Looks like you are fitting a very large dataset - what is your N,K,J? Before optimizing too hard, I would try to fit a subset of the data (potentially filtering for rows that have a fixed value of a predictor and thus also reducing the model) and check that the model actually fits the data well and has no convergence problems. It is always frustrating to burn a lot of time and power on a model that ends up being discarded. Once you are sure the model is OK (including priors as @Dirk_Nachbar suggests), @sonicking 's suggestion is likely to help and you might also benefit from within-chain paralellization with reduce_sum if you have more cores to throw at the problem.

@martinmodrak My dataset is in fact very large 623,438 observations. But in this case, I was testing the model on a subset of 5,000 observations. So here N is 5,000, K is 3 and J is 15. The gradient times are large despite the subset. I tried fitting stronger priors such as normal(0,10) and normal(0,5) but it only increased the gradient time. So am not sure. I haven’t yet tried the optimized code where zeros and non-zeros are handled separately.

That’s still potentially quite large, so I would definitely try to prune it more. And I would check for issues with the model. Is the scale of all predictors sensible? Centering / scaling predictors can help. Have you ruled out that your predictors are collinear / almost collinear? If this is an issue, the QR reparametrization might also help: 1.2 The QR reparameterization | Stan User’s Guide