Those kappas seem really large. Is that right? Check a pairplot of kappa and mu. I’m curious if those are correlated.
I’d parameterize this problem differently. Maybe put normal distributions on the hierarchical priors (on the log-odds scale) and then use a link function such as the logit to feed a binomial likelihood.
Check out the example here: Hierarchical Partial Pooling for Repeated Binary Trials
If you see divergences, you’ll want to switch to a non-centered parameterization. That link shows how.
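To make the suggestion concrete, here is a rough sketch of what the non-centered, logit-link version looks like as a log density, in plain Python so it doesn't depend on any particular PPL. The function names, the Normal(-1, 1) prior on mu, and the half-normal(1) prior on sigma are placeholders I picked for illustration, not something taken from your model, so swap in whatever fits your data.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def normal_logpdf(x, loc, scale):
    """Log density of Normal(loc, scale) at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)

def binomial_logit_logpmf(y, n, alpha):
    """Log pmf of Binomial(n, inv_logit(alpha)) at y, using the logit link.

    log p = -softplus(-alpha) and log(1 - p) = -softplus(alpha).
    """
    log_comb = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_comb - y * softplus(-alpha) - (n - y) * softplus(alpha)

def log_posterior(mu, sigma, z, y, n):
    """Unnormalized log posterior of a non-centered hierarchical logit model.

    mu, sigma : population mean and scale on the log-odds scale
    z         : standardized group offsets, one per group
    y, n      : successes and trials, one per group
    The priors below are illustrative assumptions.
    """
    if sigma <= 0.0:
        return -math.inf
    lp = normal_logpdf(mu, -1.0, 1.0)                     # assumed prior on mu
    lp += math.log(2.0) + normal_logpdf(sigma, 0.0, 1.0)  # assumed half-normal prior
    for z_i, y_i, n_i in zip(z, y, n):
        lp += normal_logpdf(z_i, 0.0, 1.0)   # non-centered: z_i ~ Normal(0, 1)
        alpha_i = mu + sigma * z_i           # group log-odds rebuilt from z_i
        lp += binomial_logit_logpmf(y_i, n_i, alpha_i)
    return lp
```

The point of the non-centered form is that the sampler works with `z`, whose prior does not depend on `sigma`, and the group-level log-odds are rebuilt deterministically as `mu + sigma * z`. That usually removes the funnel geometry that causes divergences when the groups have little data.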
The latest recommendation for Rhat is to keep it below 1.01 (see [1903.08008] Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC; this Rhat is more conservative than the one currently implemented in Stan).
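If you want to see what that diagnostic does under the hood, here is a minimal standard-library sketch of rank-normalized split Rhat as I understand it from the paper: split each chain in half, replace the pooled draws by normal scores of their ranks, compute the usual between/within ratio, then repeat on the folded draws and take the larger value. The function names are my own, and I use the Blom offset `(r - 3/8) / (S + 1/4)` for the normal scores, which is an assumption about the exact constants. In practice you'd call your library's built-in diagnostic rather than this.

```python
from statistics import NormalDist, fmean, median, variance

def split_chains(chains):
    """Split each chain in half, dropping the middle draw if the length is odd."""
    out = []
    for c in chains:
        half = len(c) // 2
        out.append(list(c[:half]))
        out.append(list(c[len(c) - half:]))
    return out

def rank_normalize(chains):
    """Replace pooled draws by normal scores of their average ranks."""
    flat = sorted((x, i, j) for i, c in enumerate(chains) for j, x in enumerate(c))
    total = len(flat)
    z = [[0.0] * len(c) for c in chains]
    std_normal = NormalDist()
    k = 0
    while k < total:
        m = k
        while m + 1 < total and flat[m + 1][0] == flat[k][0]:
            m += 1                                # extend over tied values
        avg_rank = (k + m) / 2 + 1                # 1-based average rank
        score = std_normal.inv_cdf((avg_rank - 0.375) / (total + 0.25))
        for _, i, j in flat[k:m + 1]:
            z[i][j] = score
        k = m + 1
    return z

def basic_rhat(chains):
    """Classic Rhat from equal-length chains: sqrt(var_plus / W).

    Raises ZeroDivisionError if every chain is constant (W == 0).
    """
    n = len(chains[0])
    w = fmean([variance(c) for c in chains])          # within-chain variance
    b = n * variance([fmean(c) for c in chains])      # between-chain variance
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5

def rank_normalized_rhat(chains):
    """Max of bulk and folded rank-normalized split Rhat."""
    split = split_chains(chains)
    bulk = basic_rhat(rank_normalize(split))
    med = median([x for c in split for x in c])
    folded = [[abs(x - med) for x in c] for c in split]
    tail = basic_rhat(rank_normalize(folded))
    return max(bulk, tail)
```

Pass it a list of equal-length chains for one scalar parameter, e.g. `rank_normalized_rhat([chain_1, chain_2, chain_3, chain_4])` (hypothetical names), and compare the result against 1.01. The folded step is what lets it flag chains that agree on location but disagree on scale, which the older Rhat can miss.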