Thank you both for your suggestions!
The issue was that label switching did not affect posterior inference but did affect the convergence diagnostics. For instance, R-hat values look good for a single chain. This was previously discussed in this post. In that post, it was suggested that post-hoc relabelling would not be appropriate when the uncertainties of the components overlap. I’d welcome suggestions for any other methods to assess convergence across all 4 chains!
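To illustrate what I mean, here is a minimal sketch in Python with simulated draws (not output from my model). The `split_rhat` helper is my own simplified version of split R-hat without rank normalisation, and the component means of -2 and 2 are made-up values. Two of the four chains have their labels swapped. As far as I understand, the last line shows one possible workaround, which is to monitor a quantity that does not depend on the labels (here the sum of the two means); I am not sure it is the best choice in general.

```python
import numpy as np

def split_rhat(draws):
    """Simplified split R-hat for draws of shape (n_chains, n_draws)."""
    n_chains, n_draws = draws.shape
    half = n_draws // 2
    # Split every chain into two halves so within-chain drift is detected too.
    halves = np.concatenate([draws[:, :half], draws[:, half:2 * half]], axis=0)
    n = halves.shape[1]
    within = halves.var(axis=1, ddof=1).mean()      # W: mean within-chain variance
    between = n * halves.mean(axis=1).var(ddof=1)   # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n
    return float(np.sqrt(var_plus / within))

rng = np.random.default_rng(0)
n_chains, n_draws = 4, 1000

# Simulated draws for two component means (hypothetical values).
# Shape: (n_chains, n_draws, n_components).
mu = np.stack(
    [rng.normal(-2.0, 0.1, (n_chains, n_draws)),
     rng.normal(2.0, 0.1, (n_chains, n_draws))],
    axis=-1,
)

# Label switching: the last two chains explore the mode with swapped labels.
mu[2:] = mu[2:, :, ::-1].copy()

rhat_single = split_rhat(mu[:1, :, 0])        # first chain only
rhat_all = split_rhat(mu[:, :, 0])            # all 4 chains, label-dependent
rhat_invariant = split_rhat(mu.sum(axis=-1))  # all 4 chains, label-invariant

print(f"single chain:         {rhat_single:.3f}")
print(f"all chains:           {rhat_all:.3f}")
print(f"all chains, mu1+mu2:  {rhat_invariant:.3f}")
```

In this toy setup the single-chain value and the label-invariant value stay close to 1, while R-hat for the first component across all 4 chains is very large, even though every chain samples the same posterior up to a relabelling.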
I removed the ordering constraint on \lambda as suggested!
I had added the hard constraints only to get convergence (even with one chain). I will remove them and use stronger priors instead.