The initial values do work after all.
I used @avehtari’s trick to run the model with fixed parameters and the initial values were recovered in all test cases.
In my model, the wild departures from the initial estimates were due to the very high curvature near k = 0, and the fact that k is strongly correlated to p_0 and q_0. At least, that’s my conjecture.
I still do not understand why sometimes the first iteration (after the first transition) is exactly equal to the initial estimates. I observe this behavior in cases where there are no divergent transitions.