As with any other choice in the model, you would need to refer to your domain knowledge to determine which distributions of initial states make sense. It is definitely not obvious to me that the stationary distribution is generally preferable. E.g. I’ve only ever used HMMs to model infectious disease progression, and there the stationary distribution is that everybody is either dead or fully cured, which would make little sense as an initial state distribution.
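To make the disease example concrete, here is a minimal sketch in Python/NumPy. The four states and all transition probabilities are made up for illustration and are not taken from any real model. It only shows that when some states are absorbing, the long-run distribution puts essentially all its mass on them.

```python
import numpy as np

# Hypothetical 4-state disease progression chain (rows sum to 1).
# All numbers are invented purely for illustration.
P = np.array([
    [0.80, 0.10, 0.10, 0.00],  # mild
    [0.10, 0.70, 0.10, 0.10],  # severe
    [0.00, 0.00, 1.00, 0.00],  # recovered (absorbing)
    [0.00, 0.00, 0.00, 1.00],  # dead (absorbing)
])

# A plausible-looking initial distribution: everyone starts out ill.
init = np.array([0.7, 0.3, 0.0, 0.0])

# Distribution over states after many steps.
long_run = init @ np.linalg.matrix_power(P, 500)

transient_mass = long_run[:2].sum()  # mass left on mild + severe
absorbing_mass = long_run[2:].sum()  # mass on recovered + dead

print("long-run distribution:", np.round(long_run, 4))
print("mass on transient states:", transient_mass)
```

Note that with more than one absorbing state the stationary distribution is not even unique: how the mass splits between "recovered" and "dead" depends on where you start. Either way, it says nothing useful about where patients are at the start of observation.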
Sometimes people just use a discrete uniform distribution, which is not great, but as long as your inferences are not sensitive to the true state in the first couple of time points, the choice of initial distribution should not matter much.
You could also treat the initial distribution as a parameter to be estimated or even put some predictors on the initial state, but since the initial state tends to have little influence on the data (unless your HMM has very small transition probabilities), you are unlikely to learn much about it unless you observe a lot of individual series…
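A rough way to see how quickly the initial distribution stops mattering is to propagate two different initial distributions through the same transition matrix and track how far apart the resulting state distributions are. The sketch below does this for two hypothetical 2-state chains (all numbers and the helper name `tv_gap` are made up). It only looks at the marginal state distribution and ignores the observations, so treat it as a proxy for sensitivity rather than a statement about posterior inference.

```python
import numpy as np

def tv_gap(P, init_a, init_b, steps):
    """Total variation distance between the two state distributions
    at time points 0, 1, ..., steps - 1."""
    a = np.asarray(init_a, dtype=float)
    b = np.asarray(init_b, dtype=float)
    gaps = []
    for _ in range(steps):
        gaps.append(0.5 * np.abs(a - b).sum())
        a, b = a @ P, b @ P  # advance both distributions one step
    return np.array(gaps)

# Hypothetical chains: one mixes quickly, one has very small
# transition probabilities ("sticky" states).
fast = np.array([[0.70, 0.30],
                 [0.40, 0.60]])
sticky = np.array([[0.99, 0.01],
                   [0.02, 0.98]])

uniform = [0.5, 0.5]  # discrete uniform initial distribution
point = [1.0, 0.0]    # everybody starts in state 0

gap_fast = tv_gap(fast, uniform, point, 10)
gap_sticky = tv_gap(sticky, uniform, point, 10)

print("fast chain:  ", np.round(gap_fast, 4))
print("sticky chain:", np.round(gap_sticky, 4))
```

In the fast-mixing chain the gap is negligible after a handful of steps, while in the sticky chain most of the initial difference is still there after ten time points. That is the case where the choice of initial distribution deserves more care.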
Does that answer your question?