Because Stan samples from the posterior directly, it has no separate information about the prior density or about how it might generate samples from the prior distribution. That means nothing can be extracted from the posterior fit to inform a pure prior fit, and hence there is no gain from trying to sample both at the same time.
Recognizing that you have to fit two models anyway, you have multiple options. You can build a single model with an if statement that turns off the likelihood terms, and get posterior and prior samples from two fits of that one model. Or you can build two models with a little bit of redundancy, one for the prior and one for the full joint model over data and parameters. Your solution is actually equivalent to this latter approach, only with the two models crammed back into the same (but now more expensive) MCMC fit.
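As a rough sketch of the first option, a single Stan program might look something like the following. The normal model, the specific priors, and the `use_likelihood` flag are all assumptions made up for illustration; they are not taken from your model.

```stan
data {
  int<lower=0> N;
  vector[N] y;
  // Hypothetical switch: 0 = prior-only fit, 1 = full posterior fit
  int<lower=0, upper=1> use_likelihood;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // Prior model (illustrative choices only)
  mu ~ normal(0, 1);
  sigma ~ normal(0, 1);  // half-normal because of the lower bound

  // Likelihood terms contribute to the target only when the flag is set
  if (use_likelihood) {
    y ~ normal(mu, sigma);
  }
}
```

Fitting once with `use_likelihood = 0` and once with `use_likelihood = 1` would then give prior and posterior samples respectively, at the cost of running MCMC twice.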
I prefer specifying two models because in most cases the prior can be sampled from directly using PRNGs. Instead of having to run an MCMC fit, the prior samples can be generated in the transformed data block or the generated quantities block, which runs in a fraction of the time that the posterior fit will take. This exact sampling program is also a critical part of simulating from the prior predictive distribution, which is a vital part of a robust modeling workflow. See for example the sample_joint_ensemble models in https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html.
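For a concrete picture, a separate prior program of this kind might look roughly like the sketch below. The priors and the normal observational model are placeholders chosen for illustration, `y_sim` is a hypothetical name, and the `array[N] real` syntax assumes a reasonably recent Stan version. This is not a copy of the case study programs.

```stan
data {
  int<lower=0> N;  // number of observations to simulate
}
generated quantities {
  // Exact draws from the (illustrative) prior using PRNGs
  real mu = normal_rng(0, 1);
  real<lower=0> sigma = abs(normal_rng(0, 1));  // half-normal draw

  // Prior predictive data simulated from each prior draw
  array[N] real y_sim;
  for (n in 1:N) {
    y_sim[n] = normal_rng(mu, sigma);
  }
}
```

Because there is no parameters block, this would be run with the fixed-parameter sampler, so every iteration is an independent exact draw from the prior and prior predictive distributions rather than a correlated MCMC sample.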