That's not a bad idea. I had thought about defining a probability for a vector of RBF center locations in terms of an attraction force toward individual Census regions and a repulsion force between the RBF centers, and then letting Stan move the centers around for me, but I think that is getting fancier than my model needs.
One question I do have, though: I get fairly poor mixing for the RBF coefficients, whereas the individual census region multipliers, which are basically Mult[i] = RBF(x[i],y[i]) + error_i with t-distributed errors, mix just fine. Since Mult[i] is the quantity that actually affects the predictions, this suggests to me that the smooth function I'm estimating is simply not that well identified (much of the variation is at a fine spatial scale within each metropolitan area, for example). However, I don't actually need that smooth function to be well identified; it is, after all, basically a regularization device for the Mult[i] parameters.
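As a rough illustration of the structure described above, here is a minimal NumPy sketch of Mult[i] = RBF(x[i],y[i]) + error_i. The Gaussian kernel, the number of centers, the length scale, the error scale, and the degrees of freedom are all assumptions made for the example; none of them come from the actual model.

```python
import numpy as np

def rbf_surface(xy, centers, coefs, length_scale):
    """Smooth surface: weighted sum of Gaussian bumps (kernel choice is an assumption)."""
    # Squared distance from every location to every center: shape (n_regions, n_centers)
    d2 = ((xy[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2) @ coefs

rng = np.random.default_rng(1)
n_regions, n_centers = 200, 10                      # hypothetical sizes
xy = rng.uniform(0, 1, size=(n_regions, 2))         # region centroids (x[i], y[i])
centers = rng.uniform(0, 1, size=(n_centers, 2))    # fixed RBF centers
coefs = rng.normal(0, 1, size=n_centers)            # RBF coefficients

smooth = rbf_surface(xy, centers, coefs, length_scale=0.2)
error = 0.1 * rng.standard_t(df=4, size=n_regions)  # t-distributed region-level error
mult = smooth + error                               # Mult[i] = RBF(x[i], y[i]) + error_i
```

Because overlapping bumps can trade off against one another, quite different coefficient vectors can produce nearly the same surface, which is one plausible reason the coefficients would mix worse than Mult[i] itself.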
So, how much should I care about things like Rhat or effective sample size for the RBF coefficients? Provided that I have Rhat ~ 1 for Mult[i] and good mixing in the traceplots, it seems that I should be good to go in using these draws for prediction or explanation, and can ignore the fact that the nuisance parameters of the regularization function struggle to converge.
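For checking Rhat separately on Mult[i] and on the RBF coefficients from saved draws, a minimal sketch of the classic split-Rhat follows. It assumes the draws for one scalar quantity are arranged as an array of shape (chains, iterations); the function name `split_rhat` is made up for this example.

```python
import numpy as np

def split_rhat(draws):
    """Classic split-Rhat for one scalar quantity; draws has shape (n_chains, n_iter)."""
    draws = np.asarray(draws, dtype=float)
    half = draws.shape[1] // 2
    # Split every chain in half so that drift within a chain
    # shows up as disagreement between half-chains.
    split = np.concatenate([draws[:, :half], draws[:, half:2 * half]], axis=0)
    n = split.shape[1]
    within = split.var(axis=1, ddof=1).mean()        # W: mean within-chain variance
    between = n * split.mean(axis=1).var(ddof=1)     # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n    # pooled variance estimate
    return np.sqrt(var_plus / within)

rng = np.random.default_rng(0)
good = rng.normal(size=(4, 1000))                    # four chains that agree
bad = good + np.arange(4)[:, None]                   # four chains stuck at different levels
rhat_good, rhat_bad = split_rhat(good), split_rhat(bad)
```

Applying this column by column to the draws of Mult[i] and of the coefficients makes the comparison explicit. The diagnostics reported by Stan interfaces may use a refined variant (for example, rank normalization), so the numbers need not match exactly.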