Do you think it is fair/legitimate to initialize the chain at the least-squares (or otherwise optimization-based) parameter estimates? I noticed that in some cases doing so speeds up the computation considerably. Along the same lines: since we use domain knowledge to set weakly informative priors for each parameter, is it also fair to initialize the chain at, for instance, the mean of the corresponding prior? If so, and it speeds up the computation, why don't we always initialize with this strategy?
I ask because I have read in the literature that people use domain knowledge to initialize Gibbs-sampling chains, starting from either expected parameter values or least-squares solutions, and argue that this is fair because the warm-up samples are discarded anyway. However, I wonder whether this could increase the chance of reporting false positives?
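To make the question concrete, here is a minimal sketch of the setup I have in mind (all names and numbers are illustrative, not from any particular paper): a simple random-walk Metropolis sampler for a linear-regression posterior, initialized at the ordinary least-squares estimate rather than at a random point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated linear-regression data (illustrative example).
n = 100
true_beta = np.array([2.0, -1.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma = 0.5  # noise scale, assumed known for simplicity
y = X @ true_beta + rng.normal(scale=sigma, size=n)

def log_posterior(beta):
    # Gaussian likelihood plus a weakly informative N(0, 10^2) prior
    # on each coefficient (stand-in for "domain knowledge" priors).
    resid = y - X @ beta
    log_lik = -0.5 * np.sum(resid**2) / sigma**2
    log_prior = -0.5 * np.sum(beta**2) / 10.0**2
    return log_lik + log_prior

def metropolis(init, n_iter=5000, step=0.05):
    # Plain random-walk Metropolis; the only thing that changes
    # between strategies is `init`.
    beta = np.asarray(init, dtype=float)
    lp = log_posterior(beta)
    samples = np.empty((n_iter, beta.size))
    for i in range(n_iter):
        prop = beta + step * rng.normal(size=beta.size)
        lp_prop = log_posterior(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            beta, lp = prop, lp_prop
        samples[i] = beta
    return samples

# Least-squares point estimate used as the starting value.
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
samples = metropolis(beta_ls)

# Discard the first part of the chain as warm-up, as the
# literature I mention suggests, and summarize the rest.
post_mean = samples[2000:].mean(axis=0)
```

Starting at `beta_ls` the chain begins essentially inside the high-probability region, so very little warm-up is wasted; starting far away, the same sampler would spend many iterations just reaching that region. My question is whether this shortcut can bias the reported inferences.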