First, note that Stan does not require conjugate priors in order to sample.
The choice of priors for variance parameters has a long history:
http://www.stat.columbia.edu/~gelman/research/published/taumain.pdf
Half-student-t or half-normal are discussed here:
exponential(1.0) and half-student_t(4, 0, 1) are similar, but exponential(1.0)
is cheaper to compute.
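
To make the comparison concrete, here is a minimal sketch in plain Python (standard library only) that evaluates both densities at a few points. The function names are illustrative and are not part of Stan. It assumes the half-student_t is a student_t(4, 0, 1) restricted to positive values, so its density is twice the ordinary t density.

```python
import math


def exponential_pdf(x, rate=1.0):
    """Density of exponential(rate) at x; zero for negative x."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def half_student_t_pdf(x, nu=4.0, sigma=1.0):
    """Density of student_t(nu, 0, sigma) restricted to x >= 0.

    Assumption: the half distribution is the t density folded at zero,
    i.e. twice the usual t density on the positive half-line.
    """
    if x < 0:
        return 0.0
    norm = math.gamma((nu + 1) / 2) / (
        math.sqrt(nu * math.pi) * math.gamma(nu / 2) * sigma
    )
    return 2.0 * norm * (1.0 + (x / sigma) ** 2 / nu) ** (-(nu + 1) / 2)


def compare(points=(0.0, 0.5, 1.0, 2.0, 4.0, 8.0)):
    """Return (x, exponential density, half-t density) for each point."""
    return [(x, exponential_pdf(x), half_student_t_pdf(x)) for x in points]


if __name__ == "__main__":
    for x, e, t in compare():
        print(f"x={x:4.1f}  exponential={e:.4f}  half-t={t:.4f}")
```

Both densities put most of their mass near zero, but the half-student_t has the heavier tail, so it allows large scale values more readily than the exponential does.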