Sorry for being so unclear that you completely misunderstood what I tried to say. I know that we completely agree on this and that it’s now just a problem of wording. The joint parameter space (without any marginalization) has the shape of a twisted, flattened funnel. Most funnels I have seen in the real world have a circular cross-section, but this one is flattened because of the correlations between the latent values, and twisted because those correlations change. And instead of being a 3D object it lives in higher dimensions, so thinking about it twists your mind.
I completely agree: p(\theta \mid y) is not assumed to be Gaussian, but it is closer to Gaussian than the joint distribution (that’s what the marginalization does). I didn’t say that a Gaussian would be a good approximation for that part, too, but since it’s closer to Gaussian, HMC/NUTS or CCD work much more easily.
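To make the geometry a bit more concrete, here is a rough toy sketch in Python. It is not the model we have been discussing: it is a plain two-dimensional funnel that I am assuming purely for illustration, where `v` stands in for a log-scale hyperparameter and `x` for a single latent value. It has a circular-type cross-section, so it shows neither the flattening nor the twisting, only why the joint is far from Gaussian while the marginal of the hyperparameter is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy funnel (an illustrative assumption, not the model from this thread):
# v plays the role of a log-scale hyperparameter, x of one latent value.
v = rng.normal(0.0, 3.0, size=n)
x = rng.normal(0.0, np.exp(v / 2.0))  # the scale of x depends on v


def excess_kurtosis(a):
    """Sample excess kurtosis; close to 0 for a Gaussian sample."""
    z = (a - a.mean()) / a.std()
    return np.mean(z**4) - 3.0


# Marginal of the hyperparameter: Gaussian by construction in this toy.
kurt_v = excess_kurtosis(v)

# The latent value, viewed without conditioning on v, is very heavy-tailed.
kurt_x = excess_kurtosis(x)

# The width of x in the neck versus the mouth of the funnel.
sd_narrow = x[v < -3].std()
sd_wide = x[v > 3].std()

print(f"excess kurtosis of v: {kurt_v:.3f}")
print(f"excess kurtosis of x: {kurt_x:.1f}")
print(f"sd of x in the neck: {sd_narrow:.3f}, in the mouth: {sd_wide:.3f}")
```

In this toy the marginal of `v` is exactly Gaussian, which is a much stronger statement than the "closer to Gaussian" I am claiming for p(\theta \mid y). The point is only that a sampler working on the joint has to cope with wildly different scales in `x`, while one working on the marginal does not.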
And since we agree on this, if you still find the above confusing, you can ignore it, or just describe in your own words the shape of the joint distribution and why it is more difficult than in the 8 schools example.