As far as I understand, finding the posterior mode via optimization usually doesn’t work for hierarchical models since the objective function is unbounded due to the contribution from the prior, e.g.:
    # parameters
    effects <- c(0.0, 0.0)
    effects_sd <- 0.0

    # prior lpdf
    effects_prior_lpdf <- sum(dnorm(effects, 0.0, effects_sd, log = TRUE))

    > print(effects_prior_lpdf)
    [1] Inf
so the objective function can be made infinitely large by setting all the hierarchical effects and their scale to 0. In this case the solution effectively ignores any contribution to the objective function from the likelihood, which doesn’t sound particularly useful.
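To make the divergence concrete, here is a minimal sketch that evaluates the same prior log density on a grid of shrinking scales while holding the effects at 0. The helper name `effects_prior_lpdf_at_zero` and the grid `scales` are my own illustrative choices, not anything from a library.

```r
# Hypothetical helper (illustration only): joint log prior density of the
# effects when every effect sits exactly at the prior mean of 0.
effects_prior_lpdf_at_zero <- function(effects_sd, n_effects = 2) {
  sum(dnorm(rep(0.0, n_effects), mean = 0.0, sd = effects_sd, log = TRUE))
}

# Illustrative grid of shrinking scales (an assumption): 1e-1 down to 1e-6.
scales <- 10^(-(1:6))
lpdf <- sapply(scales, effects_prior_lpdf_at_zero)

# The log density keeps growing as effects_sd approaches 0.
print(data.frame(effects_sd = scales, lpdf = lpdf))
```

With two effects at 0 the value is -2 * log(effects_sd) - log(2 * pi), so it increases without bound as effects_sd shrinks. As long as the log likelihood stays finite at effects = 0, it cannot offset that growth.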
What I am struggling to get my head round is, why does sampling work fine? Why doesn’t the sampler eventually propose some value sufficiently close to this unbounded mode and get stuck?
Any help would be awesome - thanks!