So we have a single-level hierarchical model with Gaussian priors: one global prior and p local ones. To be explicit:
p(\theta | x) \propto p(x | \theta) p(\theta)
In this case, for simplicity, we're assuming the variances are known (not something we'd really want in practice):
\beta_0 \sim \text{normal}(0, \sigma^2_0)
\beta_j \mid \beta_0, \sigma_j^2 \sim \text{normal}(\beta_0, \sigma_j^2)
y \mid \beta_j, X \sim \text{normal}(X \beta_j, \sigma^2)
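(In case it helps ground the notation, here's a minimal simulation sketch of that generative story, reading \beta_j collectively as the vector of p local coefficients; the sizes and scales below are made up.)

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 200, 5                                   # hypothetical sizes
sigma0, sigma_j, sigma = 2.0, 1.0, 0.5          # assumed-known scales

beta0 = rng.normal(0.0, sigma0)                 # global:  beta_0 ~ N(0, sigma_0^2)
beta = rng.normal(beta0, sigma_j, size=p)       # local:   beta_j ~ N(beta_0, sigma_j^2)
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(0.0, sigma, size=n)   # data:    y ~ N(X beta, sigma^2 I)
```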
p(x|\theta)p(\theta) becomes:
p(y \mid \beta_j, X)\, p(\beta_j \mid \beta_0)\, p(\beta_0)
Taking the negative log and dropping normalization constants to get our objective function, we have:
\frac{\sum(y - X\beta_j)^2}{2\sigma^2} + \frac{(\beta_j - \beta_0)^2}{2\sigma_j^2} + \frac{\beta_0^2}{2\sigma_0^2}
We'd want to minimize this negative log-posterior (equivalently, maximize the log-posterior).
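(Written out as code, the thing to minimize would, I think, look like the sketch below; sigma_j can be a scalar or a length-p vector of per-coefficient prior scales.)

```python
import numpy as np

def neg_log_posterior(beta0, beta, X, y, sigma, sigma_j, sigma0):
    """Negative log-posterior up to additive constants; all variances assumed known."""
    resid = y - X @ beta
    return (resid @ resid / (2 * sigma**2)                  # likelihood term
            + np.sum((beta - beta0)**2 / (2 * sigma_j**2))  # local priors centred at beta_0
            + beta0**2 / (2 * sigma0**2))                   # global prior on beta_0
```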
And then I want to take a quick look at convexity with respect to each parameter. This is where I get fuzzy about how to set up the inequalities.
A few questions:
- I've done Gibbs sampler derivations a few times, and in that case it's easy to see how the conditional posterior mean ends up as a precision-weighted average of the global and local information. Here, it was appealing to just take the log, because the result looks so much like a standard regularized regression problem. What am I doing wrong?
- I'm given this inequality to check whether something is convex, which makes sense when we have some arbitrary function of a few parameters, but it's not so clear what to do when I'm looking at a Bayesian model with lots of parameters. Check that f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y) for \lambda \in [0, 1] (using \lambda for the mixing weight so it doesn't clash with the model parameters \theta).
Ok, and, for simplicity ignoring the hierarchical prior in the objective function, the left-hand side would turn out to be… ok, we need x, y \in \operatorname{dom} f, so I'm guessing I hold everything else constant and only look at \beta… how do I unpack this inequality exactly? It's not so clear once the function gets more complex.
I’ll think a bit more about it.
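One mechanical way I can see to unpack it: stack everything into a single vector \theta = (\beta_0, \beta_1, \dots, \beta_p), so \operatorname{dom} f = \mathbb{R}^{p+1}, and check the inequality for pairs of points u, v and weights \lambda. A brute-force numerical version of that check (made-up data, small tolerance for floating point):

```python
import numpy as np

rng = np.random.default_rng(1)

def f(theta, X, y, sigma=0.5, sigma_j=1.0, sigma0=2.0):
    """Objective as a function of the stacked parameters theta = (beta_0, beta_1, ..., beta_p)."""
    beta0, beta = theta[0], theta[1:]
    resid = y - X @ beta
    return (resid @ resid / (2 * sigma**2)
            + np.sum((beta - beta0)**2) / (2 * sigma_j**2)
            + beta0**2 / (2 * sigma0**2))

n, p = 50, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# Convexity would mean f(lam*u + (1-lam)*v) <= lam*f(u) + (1-lam)*f(v) for all u, v, lam in [0, 1].
for _ in range(1000):
    u, v = rng.normal(size=p + 1), rng.normal(size=p + 1)
    lam = rng.uniform()
    lhs = f(lam * u + (1 - lam) * v, X, y)
    rhs = lam * f(u, X, y) + (1 - lam) * f(v, X, y)
    assert lhs <= rhs + 1e-9
```

Of course this can only fail to falsify convexity; it doesn't prove it, which is what the line-substitution idea below is about.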
Any recasting of this problem to make it clearer would be much appreciated. For example, one exercise just substituted an arbitrary line into a quadratic, and it was clear from "generalized" high-school algebra/calculus intuition that the result was convex (positive leading coefficient on the even-power term). That was very easy to see with some basic matrix algebra.
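Here is my best attempt at applying that trick to this objective (a sketch, reading the local-prior part as a sum over j and stacking \theta = (\beta_0, \beta_1, \dots, \beta_p)). Every term is a squared affine function of \theta divided by a positive constant:

f(\theta) = \frac{(y - X\beta)^\top (y - X\beta)}{2\sigma^2} + \sum_j \frac{(\beta_j - \beta_0)^2}{2\sigma_j^2} + \frac{\beta_0^2}{2\sigma_0^2} = \sum_k \frac{(a_k^\top \theta - c_k)^2}{2 s_k^2}

for suitable vectors a_k and constants c_k, s_k (one k per observation and per prior term). Substituting an arbitrary line \theta(t) = u + t v gives

g(t) = f(u + t v) = \sum_k \frac{\left(a_k^\top u - c_k + t\, a_k^\top v\right)^2}{2 s_k^2}

which is a quadratic in t with leading coefficient \sum_k (a_k^\top v)^2 / (2 s_k^2) \geq 0, so the restriction to every line is convex; if I have it right, that is exactly the defining inequality, and the MAP objective would be jointly convex in (\beta_0, \beta).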
What am I missing?
Edit:
The first term of the objective should be, excluding constants, (y - X\beta)^\top (y - X\beta).