Hi,
I have a multilevel regression where I am trying to fit a set of data with a linear relationship between two sets of parameters. I'm continuing this from my thread at Divergences and Quantifying HPD intervals, as my question is now different.
I have a model where I'm fitting some low-level parameters to data from a set of experiments; call these parameters k, D, and H, each of which is an N-dimensional vector fit to a set of x, y data (each experiment has its own x, y observations).
At the hierarchical level, the values of H should depend on k according to the sampling statement
H \sim \mathrm{Normal}(\alpha+\beta k,\sigma),
where \alpha, \beta, and \sigma are hyperparameters; \alpha is the quantity I want to estimate with this model.
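For reference, the hierarchical piece of the model looks roughly like this (a stripped-down sketch in the centered parameterization; the experiment-level likelihoods for k, D, and H are omitted, and the names are just placeholders):

```stan
data {
  int<lower=1> N;              // number of experiments
}
parameters {
  vector[N] k;                 // low-level parameters (also fit to x, y data)
  vector[N] H;
  real alpha;                  // intercept: the quantity I want to estimate
  real beta;                   // slope relating H to k (no prior set here)
  real<lower=0> sigma;         // scatter of H about the regression line
}
model {
  // ... experiment-level likelihoods for k, D, H against x, y data ...
  H ~ normal(alpha + beta * k, sigma);  // the hierarchical sampling statement
}
```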
Running this model with just that sampling statement results in divergences, especially when N is small, because I haven't set a prior on the slope parameter \beta. Switching to the non-centered parameterization as detailed here seems to remove these divergences, but causes the model to hit the maximum tree depth.
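Concretely, the non-centered version I switched to looks something like this (again a sketch; H_raw is my name for the standardized innovations):

```stan
parameters {
  vector[N] k;
  vector[N] H_raw;             // standard-normal innovations for H
  real alpha;
  real beta;
  real<lower=0> sigma;
}
transformed parameters {
  // Non-centered: H is a deterministic transform of H_raw,
  // which implies H ~ normal(alpha + beta * k, sigma)
  vector[N] H = alpha + beta * k + sigma * H_raw;
}
model {
  H_raw ~ std_normal();
  // ... experiment-level likelihoods for k, D, H as before ...
}
```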
If I instead apply the QR decomposition to my k vector (treating it as a design matrix) as outlined here, then I no longer hit the maximum tree depth. But it's not clear that I'm sampling from the same distribution: because k is itself a parameter, I'm QR-decomposing a different matrix at every iteration rather than a fixed data matrix, so I'm not sure the \theta parameters I'm sampling impose any consistent prior on \beta. That said, my results haven't varied significantly from my original model, so it seems to be working. Is it OK to perform this decomposition on a parameter vector in this way?
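In case it helps, here is roughly what my QR version does (a sketch following the QR regression trick from the user's guide, except that the design matrix is built from the parameter k, so the factorization is redone at every iteration):

```stan
parameters {
  vector[N] k;
  vector[N] H;
  vector[2] theta;             // coefficients on the Q_ast scale
  real<lower=0> sigma;
}
transformed parameters {
  // Design matrix [1, k] is built from a *parameter*, so Q_ast and R_ast
  // change every iteration, unlike the fixed-data case in the guide.
  matrix[N, 2] X = append_col(rep_vector(1, N), k);
  matrix[N, 2] Q_ast = qr_thin_Q(X) * sqrt(N - 1);
  matrix[2, 2] R_ast = qr_thin_R(X) / sqrt(N - 1);
  vector[2] coef = mdivide_left(R_ast, theta);  // coef[1] = alpha, coef[2] = beta
}
model {
  H ~ normal(Q_ast * theta, sigma);  // same mean as alpha + beta * k
  // a prior on theta corresponds to a k-dependent (changing) prior on beta
}
```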