Hi,

I have some trouble understanding the information I get from the summary table. My parameters have \hat{R} values far from unity, while my transformed parameters appear to behave quite well.

Below is a more in-depth explanation of my approach, followed by the implemented .stan model and finally the summary table.

## Actual question

If anyone could tell me why only the transformed parameters behave well, and why approach 3 is so much slower than approaches 1 and 2, it would be greatly appreciated.

Before anyone points it out: I am aware that the tree depth is saturated. With methods 1 and 2, however, the sampler never gets above leapfrog step three.

## In depth

I am trying to solve the following inverse problem: \textbf{d}=\textbf{Gm}

Here \textbf{d} (size n) is the observations, \textbf{G} (size n \times m) the data kernel, and \textbf{m} (size m) the model.
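
The noise is taken to have unit variance (this is what the res ~ normal(0, 1) statement in the model below encodes), so the full forward model is

\textbf{d} = \textbf{Gm} + \textbf{e}, \quad e_i \sim N(0, 1)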

The parameters in \textbf{m} are dependent, and from a set of simulations \textbf{M} (each column is a realization of \textbf{m}) I have an approximate covariance matrix \Sigma_M.

I have sampled in the following ways:

- Each element in \textbf{m} being independent and normally distributed
- \textbf{m} following multivariate normal distribution using \Sigma_M
- Lastly, projecting \textbf{m} into eigen-space using the eigen-decomposition of \Sigma_M (sketched below).
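
In numpy terms, the construction of \textbf{T} for method 3 looks roughly like the sketch below (the data here are placeholders; folding the \sqrt{\lambda} scaling into \textbf{T}, so that a standard-normal \textbf{q} maps back with covariance \approx \Sigma_M, is one way to do it):

```
import numpy as np

rng = np.random.default_rng(0)
mMax, nSim, p = 8, 500, 7                  # mMax and p as in the model below; nSim is a placeholder
M = rng.normal(size=(mMax, nSim))          # stand-in for the simulated realizations

mu_M = M.mean(axis=1)                      # mean of each row of M
sigma_M = M.std(axis=1)                    # standard deviation of each row of M
M_std = (M - mu_M[:, None]) / sigma_M[:, None]   # standardize each row

Sigma_M = np.cov(M_std)                    # approximate covariance matrix

eigval, eigvec = np.linalg.eigh(Sigma_M)   # eigenvalues in ascending order
order = np.argsort(eigval)[::-1]           # largest eigenvalues first

# Keep the p leading eigenvectors, scaled by sqrt(eigenvalue), so that
# q ~ N(0, I) in eigen-space maps back with covariance close to Sigma_M
T = eigvec[:, order[:p]] * np.sqrt(eigval[order[:p]])
```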

With method 3 I ran into a problem.

I sample \textbf{q} in eigen-space and project it back to determine the likelihood.

\textbf{m} = (\textbf{Tq}) .* \sigma_M + \mu_M

Here \textbf{T} is the transformation matrix to and from eigen-space, .* denotes element-wise multiplication, and \sigma_M and \mu_M are the standard deviation and mean of each row in \textbf{M}, respectively.

The scaling with \sigma_M and \mu_M is done because \textbf{M} is standardized before determining \Sigma_M.
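
Continuing the numpy sketch above, the back-projection of a single eigen-space draw is:

```
q = rng.normal(size=p)            # a draw in eigen-space
m = (T @ q) * sigma_M + mu_M      # m = Tq .* sigma_M + mu_M
```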

I found that this approach worked really badly, and realized that much of the information was "trapped" in \mu_M, which was kept constant.

I then tried giving \mu_M a distribution, which resulted in \textbf{m} having \hat{R} \approx 1, but not \mu_M and \textbf{q}.

It also required far too much computational time compared to method 2.

## Model

```
data {
  int<lower=0> L;          // Length of data array
  int<lower=0> n;          // SH-degree
  int<lower=0> mMax;       // Number of model parameters
  int<lower=0> p;          // Number of retained (truncated) model parameters
  vector[L] d;             // Data array
  matrix[L, mMax] G;       // Design matrix (data kernel)
  vector[mMax] igrfMean;   // Mean used to un-standardize mu
  vector[mMax] igrfStd;    // Standard deviation used to un-standardize mu
  vector[mMax] orgStd;     // Standard deviation scaling the back-projection T*q
  matrix[mMax, p] T;       // Transfer matrix to/from eigen-space
}
parameters {
  vector[p] q;             // Model parameters in eigen-space
  vector[mMax] mu;         // Standardized mean parameter
}
transformed parameters {
  vector[mMax] m;          // Back-projected model parameters
  m = ((T * q) .* orgStd) + (mu .* igrfStd + igrfMean);
}
model {
  vector[L] res;           // Residuals
  // Prior on the eigen-space coordinates
  q ~ normal(0, 1);
  // Prior on the standardized mean
  mu ~ normal(0, 1);
  // Likelihood: unit-variance Gaussian noise on the residuals
  res = d - G * m;
  res ~ normal(0, 1);
}
```
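
The summary below comes from a PyStan 2 run along these lines (the entries of the data dictionary stand for my preprocessed arrays and are not shown here):

```
import pystan

# Keys match the data block of the model above
stan_data = dict(L=L, n=n, mMax=8, p=7, d=d, G=G,
                 igrfMean=igrfMean, igrfStd=igrfStd, orgStd=orgStd, T=T)

sm = pystan.StanModel(model_code=model_code)   # model_code holds the Stan program above
fit = sm.sampling(data=stan_data, iter=2000, chains=4)
print(fit)
```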

## Output

```
WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
WARNING:pystan:Rhat above 1.1 or below 0.9 indicates that the chains very likely have not mixed
WARNING:pystan:3859 of 4000 iterations saturated the maximum tree depth of 10 (96.5 %)
WARNING:pystan:Run again with max_treedepth larger than 10 to avoid saturation
WARNING:pystan:Chain 4: E-BFMI = 0.0599
WARNING:pystan:E-BFMI below 0.2 indicates you may need to reparameterize your model
Inference for Stan model: None_0a54820ed44399940f2d61834cc82db1.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
         mean se_mean      sd    2.5%     25%     50%     75%   97.5%  n_eff    Rhat
q[1]     1.71    0.04    0.06    1.67    1.67    1.68    1.75    1.82      2   69.36
q[2]     4.59    0.03    0.04    4.51    4.56    4.61    4.62    4.62      2   73.53
q[3]    -1.34    0.02    0.03   -1.39   -1.36   -1.33   -1.33   -1.33      2   73.05
q[4]    -0.26    0.03    0.04   -0.29   -0.29   -0.28   -0.24   -0.19      2   79.27
q[5]    -7.73    0.01    0.02   -7.74   -7.74   -7.74   -7.72    -7.7      2   68.54
q[6]     1.52    0.04    0.06    1.48    1.48    1.49    1.56    1.63      2    66.2
q[7]    -6.04    0.03    0.05   -6.07   -6.07   -6.06    -6.0   -5.95      2   73.13
mu[1]   -0.18     0.9    1.28    -1.8   -1.25   -0.34    0.92    1.75      2  1249.7
mu[2]     2.5    3.34    4.73   -0.68   -0.61    0.01     6.1   10.67      2  1245.3
mu[3]    4.35    4.37    6.19   -1.03     0.2    1.77    8.96    14.9      2  502.12
mu[4]  122.34  115.22   163.1    8.89   17.22   38.15  235.88  418.68      2   68.87
mu[5]    1.86    3.16    4.47   -1.57   -1.08   -0.28    5.22    9.55      2  2346.3
mu[6]    6.41    7.78   11.01   -1.07   -0.42    0.61   14.39   25.45      2  1439.7
mu[7]   -0.39     0.9    1.27   -1.63   -1.61   -0.63    0.88    1.34      2  1396.7
mu[8]   -2884   62.72   88.78   -2945   -2941   -2929   -2822   -2723      2   74.26
m[1]   -5.2e4  3.4e-5  2.1e-3  -5.2e4  -5.2e4  -5.2e4  -5.2e4  -5.2e4   3869     1.0
m[2]   3138.7  4.1e-5  2.6e-3  3138.7  3138.7  3138.7  3138.7  3138.7   3934     1.0
m[3]   680.29  4.1e-5  2.5e-3  680.29  680.29  680.29   680.3   680.3   3800     1.0
m[4]   -172.9  2.6e-5  1.6e-3  -172.9  -172.9  -172.9  -172.9  -172.9   3816     1.0
m[5]    -3074  3.1e-5  1.9e-3   -3074   -3074   -3074   -3074   -3074   3695     1.0
m[6]   5004.7  3.0e-5  1.9e-3  5004.7  5004.7  5004.7  5004.7  5004.7   3931     1.0
m[7]   -715.2  3.5e-5  2.2e-3  -715.2  -715.2  -715.2  -715.2  -715.2   4094     1.0
m[8]   1198.8  3.4e-5  2.2e-3  1198.8  1198.8  1198.8  1198.8  1198.8   4253     1.0
lp__   -5.2e4  201.52   285.2  -5.2e4  -5.2e4  -5.2e4  -5.1e4  -5.1e4      2   72.49
```