# Transformed parameters converge


Hi,

I have some trouble understanding the information I get from the summary table. My parameters have \hat{R} values far from unity, while my transformed parameters appear to behave quite well.
Below I have a more in depth explanation of my approach, followed by the implemented .stan model and finally the summary table.

## Actual question

If anyone could tell me why only the transformed parameters behave well, and why approach 3 is so much slower than approaches 1 and 2, it would be greatly appreciated.
Before anyone points it out: I am aware that the tree depth is saturated, but with method 1 or 2 the sampler never goes above three leapfrog steps.

## In depth

I am trying to solve the following inverse problem: \textbf{d} = \textbf{Gm}.
Here \textbf{d} (size n) is the observations, \textbf{G} (size n \times m) the data kernel, and \textbf{m} (size m) the model.
The parameters in \textbf{m} are dependent, and from some simulations \textbf{M} (each column is a realization of \textbf{m}) I have an approximate covariance matrix \Sigma_M.

I have sampled the following ways:

1. Each element of \textbf{m} independent and normally distributed
2. \textbf{m} following a multivariate normal distribution with covariance \Sigma_M
3. \textbf{m} projected into eigenspace, using the eigen-decomposition of \Sigma_M

With method 3 I ran into a problem.
I sample \textbf{q} in eigenspace and project it back to evaluate the likelihood:
\textbf{m} = \textbf{Tq} .* \sigma_M + \mu_M
Here \textbf{T} is the transformation matrix to and from eigenspace, .* denotes elementwise multiplication, and \sigma_M and \mu_M are the standard deviation and mean of each row of \textbf{M}, respectively.
The scaling with \sigma_M and \mu_M is done because \textbf{M} is standardized before determining \Sigma_M.
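For reference, the construction above can be sketched in numpy (the data here is synthetic and the names are illustrative; I assume \textbf{T} holds the leading eigenvectors of \Sigma_M):

```python
import numpy as np

rng = np.random.default_rng(0)

# M: each column is one realization of m (mMax x nSim), synthetic stand-in data
mMax, nSim, p = 8, 500, 7
M = rng.normal(size=(mMax, nSim)) * rng.uniform(1, 5, size=(mMax, 1))

mu_M = M.mean(axis=1)       # mean of each row of M
sigma_M = M.std(axis=1)     # standard deviation of each row of M

# Standardize each row before estimating the covariance
M_std = (M - mu_M[:, None]) / sigma_M[:, None]
Sigma_M = np.cov(M_std)     # mMax x mMax

# Eigen-decomposition; keep the p leading eigenvectors as the transfer matrix T
eigval, eigvec = np.linalg.eigh(Sigma_M)
order = np.argsort(eigval)[::-1]
T = eigvec[:, order[:p]]    # mMax x p

# Back-projection used in the model: m = (T q) .* sigma_M + mu_M
q = rng.normal(size=p)
m = (T @ q) * sigma_M + mu_M
```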

I found that this approach worked really badly, and realized that much of the information was "trapped" in \mu_M, which was kept constant.
I then tried giving \mu_M a distribution. This resulted in \textbf{m} having \hat{R} \approx 1, but not \mu_M and \textbf{q}.
It also required far too much computational time compared to method 2.

## Model

```stan
data {
  int<lower=0> L;          // Length of data array
  int<lower=0> n;          // SH-degree
  int<lower=0> mMax;       // Number of model parameters
  int<lower=0> p;          // Number of truncated model parameters
  vector[L] d;             // Data array
  matrix[L, mMax] G;       // Design matrix
  vector[mMax] igrfMean;
  vector[mMax] igrfStd;
  vector[mMax] orgStd;
  matrix[mMax, p] T;       // Transfer matrix
}

parameters {
  vector[p] q;             // Model in eigenspace
  vector[mMax] mu;         // Mean variable
}

transformed parameters {
  vector[mMax] m = (T * q) .* orgStd + (mu .* igrfStd + igrfMean);
}

model {
  vector[L] res;           // Residual array

  // Prior distribution in eigenspace
  q ~ normal(0, 1);

  // Prior distribution of the mean
  mu ~ normal(0, 1);

  // Likelihood
  res = d - G * m;
  res ~ normal(0, 1);
}
```


## Output

```
WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
WARNING:pystan:Rhat above 1.1 or below 0.9 indicates that the chains very likely have not mixed
WARNING:pystan:3859 of 4000 iterations saturated the maximum tree depth of 10 (96.5 %)
WARNING:pystan:Run again with max_treedepth larger than 10 to avoid saturation
WARNING:pystan:Chain 4: E-BFMI = 0.0599
WARNING:pystan:E-BFMI below 0.2 indicates you may need to reparameterize your model
Inference for Stan model: None_0a54820ed44399940f2d61834cc82db1.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
q[1]    1.71    0.04   0.06   1.67   1.67   1.68   1.75   1.82      2  69.36
q[2]    4.59    0.03   0.04   4.51   4.56   4.61   4.62   4.62      2  73.53
q[3]   -1.34    0.02   0.03  -1.39  -1.36  -1.33  -1.33  -1.33      2  73.05
q[4]   -0.26    0.03   0.04  -0.29  -0.29  -0.28  -0.24  -0.19      2  79.27
q[5]   -7.73    0.01   0.02  -7.74  -7.74  -7.74  -7.72   -7.7      2  68.54
q[6]    1.52    0.04   0.06   1.48   1.48   1.49   1.56   1.63      2   66.2
q[7]   -6.04    0.03   0.05  -6.07  -6.07  -6.06   -6.0  -5.95      2  73.13
mu[1]  -0.18     0.9   1.28   -1.8  -1.25  -0.34   0.92   1.75      2 1249.7
mu[2]    2.5    3.34   4.73  -0.68  -0.61   0.01    6.1  10.67      2 1245.3
mu[3]   4.35    4.37   6.19  -1.03    0.2   1.77   8.96   14.9      2 502.12
mu[4] 122.34  115.22  163.1   8.89  17.22  38.15 235.88 418.68      2  68.87
mu[5]   1.86    3.16   4.47  -1.57  -1.08  -0.28   5.22   9.55      2 2346.3
mu[6]   6.41    7.78  11.01  -1.07  -0.42   0.61  14.39  25.45      2 1439.7
mu[7]  -0.39     0.9   1.27  -1.63  -1.61  -0.63   0.88   1.34      2 1396.7
mu[8]  -2884   62.72  88.78  -2945  -2941  -2929  -2822  -2723      2  74.26
m[1]  -5.2e4  3.4e-5 2.1e-3 -5.2e4 -5.2e4 -5.2e4 -5.2e4 -5.2e4   3869    1.0
m[2]  3138.7  4.1e-5 2.6e-3 3138.7 3138.7 3138.7 3138.7 3138.7   3934    1.0
m[3]  680.29  4.1e-5 2.5e-3 680.29 680.29 680.29  680.3  680.3   3800    1.0
m[4]  -172.9  2.6e-5 1.6e-3 -172.9 -172.9 -172.9 -172.9 -172.9   3816    1.0
m[5]   -3074  3.1e-5 1.9e-3  -3074  -3074  -3074  -3074  -3074   3695    1.0
m[6]  5004.7  3.0e-5 1.9e-3 5004.7 5004.7 5004.7 5004.7 5004.7   3931    1.0
m[7]  -715.2  3.5e-5 2.2e-3 -715.2 -715.2 -715.2 -715.2 -715.2   4094    1.0
m[8]  1198.8  3.4e-5 2.2e-3 1198.8 1198.8 1198.8 1198.8 1198.8   4253    1.0
lp__  -5.2e4  201.52  285.2 -5.2e4 -5.2e4 -5.2e4 -5.1e4 -5.1e4      2  72.49
```