Thanks for uploading the data! :)

Well, let's stick to your first model for now. I'm actually not quite sure what you want to achieve with the other models, but let's start simple; maybe we can add more complexity later.

So, just to be explicit:

$$
\begin{aligned}
x_1^* &\sim \text{Normal}(x_1,\tau_1) \\
x_2^* &\sim \text{Normal}(x_2,\tau_2) \\
y &\sim \text{Normal}(\beta_0 + \beta_1 x_1^* + \beta_2 x_2^*,\tau_0),
\end{aligned}
$$

where x_1, x_2, y and \tau_0, \tau_1, \tau_2 are observed. The predictors x_1^*, x_2^* are inferred, i.e. they're assumed to be normally distributed around the observed x with a random error of known standard deviation \tau. The outcome is assumed to be normally distributed with *known* standard deviation \tau_0. The regression coefficients \beta_0, \beta_1, \beta_2 are *pooled* (i.e. *not* hierarchical).
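
In case it helps, here's the generative story of that model as a quick simulation sketch (Python/NumPy, with completely made-up sizes, taus, and betas, just to make the assumptions concrete):

```python
import numpy as np

rng = np.random.default_rng(1)
nobs = 200

# Observed quantities: noisy predictor measurements and their known SDs
# (all values here are made up for illustration).
x1 = rng.normal(0.0, 1.0, nobs)
x2 = rng.normal(0.0, 1.0, nobs)
tau1 = np.full(nobs, 0.3)
tau2 = np.full(nobs, 0.5)
tau0 = np.full(nobs, 1.0)

# Latent "true" predictors, normally distributed around the observed x.
x1_ast = rng.normal(x1, tau1)
x2_ast = rng.normal(x2, tau2)

# Outcome with known observation noise; the betas are shared (pooled)
# across all rows.
beta0, beta1, beta2 = -1.0, 2.0, 0.5
y = rng.normal(beta0 + beta1 * x1_ast + beta2 * x2_ast, tau0)

print(y.shape)  # (200,)
```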

That's basically the model that you posted in your initial post, right? I re-wrote the code slightly (mostly vectorization and a non-centered parameterization on x_1^*, x_2^*). I also dropped the `transformed data` block, since \tau > 0 for all observed \tau anyway (and even if not, this is checked by the `<lower=0>` restriction in the `data` block). I also changed the priors on the betas, because… well, they were not that informative anyway, but you can of course change them back/adjust.
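
A quick note on the non-centered parameterization, since it's the main change: sampling x_tilde from a standard normal and setting x^* = x + \tau \cdot x_tilde gives exactly the same distribution as drawing x^* \sim \text{Normal}(x, \tau) directly; it just lets the sampler work on a standardized scale. A tiny NumPy sanity check (with made-up x and \tau):

```python
import numpy as np

rng = np.random.default_rng(0)
x, tau, n = 2.0, 0.7, 200_000  # made-up values for illustration

# Centered: draw x* directly from Normal(x, tau).
centered = rng.normal(x, tau, n)

# Non-centered: draw a standard normal, then shift and scale it.
x_tilde = rng.normal(0.0, 1.0, n)
noncentered = x + tau * x_tilde

# Same distribution, just a different parameterization.
print(centered.mean(), noncentered.mean())  # both ~ 2.0
print(centered.std(), noncentered.std())    # both ~ 0.7
```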

This is the model:

```
data {
  int<lower=0> nobs;
  vector[nobs] y;
  vector<lower=0>[nobs] tau0;
  vector<lower=0>[nobs] tau1;
  vector<lower=0>[nobs] tau2;
  vector[nobs] x1;
  vector[nobs] x2;
}
parameters {
  vector[nobs] x1_tilde;  // standardized measurement errors
  vector[nobs] x2_tilde;
  real beta0;
  real beta1;
  real beta2;
}
transformed parameters {
  // non-centered parameterization: x* = x + tau * x_tilde
  vector[nobs] x1_ast = x1 + (tau1 .* x1_tilde);
  vector[nobs] x2_ast = x2 + (tau2 .* x2_tilde);
  vector[nobs] mu = beta0 + beta1 * x1_ast + beta2 * x2_ast;
}
model {
  x1_tilde ~ std_normal();
  x2_tilde ~ std_normal();
  y ~ normal(mu, tau0);
  beta0 ~ normal(0, 10);
  beta1 ~ normal(0, 10);
  beta2 ~ normal(0, 10);
}
```

This runs relatively smoothly.

```
> rstan::get_elapsed_time(posterior)
        warmup sample
chain:1  9.836 10.381
chain:2 10.786 10.072
chain:3  9.543 10.104
chain:4  9.773 10.116
```

Results (excluding `x1_tilde, x2_tilde, x1_ast, x2_ast, mu`) are:

```
> print(posterior, c("x1_tilde", "x2_tilde", "x1_ast", "x2_ast", "mu"), include = FALSE)
Inference for Stan model: stan_forum_model.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
          mean se_mean    sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
beta0    -4.62    0.04  0.58   -5.86   -4.97   -4.60   -4.23   -3.53   246 1.01
beta1    -3.26    0.16  2.17   -7.45   -4.73   -3.23   -1.83    0.97   189 1.02
beta2    -7.88    0.03  0.56   -9.07   -8.21   -7.85   -7.50   -6.90   327 1.01
lp__   -448.76    0.49 12.02 -473.13 -456.93 -448.51 -440.42 -426.26   592 1.00
```

(I have *no* clue if these values make sense. Sorry, I don't really understand the science here…)

Diagnostics look fine, but `n_eff` is fairly low for the betas. I'm guessing that making the model more complex will be difficult, so do it step by step and try to see where (if at all) things break.

That being said… I tried running the model with \tau_0 as a parameter, with prior \tau_0 \sim \text{Exponential}(1), and it ran much better… Here's the model code and results:

```
data {
  int<lower=0> nobs;
  vector[nobs] y;
  vector<lower=0>[nobs] tau_1;
  vector<lower=0>[nobs] tau_2;
  vector[nobs] x1;
  vector[nobs] x2;
}
parameters {
  vector[nobs] x1_tilde;
  vector[nobs] x2_tilde;
  real<lower=0> tau_0;  // now a parameter instead of data
  real beta0;
  real beta1;
  real beta2;
}
transformed parameters {
  vector[nobs] x1_ast = x1 + (tau_1 .* x1_tilde);
  vector[nobs] x2_ast = x2 + (tau_2 .* x2_tilde);
  vector[nobs] mu = beta0 + beta1 * x1_ast + beta2 * x2_ast;
}
model {
  x1_tilde ~ std_normal();
  x2_tilde ~ std_normal();
  y ~ normal(mu, tau_0);
  tau_0 ~ exponential(1);
  beta0 ~ normal(0, 10);
  beta1 ~ normal(0, 10);
  beta2 ~ normal(0, 10);
}
```

```
> get_elapsed_time(posterior_2)
        warmup sample
chain:1  0.876  0.553
chain:2  0.909  0.629
chain:3  0.976  0.420
chain:4  0.923  0.382
```

```
> print(posterior_2, c("x1_tilde", "x2_tilde", "x1_ast", "x2_ast", "mu"), include = FALSE)
Inference for Stan model: stan_forum_model-2.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
          mean se_mean    sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
tau_0     2.07    0.00  0.15    1.80    1.96    2.06    2.16    2.40  3811    1
beta0     0.11    0.01  0.57   -0.98   -0.27    0.11    0.50    1.24  1973    1
beta1    -7.99    0.05  2.53  -12.97   -9.70   -8.00   -6.27   -2.99  2453    1
beta2    -0.19    0.01  0.31   -0.82   -0.39   -0.21    0.01    0.40  1064    1
lp__   -224.71    0.27 10.21 -245.76 -231.34 -224.32 -217.46 -205.79  1445    1
```

The betas are all different now… Hmm… I don't know what's up here, but the estimated \tau_0 is much higher than the “observed” \tau_0 values used in the first model.

Anyway… I hope that gives you a starting point. Feel free to ask if anything in the code is unclear. :)

Cheers!