Hi,

I’m new to Stan and Bayesian modeling in general, but I’m really liking Stan so far and learning a ton from the user’s guide and this community. I’ve been searching hard for some ideas about how to approach the problem outlined below, but haven’t found anything addressing it so far. I sincerely apologize if this has been covered elsewhere or if it’s a nonsensical question.

I am trying to construct a linear regression model to deconvolute processes in spectral data. I’m using simulated data (for which I know the noise level and the true coefficients) to test the model. I find that the model generally does a good job of estimating the true coefficient values for noisy datasets. However, as the noise level in the data approaches zero (which is common for this type of data), I start to get warnings about low E-BFMI, and the resulting posterior distribution of coefficients poorly approximates the true coefficients. My understanding of what’s happening is that as the noise level decreases, `sigma_tot`

(which aggregates several different noise contributions) all have high posterior density very close to zero, which prevents the sampler from efficiently sampling the posterior distribution of the coefficients, `beta_raw`

(similar to the Neal’s funnel example in the User’s guide). Instead, the sampler seems to get stuck in a local maximum. However, in this case, I can’t see a way to reparameterize the model to enable efficient sampling. Below is my model code:

```
data {
int<lower=0> N;
int<lower=0> K;
matrix[N, K] x;
vector[N] y;
vector[N/2] freq;
int<lower=0> N_tilde;
matrix[N_tilde,K] x_tilde;
vector[N_tilde/2] freq_tilde;
}
transformed data {
vector [N] hfr_vec = append_row(rep_vector(1,N/2), rep_vector(0,N/2));
vector [N] induc_vec = append_row(rep_vector(0,N/2), -2*pi()*freq);
vector [N_tilde] hfr_vec_tilde = append_row(rep_vector(1,N_tilde/2), rep_vector(0,N_tilde/2));
vector [N_tilde] induc_vec_tilde = append_row(rep_vector(0,N_tilde/2), -2*pi()*freq_tilde);
vector [N] y_mod;
y_mod = append_row(sqrt(square(y[1:N/2]) + square(y[N/2+1:])),sqrt(square(y[1:N/2]) + square(y[N/2+1:])));
}
parameters {
real<lower=0> hfr_raw;
real<lower=0> induc;
vector<lower=0>[K] beta_raw;
real<lower=0> sigma_res_raw;
real<lower=0> sigma_prop_raw;
real<lower=0> sigma_mod_raw;
vector<lower=0>[K] tau_raw;
}
transformed parameters {
real<lower=0> hfr = hfr_raw*100;
vector<lower=0>[K] tau = tau_raw*5;
vector<lower=0>[K] beta = tau .* beta_raw);
real<lower=0> sigma_res = sigma_res_raw*0.02;
real<lower=0> sigma_prop = sigma_prop_raw*0.02;
real<lower=0> sigma_mod = sigma_mod_raw*0.02;
vector[N] yhat = x*beta + hfr*hfr_vec + induc*induc_vec;
vector[N] sigma_tot = sigma_res + fabs(yhat)*sigma_prop + y_mod*sigma_mod;
}
model {
tau_raw ~ std_normal();
hfr_raw ~ std_normal();
induc ~ std_normal();
beta_raw ~ std_normal();
y ~ normal(yhat,sigma_tot);
sigma_res_raw ~ std_normal();
sigma_prop_raw ~ std_normal();
sigma_mod_raw ~ std_normal();
}
```

I’ve tried a few things to address this. I tried using heavy-tailed distributions (cauchy and student-t) for y (i.e. `y ~ cauchy(yhat,sigma_tot)`

), but when `sigma_tot`

is very small, the tails don’t help much. I’ve tried increasing `adapt_t0`

in the sampler control options, which helps marginally. I’ve also tried setting a minimum noise level by adding a data input, `sigma_min`

, that puts a floor on `sigma_tot`

to allow the sampler to escape local maxima even when the true noise level is low:

```
data {
...
real<lower=0> sigma_min;
}
...
transformed parameters {
...
vector[N] sigma_tot = sigma_min + sigma_res + fabs(yhat)*sigma_prop + y_mod*sigma_mod;
}
...
```

The `sigma_min`

approach does seem to help, but in many cases it results in suboptimal resolution of the coefficients, as the inflated `sigma_tot`

washes out the posterior density that would ideally push the estimated coefficients toward their true values.

Does anyone have ideas about how I could reparameterize the model to efficiently sample when the noise level is very low? It seems to me that this would be an issue for any regression model when noise in the outcome is low and the outcome is well-described by the predictors. Thanks for any advice or suggestions!