Say I have data generated from \vec{y} \sim \mathcal{N}(a(m\vec{x}+b), \sigma). As an example, say a=10, m=1.4, b=0.7, and \sigma=0.2. I want my Stan model to be unaware of the true value of a, so it has to infer a along with m and b. The code looks like this:
data {
  int<lower=0> N;   // number of observations
  vector[N] y;      // observed outcomes
  vector[N] x;      // predictor
}
parameters {
  real a;
  real m;
  real b;
  real<lower=0> sigma;
}
model {
  vector[N] y_true;
  for (i in 1:N) {
    y_true[i] = a * (m * x[i] + b);
  }
  y ~ normal(y_true, sigma);
}
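(As an aside, if I understand Stan's vectorized statements correctly, the loop could also be written without indexing; a minimal sketch of just the model block, which should be equivalent:)

model {
  // same likelihood as above, using vectorized arithmetic instead of a loop
  y ~ normal(a * (m * x + b), sigma);
}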
Question is: Is it bad practice to have two model parameters multiplying each other? Because effectively the multiplication can be regarded as just one model parameter. In my case, I’m referring to a*m and a*b.
Observation: When I run the above code, it finishes fine, but the estimates are all wrong. Of course, in this trivial example one would get rid of a altogether in the Stan model and let m and b absorb a. I am interested in the case when you cannot do this. This is only a toy example, but in my project I want to be able to infer a, because each data point is going to have its own multiplicative factor (imagine real a[N] now, but the same m and b; a sketch of that extended model is below).
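For concreteness, here is a rough sketch of what I have in mind for that extended model. The priors are placeholders I am putting in purely for illustration (some kind of informative prior seems necessary here, but these particular values are made up, not from my real project):

data {
  int<lower=0> N;
  vector[N] y;
  vector[N] x;
}
parameters {
  vector[N] a;          // one multiplicative factor per data point
  real m;
  real b;
  real<lower=0> sigma;
}
model {
  a ~ normal(10, 5);    // illustrative prior only
  m ~ normal(0, 5);     // illustrative prior only
  b ~ normal(0, 5);     // illustrative prior only
  y ~ normal(a .* (m * x + b), sigma);  // elementwise: a[i] * (m * x[i] + b)
}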
Update: I think I know why this is a bad idea, and why it doesn’t work: there are infinitely many solutions. In the example above, the data only pin down the product a*m = 14 (and a*b = 7), so a and m (or a and b) can be traded off against each other indefinitely.
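To spell the degeneracy out in the notation from above: for any c \neq 0,

a\,(m\vec{x} + b) \;=\; \frac{a}{c}\,\bigl((c\,m)\,\vec{x} + (c\,b)\bigr),

so (a, m, b) and (a/c, cm, cb) give exactly the same mean, and hence the same likelihood for any \sigma. The data can only constrain the products a*m and a*b, not the individual factors.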