# Adding a vector of errors to an equation instead of using distribution functions: ~ f()

I usually think about error terms as part of an equation accounting for the total variance in an outcome. Accordingly, I find it more natural to use `=` rather than `~` (I think this is the sampling operator?).

Assuming I want to model the mean of some vector `y` with the given data block:

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
```


Is using `=` to write down an equation like below (`b` is a vector):

```stan
parameters {
  real a;
  vector<lower=0>[N] b;
}

model {
  y = a + b;

  // priors
  a ~ normal(5, 1);
  b ~ normal(0, 1);
}
```


Any different from using `~` like here (where `b` is a scalar)?

```stan
parameters {
  real a;
  real<lower=0> b;
}

model {
  y ~ normal(a, b);

  // priors
  a ~ normal(5, 1);
  b ~ normal(0, 1);
}
```


`y ~ normal(a, b)` is just shorthand for `target += normal_lupdf(y | a, b)`. (It may also be more optimized?)
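For reference, the sampling statement and its expanded forms can be written side by side. A minimal sketch of a model block, assuming `y` is declared as `vector[N] y` in `data` and `a`, `b` are parameters; only one of these statements should be used at a time:

```stan
model {
  // sampling-statement form: drops constant terms from the log density
  y ~ normal(a, b);

  // equivalent explicit form: _lupdf also drops normalizing constants
  // target += normal_lupdf(y | a, b);

  // fully normalized form: _lpdf keeps all constant terms
  // target += normal_lpdf(y | a, b);
}
```

The dropped constants do not affect MCMC sampling, which only needs the target up to an additive constant, but they matter if you compare normalized log densities across models.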

So it's a way of adding the log-likelihood (up to constants) of the data given the parameters to `target`, which accumulates the log of the joint density: the likelihood of your data given the parameter values, plus the log prior density of the parameters themselves. Adding terms to `target` is how you connect your modeled likelihood to the Monte Carlo sampler.

`y = a + b` is an assignment, and I'm not sure you can assign to data at all. Even if you could, this statement would not add anything to `target`, and thus your log-likelihood would not involve `y` at all. Though perhaps Stan does something special in this case (assigning to data) that I don't know about?
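As an aside, current versions of the Stan compiler do reject assignments to data-block variables, so `y = a + b;` would not compile. The idiomatic pattern is to use `=` only to define derived quantities, and then tie the data to them with `~` (or `target +=`). A sketch, with `mu` as an illustrative name and `y` assumed to be declared in `data`:

```stan
parameters {
  real a;
  real<lower=0> b;
}
transformed parameters {
  real mu = a;  // '=' defines a derived quantity
}
model {
  y ~ normal(mu, b);  // '~' adds the log density of y to target
  a ~ normal(5, 1);
  b ~ normal(0, 1);
}
```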


Just to expand on @adamConnerSax's answer: it's really important to keep in mind that a Stan program doesn't define some simulation but rather a joint density function over the parameters (all of the variables declared in the parameters block) and the data (all of the variables declared in the data and transformed data blocks). Consequently, instead of thinking about "error terms" or "adding noise", it's more useful to think about which quantities are probabilistic/uncertain/vary and which probability density function quantifies that behavior.

For example, one can interpret "outcome is given by corrupting an input with error" as a density function centered on that input, for example

\begin{align*} \pi(y, a, b) &= \pi(y \mid a, b) \, \pi(a, b) \\ &= \text{normal}(y \mid a, b) \, \text{normal}(a \mid 5, 1) \, \text{normal}(b \mid 0, 1). \end{align*}

Alternatively one can think about quantifying the variation in the difference between the output and the input,

\begin{align*} \pi(y, a, b) &= \pi(y - a, a, b) \\ &= \pi(y - a \mid b) \, \pi(a, b) \\ &= \text{normal}(y - a \mid 0, b) \, \text{normal}(a \mid 5, 1) \, \text{normal}(b \mid 0, 1). \end{align*}

For normal density functions these two programs are equivalent, but for non-normal variation they need not be the same, and care is needed to determine which is most appropriate.
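In Stan code, the two factorizations above can be sketched as the following pair of model statements (again assuming `y` in `data` and parameters `a`, `b`; use one statement or the other, not both):

```stan
model {
  // centered form: pi(y | a, b) = normal(y | a, b)
  y ~ normal(a, b);

  // residual form: pi(y - a | b) = normal(y - a | 0, b)
  // the shift y -> y - a is linear, so no Jacobian adjustment is needed
  // y - a ~ normal(0, b);

  a ~ normal(5, 1);
  b ~ normal(0, 1);
}
```

For the normal family these yield the same posterior; for asymmetric or bounded families, centering the density on the input and modeling the residual are genuinely different models.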
