# Adding a vector of errors to an equation instead of using distribution functions: ~ f()

I usually think about error terms as part of an equation accounting for the total variance in an outcome. Accordingly, I find it more natural to use `=` rather than `~` (I think this is the sampling operator?).

Assuming I want to model the mean of some vector `y` with the given data block:

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
```


Is using `=` to write down an equation like below (`b` is a vector):

```stan
parameters {
  real a;
  vector<lower=0>[N] b;
}

model {
  y = a + b;

  // priors
  a ~ normal(5, 1);
  b ~ normal(0, 1);
}
```


Any different from using `~` like here (where `b` is a scalar)?

```stan
parameters {
  real a;
  real<lower=0> b;
}

model {
  y ~ normal(a, b);

  // priors
  a ~ normal(5, 1);
  b ~ normal(0, 1);
}
```


`y ~ normal(a, b)` is just shorthand for `target += normal_lupdf(y | a, b)`. (It may also be more optimized?)
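For reference, the sampling statement and its expanded forms can be written side by side. A minimal sketch of a model block, assuming `y` is declared as `vector[N] y` in `data` and `a`, `b` are parameters; only one of these statements should be used at a time:

```stan
model {
  // sampling-statement form: drops constant terms from the log density
  y ~ normal(a, b);

  // equivalent explicit form: _lupdf also drops normalizing constants
  // target += normal_lupdf(y | a, b);

  // fully normalized form: _lpdf keeps all constant terms
  // target += normal_lpdf(y | a, b);
}
```

The dropped constants do not affect MCMC sampling, which only needs the target up to an additive constant, but they matter if you compare normalized log densities across models.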

So it's a way of adding the log-likelihood (up to constants) of the data given the parameters to `target`, which accumulates the log of the joint density: the likelihood of your data given the parameter values, plus the log prior density of the parameters themselves. Adding terms to `target` is how you connect your modeled likelihood to the Monte Carlo sampler.

`y = a + b` is an assignment, and I'm not sure you can assign to data at all. Even if you could, this statement would not add anything to `target`, and thus your log-likelihood would not involve `y` at all. Though perhaps Stan does something special in this case (assigning to data) that I don't know about?
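As an aside, current versions of the Stan compiler do reject assignments to data-block variables, so `y = a + b;` would not compile. The idiomatic pattern is to use `=` only to define derived quantities, and then tie the data to them with `~` (or `target +=`). A sketch, with `mu` as an illustrative name and `y` assumed to be declared in `data`:

```stan
parameters {
  real a;
  real<lower=0> b;
}
transformed parameters {
  real mu = a;  // '=' defines a derived quantity
}
model {
  y ~ normal(mu, b);  // '~' adds the log density of y to target
  a ~ normal(5, 1);
  b ~ normal(0, 1);
}
```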


Just to expand on @adamConnerSax's answer: it's really important to keep in mind that a Stan program doesn't define some simulation but rather a joint density function over the parameters (all of the variables declared in the parameters block) and the data (all of the variables declared in the data and transformed data blocks). Consequently, instead of thinking about "error terms" or "adding noise", it's more useful to think about which quantities are probabilistic/uncertain/vary and which probability density function quantifies that behavior.

For example, one can interpret "outcome is given by corrupting an input with error" as a density function centered on that input, for example

\begin{align*} \pi(y, a, b) &= \pi(y \mid a, b) \, \pi(a, b) \\ &= \text{normal}(y \mid a, b) \, \text{normal}(a \mid 5, 1) \, \text{normal}(b \mid 0, 1). \end{align*}

Alternatively one can think about quantifying the variation in the difference between the output and the input,

\begin{align*} \pi(y, a, b) &= \pi(y - a, a, b) \\ &= \pi(y - a \mid b) \, \pi(a, b) \\ &= \text{normal}(y - a \mid 0, b) \, \text{normal}(a \mid 5, 1) \, \text{normal}(b \mid 0, 1). \end{align*}

For normal density functions these two programs are equivalent, but for non-normal variation they need not be the same, and care is needed to determine which is most appropriate.
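In Stan code, the two factorizations above can be sketched as the following pair of model statements (again assuming `y` in `data` and parameters `a`, `b`; use one statement or the other, not both):

```stan
model {
  // centered form: pi(y | a, b) = normal(y | a, b)
  y ~ normal(a, b);

  // residual form: pi(y - a | b) = normal(y - a | 0, b)
  // the shift y -> y - a is linear, so no Jacobian adjustment is needed
  // y - a ~ normal(0, b);

  a ~ normal(5, 1);
  b ~ normal(0, 1);
}
```

For the normal family these yield the same posterior; for asymmetric or bounded families, centering the density on the input and modeling the residual are genuinely different models.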
