# Combined additive AND multiplicative error likelihood

Hi there,

I was wondering if anyone has a tip on how to implement a likelihood that expresses an error with both an additive and a multiplicative component.

If the error were purely additive I’d do Yobs ~ normal(Ymod, sigma), while if it were purely multiplicative I’d use something like Yobs ~ lognormal(log(Ymod), sigma).

How would you go about implementing the likelihood for something like Yobs = Ymod + (a + b*Ymod)*epsi, where epsi is normally distributed?

Hope the question is not too basic. Many thanks in advance! Cheers

Gianni

One option: rearrange the model and take logs, which gives

\log{\left(y_\text{obs} - y_\text{mod}\right)} = \log{\left(a + b y_\text{mod}\right)} + \log{\epsilon}

You could then fit a model to obtain the posterior density of \log{\left(y_\text{obs} - y_\text{mod}\right)}.

Definitely not basic, but even if it was that’s totally fine!

Curious what you think of the suggestion from @mevers. If that won’t work for you, let us know and we’ll try to help figure something else out.


Ok, I was thinking about something like that. At the moment it doesn’t seem to work and I’m still investigating why…

However I am wondering, if you define

ydelta = yobs-ymod

you have log(ydelta) = log(a + b*ymod) + log(epsilon), where LE = log(epsilon) would have to follow a normal distribution, but then epsilon itself follows a lognormal one, not a normal one!

So I cannot simply write anymore in the model something like:

ydelta ~ lognormal(log(a + b * ymod), s)

where s would be something like a constant equal to 1

Am I missing something?
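A quick simulation makes the mismatch concrete (the values for a, b, and ymod here are arbitrary, just for illustration):

```r
# Sanity check (simulation, not part of any model): with normal epsilon,
# ydelta = (a + b * ymod) * epsilon is negative about half the time,
# so a lognormal likelihood, which requires ydelta > 0, cannot apply here.
set.seed(1)
ymod <- 10
a <- 2.5
b <- 0.3
ydelta <- (a + b * ymod) * rnorm(1e5)
mean(ydelta < 0)  # close to 0.5
```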

What about declaring a transformed yt in the model block, and then writing in that same block something like:

yt = (yobs-ymod)/(a+b*ymod)

with a,b, and ymod all strictly positive, and then

yt~std_normal()

Would that correctly model the combined error?
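A quick simulation check (with arbitrary assumed values for a and b) suggests the standardization itself is right: the standardized residuals recover a standard normal.

```r
# Simulation check: if yobs = ymod + (a + b*ymod)*epsilon with epsilon ~ N(0, 1),
# then yt = (yobs - ymod) / (a + b*ymod) should be standard normal.
set.seed(1)
ymod <- runif(1e5, 1, 100)  # arbitrary positive model values
a <- 2.5                    # assumed error parameters
b <- 0.3
yobs <- ymod + (a + b * ymod) * rnorm(length(ymod))
yt <- (yobs - ymod) / (a + b * ymod)
c(mean(yt), sd(yt))         # close to 0 and 1
```

One caveat for the Stan implementation: if a and b are parameters, yt ~ std_normal() alone is not equivalent to the intended likelihood, because the scale of the transform depends on parameters; you would also need the change-of-variables adjustment target += -log(a + b * ymod) (summed over observations). Writing the untransformed statement yobs ~ normal(ymod, a + b * ymod) sidesteps this.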

y_\text{obs} = y_\text{mod} + (a + b y_\text{mod})\epsilon

is akin to a heteroskedastic regression model with

\begin{aligned} y_{\text{obs},i} &= y_{\text{mod},i} + \eta_i\,,\quad\text{with}\\ \text{E}[\eta_i] &= 0\,,\\ \text{Var}[\eta_i] &= (a + b y_{\text{mod},i})^2\sigma^2\,, \end{aligned}

where \eta_i = (a + b y_{\text{mod},i})\,\epsilon_i and \sigma^2 = \text{Var}[\epsilon_i].
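Writing \sigma for the standard deviation of \epsilon, and noting that a shifted and scaled normal is still normal, this is the same as saying

y_{\text{obs},i} \sim \text{Normal}\!\left(y_{\text{mod},i},\, (a + b y_{\text{mod},i})\,\sigma\right)

which can be written down in Stan directly.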

Before translating the above model to Stan, let’s generate and plot some sample data.

ymod <- 1:100
a <- 2.5
b <- 0.3
set.seed(2020)
# epsilon is standard normal here, so the true sigma is 1
yobs <- ymod + (a + b * ymod) * rnorm(length(ymod))

library(tidyverse)
data.frame(yobs = yobs, ymod = ymod) %>%
  ggplot(aes(ymod, yobs)) +
  geom_point() +
  geom_line() +
  theme_minimal()


The nice thing with Stan is that the model can be translated to Stan code pretty much as-is:

library(rstan)
model_code <- "
data {
  int<lower=0> N;
  vector[N] ymod;
  vector[N] yobs;
}
parameters {
  real<lower=0> a;
  real<lower=0> b;
  real<lower=0> sigma;
}
model {
  a ~ lognormal(0, 1);
  b ~ lognormal(0, 1);
  sigma ~ cauchy(0, 2.5);
  yobs ~ normal(ymod, (a + b * ymod) * sigma);
}
"
model <- stan_model(model_code = model_code)


I’ve chosen some weakly informative priors for all parameters; <lower=0> for a and b ensures that the scale parameter is \in \mathbb{R}^+.

Let’s fit the model:

fit <- sampling(
  model,
  data = list(N = length(ymod), ymod = ymod, yobs = yobs),
  seed = 2020)


We inspect the parameter estimates and compare them with the true values:

library(broom.mixed)
fit %>%
  tidy(conf.int = TRUE, rhat = TRUE, ess = TRUE) %>%
  bind_cols(value = c(a, b, 1))
## A tibble: 3 x 8
#  term  estimate std.error conf.low conf.high  rhat   ess value
#  <chr>    <dbl>     <dbl>    <dbl>     <dbl> <dbl> <int> <dbl>
#1 a        2.31      2.47    0.631       9.99  1.01   578   2.5
#2 b        0.260     0.276   0.0754      1.07  1.01   372   0.3
#3 sigma    1.27      1.02    0.321       4.32  1.00   703   1


Not a bad fit, but note the wide (quantile) confidence/credible intervals. That is at least partly expected: the likelihood only constrains the products a * sigma and b * sigma, since multiplying a and b by a constant while dividing sigma by the same constant leaves (a + b * ymod) * sigma unchanged, so the individual parameters are only weakly identified through the priors.