Combined additive AND multiplicative error likelihood

Hi there,

I was wondering if anyone has a tip on how to implement a likelihood that expresses an error with both an additive and a multiplicative component.

If the error were additive, I'd use Yobs ~ normal(Ymod, sigma), while if it were multiplicative I'd use something like Yobs ~ lognormal(log(Ymod), sigma).

How would you go about implementing the likelihood for something like Yobs = Ymod + (a + b*Ymod)*epsi, where epsi is normally distributed?
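To make this concrete, here is a minimal simulation sketch of the error model I have in mind (a and b are made-up values):

```r
# Simulate the combined additive/multiplicative error model
# yobs = ymod + (a + b*ymod) * eps, with eps ~ Normal(0, 1).
set.seed(1)
ymod <- 1:100
a <- 2.5  # additive error component (assumed value)
b <- 0.3  # multiplicative error component (assumed value)
eps <- rnorm(length(ymod))
yobs <- ymod + (a + b * ymod) * eps

# Standardising the residuals recovers eps exactly:
resid <- (yobs - ymod) / (a + b * ymod)
all.equal(resid, eps)  # TRUE
```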

Hope the question is not too basic. Many thanks in advance! Cheers


Can’t you just turn your model into an additive model?

\log{\left(y_\text{obs} - y_\text{mod}\right)} = \log{\left(a + b y_\text{mod}\right)} + \log{\epsilon}

You could then fit a model to obtain the posterior density of \log{\left(y_\text{obs} - y_\text{mod}\right)}.

Definitely not basic, but even if it was that’s totally fine!

Curious what you think of the suggestion from @mevers. If that won't work for you, let us know and we'll try to help figure something else out.


Ok, I was thinking about something like that. At the moment it doesn't seem to work and I'm still investigating why…

However I am wondering, if you define

ydelta = yobs-ymod

you have log(ydelta) = log(a + b*ymod) + log(epsilon). For LE = log(epsilon) to follow a normal distribution, epsilon has to follow a lognormal one, not a normal one!

So I cannot simply write anymore in the model something like:

ydelta ~ lognormal(log(a + b * ymod), s)

where s would be a constant, something like 1.
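Numerically the problem shows up immediately. In this sketch (assumed values for a and b), roughly half the draws of a normal epsilon are negative, so ydelta = (a + b*ymod)*epsilon can be negative and a lognormal likelihood would assign it zero density:

```r
set.seed(1)
ymod <- 1:100
a <- 2.5  # assumed value
b <- 0.3  # assumed value
eps <- rnorm(length(ymod))     # normal, so negative draws occur
ydelta <- (a + b * ymod) * eps

# log(ydelta) is NaN wherever eps < 0, so ydelta cannot be lognormal:
any(is.nan(suppressWarnings(log(ydelta))))  # TRUE
```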

Am I missing something?

What about declaring a transformed variable yt in the model block, and then writing, in that same block, something like:

yt = (yobs-ymod)/(a+b*ymod)

with a, b, and ymod all strictly positive, and then

yt ~ normal(0, 1)

Would that correctly model the combined error?
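For what it's worth, the standardisation idea is algebraically sound: if yobs ~ Normal(ymod, (a + b*ymod)*s), then yt = (yobs - ymod)/(a + b*ymod) ~ Normal(0, s), and the two densities differ exactly by the change-of-variables factor 1/(a + b*ymod). A quick numerical check with made-up values:

```r
set.seed(1)
ymod <- 7; a <- 2.5; b <- 0.3; s <- 1  # made-up values
yobs <- 10
yt <- (yobs - ymod) / (a + b * ymod)

lhs <- dnorm(yobs, mean = ymod, sd = (a + b * ymod) * s)
rhs <- dnorm(yt, mean = 0, sd = s) / (a + b * ymod)
all.equal(lhs, rhs)  # TRUE
```

The catch in Stan is that a and b are parameters, so putting a sampling statement on yt alone would drop the -log(a + b*ymod) Jacobian term; stating the likelihood directly on yobs avoids having to add that adjustment by hand.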

I had some time to think about your model. I’m not quite sure I follow your thoughts in the follow-up posts, but your original model

y_\text{obs} = y_\text{mod} + (a + b y_\text{mod})\epsilon

is akin to a heteroskedastic regression model with

\begin{aligned} y_{\text{obs},i} &= y_{\text{mod},i} + \eta_i\,,\quad\text{with}\quad \eta_i = (a + b y_{\text{mod},i})\,\epsilon_i\,,\quad \epsilon_i \sim \text{Normal}(0, \sigma^2)\,,\\ \text{E}[\eta_i] &= 0\,,\\ \text{Var}[\eta_i] &= (a + b y_{\text{mod},i})^2\sigma^2\,. \end{aligned}
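A quick Monte Carlo check of those moments, for a single y_{\text{mod},i} and assumed values of a and b (with \sigma = 1):

```r
set.seed(42)
a <- 2.5; b <- 0.3; ymod_i <- 10      # assumed values
eta <- (a + b * ymod_i) * rnorm(1e6)  # eta_i = (a + b*ymod_i) * eps_i

mean(eta)  # close to 0
var(eta)   # close to (a + b*ymod_i)^2 = 30.25
```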

Before translating the above model to a Stan model, let's generate and plot some sample data first.

library(dplyr)
library(ggplot2)

set.seed(2020)
ymod <- 1:100
a <- 2.5
b <- 0.3
yobs <- ymod + (a + b * ymod) * rnorm(length(ymod))

data.frame(yobs = yobs, ymod = ymod) %>%
	ggplot(aes(ymod, yobs)) +
	geom_point() +
	geom_line()

The nice thing with Stan is that the model can be translated to Stan code pretty much as-is:

library(rstan)

model_code <- "
data {
	int<lower=0> N;
	vector[N] ymod;
	vector[N] yobs;
}

parameters {
	real<lower=0> a;
	real<lower=0> b;
	real<lower=0> sigma;
}

model {
	a ~ lognormal(0, 1);
	b ~ lognormal(0, 1);
	sigma ~ cauchy(0, 2.5);
	yobs ~ normal(ymod, (a + b * ymod) * sigma);
}
"

model <- stan_model(model_code = model_code)

I’ve chosen some weakly informative priors for all parameters; <lower=0> for a and b ensures that the scale parameter is \in \mathbb{R}^+.
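To get a feel for what these priors imply, one can draw from them directly in R (the half-Cauchy arises from the cauchy(0, 2.5) prior combined with <lower=0>):

```r
set.seed(3)
a_prior <- rlnorm(1e5, meanlog = 0, sdlog = 1)  # lognormal(0, 1), as for a and b
sigma_prior <- abs(rcauchy(1e5, 0, 2.5))        # half-Cauchy(0, 2.5)

quantile(a_prior, c(0.05, 0.5, 0.95))  # roughly 0.19, 1, 5.2
median(sigma_prior)                    # roughly 2.5
```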

Let’s fit the model:

fit <- sampling(
	model,
	data = list(N = length(ymod), ymod = ymod, yobs = yobs),
	seed = 2020)

We inspect the parameter estimates and compare them with the actual values:

library(broom.mixed)

fit %>% 
  tidy(conf.int = TRUE, rhat = TRUE, ess = TRUE) %>% 
  bind_cols(value = c(a, b, 1))
# A tibble: 3 x 8
#  term  estimate std.error conf.low conf.high  rhat   ess value
#  <chr>    <dbl>     <dbl>    <dbl>     <dbl> <dbl> <int> <dbl>
#1 a        2.31      2.47    0.631       9.99  1.01   578   2.5
#2 b        0.260     0.276   0.0754      1.07  1.01   372   0.3
#3 sigma    1.27      1.02    0.321       4.32  1.00   703   1

Not a bad fit, but note the wide (quantile) confidence/credible intervals.