Multiplicative and additive noise: poor sampling

I’d like to model some data as having both multiplicative and additive noise. For example, say I want to
model observed x,y as y = (\mu + \sigma \epsilon)x + \sigma_0\nu where \epsilon,\nu are independent standard normals. In R we can create this kind of data as follows:

library(tidyverse)
library(rstan) # rstan_2.21.2
rstan_options("javascript" = FALSE) # deal with proxy
mu.true <- 2
sigma.true <- 1
sigma0.true <- 2

set.seed(1234)
tibble(x = runif(50, max = 10)) %>% 
  mutate(y = x*rnorm(n(), mean = mu.true, sd = sigma.true) + rnorm(n(), sd = sigma0.true)) ->
  Data

I have modelled this in Stan as follows:

data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
  vector[N] y1;
  real<lower=0> sigma0;
}
model {
  y1 ~ normal(0, 1);
  y ~ normal(x .* (mu + sigma*y1), sigma0);
  mu ~ normal(0,10);
  sigma ~ cauchy(0,5);
  sigma0 ~ cauchy(0,5);
}

Which produces the following rstan output after calling print(stan_fit, pars = c('mu', 'sigma', 'sigma0')):

4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

       mean se_mean   sd 2.5%  25%  50%  75% 97.5% n_eff Rhat
mu     1.84    0.01 0.19 1.45 1.71 1.84 1.96  2.21   481 1.00
sigma  1.19    0.01 0.16 0.91 1.08 1.18 1.28  1.52   267 1.01
sigma0 1.53    0.07 0.62 0.58 1.09 1.42 1.86  3.01    78 1.03

Samples were drawn using NUTS(diag_e) at Thu Oct 01 14:22:13 2020.

I get low n_eff, especially for sigma0. Can my Stan model be improved? (Or perhaps do I have deeper problems, such as with identifiability?)

I think since you have a normal likelihood and a normal prior on y1, you can integrate it out there and get:

y ~ normal(x .* mu, sqrt(sigma0^2 + x^2 * sigma^2));

Double check me on that, but that’ll avoid sampling the y1 parameters and probably work better.

Thanks — marginalizing works perfectly! A nice feature of coding it like this is that it clearly shows the effective noise size as a function of x.

The Stan code I actually used, incidentally, was y ~ normal(x * mu, sqrt(sigma0^2 + x .* x * sigma^2)) since there doesn’t seem to be a suitable pow(...) or .^.

1 Like

there doesn’t seem to be a suitable pow(...) or .^

This was actually added in the latest Stan release (2.24), so it’s available through the cmdstanR interface

1 Like