Brownian motion with measurement noise

Hello,

I am trying to create a Brownian motion model in the presence of measurement noise, but I am having trouble fitting it in situations where the particle is not moving.

The Brownian motion model is governed by the following equation:

x_i = x_{i-1}+\epsilon_i where \epsilon_i\sim \mathcal{N}(0,\sigma^2)

and we also include measurement noise at each time point, so that each measured value corresponds to the true position plus some normal noise:

y_i = x_i + \eta_i where \eta_i \sim \mathcal{N}(0,\sigma_c^2)

I implemented this with the following Stan program:

model_code='data {
  int<lower=1> N;               // size of dataset
  real<lower=0> measurementNoiseSigma;
  real x_measured[N];
}

parameters {
  real x_true[N];
  real<lower=0> sigma;
}

model {
  sigma ~ normal(50,50);
  x_true[2:N] ~ normal(x_true[1:N-1], sigma);
  x_measured ~ normal(x_true,measurementNoiseSigma);
}'

The data could be simulated as follows:

N <- 1000
sigma <- 30
measurementNoise <- 30
displacements <- rnorm(N,mean=0,sd=sigma)
trayectory <- cumsum(displacements)+rnorm(length(displacements),mean=0,sd=measurementNoise)

and the fitting works fine:

library(rstan)
stanFit <- rstan::stan(
    model_code = model_code,
    data = list(N=N,
                x_measured=trayectory,
                measurementNoiseSigma=measurementNoise),
    chains = 4,
    warmup = 2000,
    iter = 4000,
    cores = 4,
    refresh = 0,
    pars=c("sigma"))

However, as the signal (sigma) becomes smaller, convergence becomes a problem. For example:

N <- 1000
sigma <- 5
measurementNoise <- 30
displacements <- rnorm(N,mean=0,sd=sigma)
trayectory <- cumsum(displacements)+rnorm(length(displacements),mean=0,sd=measurementNoise)

stanFit <- rstan::stan(
    model_code = model_code,
    data = list(N=N,
                x_measured=trayectory,
                measurementNoiseSigma=measurementNoise),
    chains = 4,
    warmup = 2000,
    iter = 4000,
    cores = 4,
    refresh = 0,
    pars=c("sigma"))

The issue is that these small-sigma situations are of great scientific interest to us, as they reflect an immobile or very slowly moving particle, so it would be great if we could figure out a way to fit those cases as well as normally moving particles.

Any suggestions on how this could be addressed?

When sigma is small you need to non-center the latent states,

parameters {
  real x_true_tilde[N];
  ...
}

model {
  real x_true[N];
  x_true[1] = x_true_tilde[1];
  for (n in 2:N)
    x_true[n] = x_true[n - 1] + sigma * x_true_tilde[n];

  x_true_tilde ~ normal(0, 1);
}

In general you’ll always want a prior on x_true[0].
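
For concreteness, one way to make that initial-state prior explicit in the non-centered parameterization is sketched below. This is only a sketch: the data entries x0_prior_mean and x0_prior_sd are hypothetical placeholders for whatever prior location and scale make sense for the application.

model_code_x0 <- 'data {
  int<lower=1> N;               // size of dataset
  real<lower=0> measurementNoiseSigma;
  real x_measured[N];
  real x0_prior_mean;           // placeholder: prior location for the initial state
  real<lower=0> x0_prior_sd;    // placeholder: prior scale for the initial state
}

parameters {
  real<lower=0> sigma;
  real x_true_tilde[N];
}

transformed parameters {
  real x_true[N];
  // initial state, non-centered around its own prior
  x_true[1] = x0_prior_mean + x0_prior_sd * x_true_tilde[1];
  for (n in 2:N)
    x_true[n] = x_true[n - 1] + sigma * x_true_tilde[n];
}

model {
  sigma ~ normal(50, 50);
  // standard normals; this implies x_true[1] ~ normal(x0_prior_mean, x0_prior_sd)
  x_true_tilde ~ normal(0, 1);
  x_measured ~ normal(x_true, measurementNoiseSigma);
}'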


@betanalpha thank you very much! I will try this…

I implemented a new model following @betanalpha's suggestions and have the following:

model_code='data {
  int<lower=1> N;               // size of dataset
  real<lower=0> measurementNoiseSigma;
  real x_measured[N];
}

parameters {
  real<lower=0> sigma;
  real x_true_tilde[N];
}

transformed parameters {
  real x_true[N];
  x_true[1] = x_true_tilde[1];
  for (n in 2:N) {
    x_true[n] = x_true[n - 1] + sigma * x_true_tilde[n];
  }

}

model {
  sigma ~ normal(50,50);
  x_true_tilde ~ normal(0,1);
  x_measured ~ normal(x_true,measurementNoiseSigma);
}'

When I run it with a small sigma (e.g. 4) it works great! But when I run it with a larger sigma (e.g. 30) I get a lot of max-treedepth warnings, more than I have ever seen.

It seems the situation is now reversed: the non-centered parameterization does better with small sigma, whereas the centered parameterization does better with larger sigmas. Why is this? The inferences do seem to match; any suggestions?

@betanalpha talks about this phenomenon here:
https://youtu.be/uSjsJg8fcwY?t=4286

The link points to the start of the topic, but the “non-centered samples better with high noise, centered samples better with low noise” phenomenon is discussed around 1:19:00.


Thank you @mike-lawrence! I just saw the video and plan to review @betanalpha's talk. If I understand correctly, I have two situations: my particle sometimes moves (high sigma, a lot of information, centered parameterization is better) or is stagnant (low sigma, less information about the parameter, non-centered parameterization is better). Yet I am trying to create a sampler that can deal with both situations, as I don't know ahead of time which one it is going to be! What do people do in cases like these?

In the transformed parameters block, the line x_true[1] = x_true_tilde[1] may be replaced with x_true[1] = sigma * x_true_tilde[1], because this is the n = 1 case of the equation x_true[n] = x_true[n - 1] + sigma * x_true_tilde[n]; under the implicit assumption x_true[0] = 0.

I have run the model with this correction, but it had no effect on the max-treedepth problem.

model_code='data {
  int<lower=1> N;               // size of dataset
  real<lower=0> measurementNoiseSigma;
  real x_measured[N];
}

parameters {
  real<lower=0> sigma;
  real x_true_tilde[N];
}

transformed parameters {
  real x_true[N];
  x_true[1] =  sigma * x_true_tilde[1];
  for (n in 2:N) {
    x_true[n] = x_true[n - 1] + sigma * x_true_tilde[n];
  }

}

model {
  sigma ~ normal(50,150);
  x_true_tilde ~ normal(0,1);
  x_measured ~ normal(x_true,measurementNoiseSigma);
}'

N <- 1000

sigma <- 30
measurementNoise <- 30
displacements <- rnorm(N,mean=0,sd=sigma)
trayectory <- cumsum(displacements)+rnorm(length(displacements),mean=0,sd=measurementNoise)
plot(1:length(trayectory),trayectory, type="l")

stanFit <- rstan::stan(
  model_code = model_code,
  data = list(N=N,
              x_measured=trayectory,
              measurementNoiseSigma=measurementNoise),
  chains = 4,
  warmup = 2000,
  iter = 4000,
  cores = 4,
  refresh = 0,
  pars=c("sigma"))



check_hmc_diagnostics(stanFit)

x_true <- get_posterior_mean(stanFit, "x_true")
plot(1:length(trayectory), trayectory, type="l")
suppressWarnings(graphics::par(new = TRUE))
plot(1:length(x_true[,1]), x_true[,1], type="l")

N <- 1000

sigma <- 4
measurementNoise <- 30
displacements <- rnorm(N,mean=0,sd=sigma)
trayectory <- cumsum(displacements)+rnorm(length(displacements),mean=0,sd=measurementNoise)
plot(1:length(trayectory),trayectory, type="l")

stanFit2 <- rstan::stan(
  model_code = model_code,
  data = list(N=N,
              x_measured=trayectory,
              measurementNoiseSigma=measurementNoise),
  chains = 4,
  warmup = 2000,
  iter = 4000,
  cores = 4,
  refresh = 0,
  pars=c("sigma"))

check_hmc_diagnostics(stanFit2)

x_true <- get_posterior_mean(stanFit2, "x_true")
plot(1:length(trayectory), trayectory, type="l", ylim=c(min(trayectory), max(trayectory)))
suppressWarnings(graphics::par(new = TRUE))
plot(1:length(x_true[,1]), x_true[,1], type="l", ylim=c(min(trayectory), max(trayectory)), col = "red", cex=6)

Thank you @Jean_Billie for pointing out the n = 1 / x_true[0] case; @betanalpha had mentioned setting a prior for x_true[0], which I missed doing. I will keep working on this, and if you think of anything else please let me know.

Most samplers cannot deal with both situations simultaneously and instead require careful parameterizations. For much more see Section 3 of Hierarchical Modeling.


Thank you @betanalpha, I did go through Section 3. Thank you for this very valuable resource; I am sure I will re-read it many times (if I had to do it all over again, I would have done my PhD on these topics!). I am just wondering what people suggest doing in situations like this, i.e. where one of the things you don't know is whether the parameter is very small (high noise) or larger (high information), and thus whether the centered or the non-centered parameterization is more suitable. I am thinking of running both samplers on the data and taking the estimate from whichever one converges; it just feels a bit “ugly” to me. Any other ideas?


As discussed in the case study you likely won’t know which parameterization is appropriate ahead of time. All you can do is start with a reasonable default, for example non-centering all of the parameters or non-centering everything but the parameters informed by many observations, and then check that initial fit. Stan’s Hamiltonian Monte Carlo implementation provides sensitive diagnostic feedback that can be used to identify not only when the initial parameterization is suboptimal but also which parameters might need to be changed. Again, see the case study for multiple examples of this diagnostic workflow.
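
One rough way to wire up this default-then-check workflow in R is sketched below. It is only a sketch: model_code_noncentered and model_code_centered stand for the two model strings shown earlier in the thread, the fallback rule is just one possible choice, and only rstan's built-in diagnostic accessors are used.

library(rstan)

# TRUE when a fit shows no divergences and no saturated tree depths,
# using rstan's diagnostic accessors (same family as check_hmc_diagnostics).
# Other checks, e.g. Rhat, could be added in the same way.
fit_looks_ok <- function(fit) {
  get_num_divergent(fit) == 0 && get_num_max_treedepth(fit) == 0
}

stan_data <- list(N = N,
                  x_measured = trayectory,
                  measurementNoiseSigma = measurementNoise)

# Reasonable default when sigma may be small: the non-centered parameterization.
fit <- stan(model_code = model_code_noncentered, data = stan_data,
            chains = 4, warmup = 2000, iter = 4000, cores = 4,
            refresh = 0, pars = c("sigma"))

# Fall back to the centered parameterization if the diagnostics complain.
if (!fit_looks_ok(fit)) {
  fit <- stan(model_code = model_code_centered, data = stan_data,
              chains = 4, warmup = 2000, iter = 4000, cores = 4,
              refresh = 0, pars = c("sigma"))
}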

Thank you very much @betanalpha. I will re-read the case study for sure. Part of this is that I was hoping to use this in an automated way across many datasets, so I was looking for a workflow along those lines. In any case, I will study the document you wrote. Thank you again!
