For a current research project, I am trying to approximate a model with the Ornstein-Uhlenbeck process.
To do this I use Bayesian inference with the help of Stan.
However, I am relatively new to Stan and Bayesian inference, and I am currently struggling to understand the behavior of a prior.
I won't include the full Stan model for the Ornstein-Uhlenbeck process and will instead concentrate on a minimal model that still shows the behavior that puzzles me.
I have data for x, which describes the movement of a sample. One characteristic parameter of this process is the characteristic time \tau, which enters the log-likelihood for x through the mean of the likelihood function.
However, the question I now have is more fundamental and I wonder what the following Stan code actually does.
I am feeding it a simplified dataset x, which is just a list with the values 1 to 100:
data {
  int<lower=0> t;
  vector[t] x;
}
parameters {
  real<lower=0> tau; // dummy parameter
}
model {
  print("tau = ", tau);
  tau ~ normal(100, 1);
  x ~ normal(6, 3);
}
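For completeness, this is roughly how I build the data I pass to the model (a minimal Python sketch; the file name and the JSON data format are just the convention used by CmdStan-style interfaces, and the actual fitting call is omitted):

```python
import json

# The toy dataset: x is simply the integers 1..100.
x = list(range(1, 101))
data = {"t": len(x), "x": x}

# Write it in the JSON input format accepted by CmdStan-style interfaces.
with open("toy_data.json", "w") as f:
    json.dump(data, f)
```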
In this example, I have the parameter \tau which is completely independent of the data x.
Now I feel like I am misunderstanding the line tau ~ normal(100, 1).
How I understand it is that after sampling, the draws of \tau should be normally distributed with a mean of 100 and a sigma of 1.
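As a sanity check on that expectation, drawing directly from normal(100, 1) outside of Stan (a NumPy sketch, unrelated to my actual Stan run) gives a sample mean near 100 and a sample sigma near 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draws from the distribution I expect the sampler to recover for tau:
# mean 100, standard deviation 1.
draws = rng.normal(loc=100.0, scale=1.0, size=10_000)

print(draws.mean())  # close to 100
print(draws.std())   # close to 1
```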
What I get instead is that \tau has a mean of 55.69 and a sigma of 29.09.
Also, when running 50 iterations with the data x, the print statement in the Stan model gives me values like
Chain 1: tau = 4.11202
Chain 1: tau = 4.16449
Chain 1: tau = 4.16449
Chain 1: tau = 4.25171
Chain 1: tau = 4.25171
Chain 1: tau = 4.38693
I spent the day trying to understand how those mean and sigma values arise and what the print statement actually reports, but after digging through the documentation I still seem to lack the knowledge of what I have to look for.
I am thankful for any help.