I am completely new to stan. I simply wanted to fit a data which has uncertainty in measurements, but I could not include the uncertainty in the fitting. For example, I have x[N], y[N] and yerror[N] arrays with dimensions N. Suppose the data is 2nd order polynomial: y=a0+a1x+a2x*x, and I have the array yerror[N]. Now my code in pystan is given below:
import pystan
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
A0=0.5; A1=1.5; A2=-0.2; A3=-0.008; A4=0.00025
def func(xx):
return A0+A1*xx+A2*xx*xx#+A3*(x**3)+A4*(x**4)
x=10*np.random.rand(100); x=np.sort(x)
fx=func(x);
yerror=np.random.rand(len(x))
y=np.random.normal(fx,scale=yerror)
np.random.seed(101)
model = """
data {
int<lower=0> N;
vector[N] x;
vector[N] y;
}
parameters {
real a0;
real a1;
real a2;
real<lower=0> sigma;
}
model {
vector[N] x2;
for(i in 1:N){x2[i]=x[i]*x[i];}
y ~ normal(a0 + a1 * x + a2 * x2, sigma);
}
"""
# Put our data in a dictionary
data = {'N': len(x), 'x': x, 'y': y}
# Compile the model
sm = pystan.StanModel(model_code=model)
# Train the model and generate samples
fit = sm.sampling(data=data, iter=2000, chains=4, warmup=400, thin=3, seed=101)
The code does not use the measurement uncertainty, yerror[N]. How to do that?
Sorry if I am asking something silly/answered already. Thanks in advance.
Sorry, I found it very hard to understand. In reality I would just have data (y[N]), yerror[N], x[N] and I have to choose a model. How can I take the mean?
For example, in emcee (or most mcmc codes) we could simply include yerror[N] in the residual as: residual= (data-model)^2/yerror^2. I thought there would be something similar in stan.
Here the error in x is discussed. But what if I have error in y (that is the data wich we are trying to model). As I gave it a thought it should not be so complicated, all we have to do is to modify the chi^2/log_probability. Actually we have to divide the likelihood by yerror^2, that is it. Is there no such option in stan?