Hi,
I’m a newcomer to PyStan, and I am investigating using it to fit some XPS data. At the moment I am synthesising some data and making sure that my Stan model handles that well.
# Generate some synthetic data from the model.
N = 500
center=284
amp =1000
FWHM=2.
m=0.5
a=2.
b=1
f_true=0.02104 # a scale factor for adding the noise to the y data
x = np.sort(5*np.random.rand(N)+282)
yerr = 0.1+0.5*np.random.rand(N)
y = pV(m,x, center, FWHM,a,amp)
y += np.abs(f_true*y) *np.random.randn(N)
y += yerr * np.random.randn(N)
# convert the np arrays into list then to pandas dataframe
xpsdata = pd.DataFrame({'BindingEnergy': x, 'Counts': y})
model = """
functions {
real wx(real x, real w0,real a ){
// sigmoid function. x is E-E0, w0 is related to width of sigmoid (can be different to E0)
return ((2.*w0)/(1.+exp(-a*(x))));
}
real pV(real m, real E, real E0,real w0, real a,real amp){
// pseudo Voigt function
// m is mixing ratio of lorentzian to gaussian
// E0 is centre of peak
// E is energy of evaluation
// w0 is the centre of the sigmoid function
// a is asymmetry
real x=E-E0;
real w=wx(x,w0,a);
real w2=pow(w,2);
return (amp*((1.-m)*sqrt((4.*log(2))/(pi()*w2))*exp(x*x*(-4.*log(2))/w2) + (4.*m*w)/(2.*pi()*(w2 + x*x))));
}
}
data {
int<lower=1> N; // number of data points
vector[N] x; // x data
vector[N] y; // y data
}
parameters {
real<lower=280> center;
real<lower=100> amp;
// real<lower=0> sigma;
real<lower=0> sdev;
real<lower=0> a;
real<lower=0> w0;
real<lower=0> m;
}
transformed parameters {
vector[N] pVmodel;
for (i in 1:N){
pVmodel[i]= pV(m,x[i],center,w0,a,amp);
}
}
model {
y ~ normal(pVmodel , sdev);
amp ~ normal(1000, 100);
center ~ normal(284,2);
w0 ~ normal(2,.1);
sdev ~ normal(0,1000);
// note can put priors on the model parameters as above. The widths are important
a ~ normal(2,0.1);
m ~ normal(0.5,0.1);
}
"""
The model is called via:
fit = sm.sampling(data=data, iter=4000, chains=4, warmup=500, thin=1, seed=101)
When I run this on my MacBook Pro I get a decent fit to the data:
However, when I run the same code on a MacPro I get horrible fits:
(the solid line is the mean, the fainter lines are samples)
Both machines are running in new Conda environments with PyStan 2.18.0.0, and both have the latest version Mojave as the OS. The main difference is the hardware, the MacPro is rather old (mid 2010 model), the MacBook is 2017 vintage.
I’m sticking with the MacBook for now, but I’d like to understand why the same code gives such different results.
Thanks
Simon