Ordinary least squares vs Stan: different results?

I have a least squares inference problem, and wanted to replace my Stan program with WLS. However, for some data, I get very different results, which I find very puzzling (see also here or here).

I am providing a specific example, with 29 data points, and the results of least squares and Stan. I simplified the problem to homoscadastic errors (yerr=1, i.e., OLS). In blue is a plot of the data:

I fit with three components: 1) a constant, 2) a linear component, 3) a fixed shape (red curve). The normalisations are free parameters called y0, k, norm. They receive flat, unconstrained priors.

Below is my code, data and analysis:

``````import numpy as np
import cmdstanpy
import matplotlib.pyplot as plt
import corner
from statsmodels.regression.linear_model import WLS

# data:

data = {'N': 29, 'x0': 5895.6,
'weight': np.array([7.8281132e-04, 2.0987201e-03, 5.2415314e-03, 1.2138436e-02,
2.6183469e-02, 5.2412681e-02, 9.7503543e-02, 1.6878714e-01,
2.7118218e-01, 4.0531352e-01, 5.6230277e-01, 7.2535461e-01,
8.6879617e-01, 9.6697074e-01, 9.9991941e-01, 9.6047860e-01,
8.5685223e-01, 7.1017146e-01, 5.4648513e-01, 3.9075232e-01,
2.5953308e-01, 1.5991287e-01, 9.1608725e-02, 4.8663702e-02,
2.4034550e-02, 1.1002056e-02, 4.6862061e-03, 1.8525344e-03,
6.7856628e-04]),
'x': np.array([5876.689 , 5878.041 , 5879.3965, 5880.7485, 5882.1045, 5883.458 ,
5884.8115, 5886.1685, 5887.5225, 5888.8804, 5890.235 , 5891.5933,
5892.948 , 5894.304 , 5895.6636, 5897.02  , 5898.3794, 5899.737 ,
5901.0967, 5902.4546, 5903.8125, 5905.174 , 5906.532 , 5907.894 ,
5909.2534, 5910.616 , 5911.9756, 5913.336 , 5914.699 ]),
'y': np.array([1.6952891 , 1.2575909 , 1.1748848 , 1.488376  , 1.8385803 ,
1.528486  , 1.1511977 , 0.9915795 , 1.1635164 , 1.4170816 ,
1.5301417 , 1.8289143 , 0.8856548 , 1.3066483 , 1.7767009 ,
1.2338594 , 0.93022835, 1.625354  , 1.4074879 , 1.1009187 ,
1.3825897 , 1.546863  , 0.99267375, 1.6045758 , 1.4572536 ,
0.8702313 , 0.68421644, 0.911552  , 1.1132364 ]),
'yerr': np.array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])}
data['dx'] = data['x'] - data['x0']

# plot the data
plt.plot(data['dx'], data['y'])
plt.plot(data['dx'], data['weight'])
plt.savefig('wls-data.pdf')
plt.close()

# run WLS
# prepare WLS matrix
A = np.column_stack((np.ones(data['N']), data['dx'], data['weight']))
wls_results = WLS(data['y'], A, data['yerr']**-2).fit()
# wls_results = OLS(data['y'], A).fit()   # this gives the same results
params = wls_results.params
cov_matrix = wls_results.cov_params()
y0_fit, k_fit, norm_fit = params
y0_err, k_err, norm_err = np.sqrt(np.diag(cov_matrix))

# run Stan
# write the Stan code to a file
stan_code = """
data {
int N;            // number of data points
vector[N] dx;     // x data points
vector[N] y;      // y data points
vector[N] yerr;   // y uncertainty
vector[N] weight; // weight describing feature shape
}
parameters {
real y0;     // continuum normalisation
real k;      // continuum slope
real norm;   // feature strength
}
model {
// assume Gaussian measurement errors
y ~ normal(y0 + k * dx + norm * weight, yerr);
}
"""
model_name = 'wls'
filename = 'wls.stan'
with open(filename, 'w') as fout:
fout.write(stan_code)

sm = cmdstanpy.CmdStanModel(stan_file=filename, model_name=model_name)
fit = sm.sample(data=data, iter_sampling=100000, iter_warmup=4000, chains=4, show_console=False, show_progress=True)
print(fit.summary())
print(fit.diagnose())
norm = fit.stan_variable('norm')
y0 = fit.stan_variable('y0')

# plot results:
corner.corner(
np.transpose([y0, fit.stan_variable('k'), norm]), labels=['y0', 'k', 'norm'],
plot_datapoints=False, plot_density=False, levels=np.exp(-np.arange(5)))
corner.corner(
np.random.multivariate_normal(params, cov_matrix, size=len(norm)), labels=['y0', 'k', 'norm'],
fig=plt.gcf(), color='b', plot_datapoints=False, plot_density=False, levels=np.exp(-np.arange(5)))
plt.savefig('wls_corner.pdf')

# compare means and errors of the two techniques:
print(' same norm?', norm_fit, norm.mean(), 'err:', norm_err, norm.std())
print(' same y0?', y0_fit, y0.mean(), 'err:', y0_err, y0.std())
``````

The Stan diagnostics are all ok, and Rhat is 1.

I get the same means (first two numbers), but much larger uncertainties with Stan:

`````` same norm? 0.11509463050765173 0.11374671140722921 err: 0.15857910773382997 0.5286261376859842
same y0? 1.2708374766432424 1.2714545692049999 err: 0.07524091396160146 0.25075389547048704
``````

Here is a pairs/corner plot comparing the results from Stan (black) to the least squares covariance (blue):

Why do the results not coincide in this setting?

Nevermind, I found the bug. I was using statsmodels incorrectly.

The covariance needs to be scaled by `wls_results.scale`.

Changing the line
`cov_matrix = wls_results.cov_params()`
to
`cov_matrix = wls_results.cov_params() / wls_results.scale`
gives consistent results between the Stan code and least squares.

Why is the model given by

y ~ normal(y0 + k * dx + norm * weight, yerr);

In particular, why are you multiplying `norm*weight` in the stan version? Usually, weighted least squares is equivalent to dividing the errors by the weight (in the WLS docs, this is described in the inverse way, “If you supply 1/W then the variables are pre- multiplied by 1/sqrt(W).” (I’m also unclear on your `weight` vs `sqrt(weight)`.)

Sorry, this is a poorly chosen variable name. I did not use `weight` in the Stan code for the errors. The errors are stored in `yerr`, which is 1 in this example. `weight` is the shape of one component (red curve in the first plot).