I am sure this question has already been asked, but a search didn’t turn it up, so apologies. Anyway, my basic problem is that I want to fit a regression line (in this case, a series of connected straight-line segments) to a curve so that the ends match up with the observed data.
If the usual construction would be
Y ~ normal(X * beta, sigma)
then (at the moment) I have
Y ~ normal(X * beta, E * sigma)
where E is a vector along X that is 1 in the middle and goes to 0 at the ends.
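For concreteness, a minimal sketch of what that setup might look like as a full Stan program (the data names here are hypothetical); note that normal() rejects a scale of exactly zero, so E has to stay strictly positive:

data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N, K] X;
  vector[N] Y;
  vector<lower=0>[N] E;  // 1 in the middle, tapering towards 0 at the ends
}
parameters {
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  // the residual scale shrinks near the ends; normal() rejects a scale of
  // exactly 0, so E must remain strictly positive everywhere
  Y ~ normal(X * beta, E * sigma);
}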
This does work, but not quite perfectly (Stan starts to complain, and the closer the endpoints come to matching, the more it complains), so I was wondering if there is a better, canonical way to do this.
A possible trick might be to treat the unobserved latent states as parameters: declare the interior points as parameters, pass the boundary values as data, and use the transformed parameters block to append the boundary values to the interior ones.
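Something along these lines, perhaps (a rough sketch; the names and the placeholder priors are only illustrative):

data {
  int<lower=3> N;
  vector[N] y;        // observations, ordered along x
  real mu_left;       // known boundary value at the first point
  real mu_right;      // known boundary value at the last point
}
parameters {
  vector[N - 2] mu_interior;   // unobserved latent values at the interior points
  real<lower=0> sigma;
}
transformed parameters {
  // append the fixed boundary values around the free interior values
  vector[N] mu = append_row(mu_left, append_row(mu_interior, mu_right));
}
model {
  mu_interior ~ normal(0, 5);  // placeholder prior; replace with something problem-specific
  sigma ~ normal(0, 1);
  y ~ normal(mu, sigma);
}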
Though, philosophically, this raises the question: if theory strongly predicts that the curve must perfectly match some boundary values and the observed data does not agree, are hard constraints really necessary?
I’ve never seen anyone do this, so I doubt there’s a canonical way. There’s a simple exact answer here. If you have n data points and sort them by x, then you achieve your stated goal exactly by taking the regression coefficients to be a slope of
\beta = (y_n - y_1) / (x_n - x_1)
and an intercept of
\alpha = y_1 - \beta \cdot x_1.
This exactly fits the endpoints of the data, but it ignores all the points in between.
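In Stan terms (purely as illustration, assuming x is sorted in increasing order), that amounts to nothing more than:

data {
  int<lower=2> N;
  vector[N] x;   // assumed sorted in increasing order
  vector[N] y;
}
transformed data {
  real beta = (y[N] - y[1]) / (x[N] - x[1]);  // slope through the two endpoints
  real alpha = y[1] - beta * x[1];            // intercept through the first point
}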
May I ask why you want to specify an exact match to the observed data at the ends? For example, how is the data being generated? Is there some kind of measurement process that’s more accurate at the extremes?
I figured out a method; maybe not optimal, but it works.
But I owe you an explanation, I think. The problem I have is as follows: I am trying to fit a line that is almost, but not quite, straight, and it has to end at a given final point. Essentially I want to model the residuals from the first-order linear approximation. These residuals necessarily go to zero at the endpoints (which I know); I want (need) to model the detailed structure of what comes in between.
That explanation made it clear for me. In that case I would use a Gaussian process and, depending on the number of observations and the data model, either the exact GP or the Hilbert space approximation. I’m not going into details as you already solved it.
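For reference, a minimal exact-GP sketch in Stan might look something like the following (the hyperparameter priors and names are only illustrative, and it does not build in the go-to-zero-at-the-endpoints constraint):

data {
  int<lower=1> N;
  array[N] real x;
  vector[N] y;
}
parameters {
  real<lower=0> rho;      // length scale
  real<lower=0> alpha;    // marginal standard deviation
  real<lower=0> sigma;    // observation noise
}
model {
  // squared-exponential covariance plus observation noise on the diagonal
  matrix[N, N] K = gp_exp_quad_cov(x, alpha, rho)
                   + diag_matrix(rep_vector(square(sigma), N));
  matrix[N, N] L = cholesky_decompose(K);
  rho ~ inv_gamma(5, 5);
  alpha ~ std_normal();
  sigma ~ std_normal();
  y ~ multi_normal_cholesky(rep_vector(0, N), L);
}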