Gaussian Process with measurement uncertainty and a large number of observations

Since you were able to make the plot 1 showing data and GP curve, you can compute residuals

Probably. You could add additional intercept terms for the units with common population prior with scale as a parameter (a usual hierarchical intercept model). This would then explain the variation between units