Hey everyone new stan user here,
I am trying to implement a linear interpolation function into stan, but i am having some trouble finding an equivalent to the function “seachsorted” from the python package numpy (Link). Does anyone know about a similar implementation in stan? Or is there an other way to do linear interpolation specific to stan?
Here is my attempt at the interpolation function.
vector interpolate(vector x0,vector y0,vector x){
int nx = rows(x);
vector[nx] idx = numpy.searchsorted(x0, x);
vector[nx] dl = x - x0[idx - 1];
vector[nx] dr = x0[idx] - x;
vector[nx] d = dl + dr;
vector[nx] wl = dr ./ d;
vector[nx] interp = wl .* y0[idx - 1] + (1 - wl) * y0[idx];
return interp
}
Thank you for your time.
2 Likes
Hi, here’s a rough sketch I’ve been using for my own purposes. Comes with no warranty. @bbbales2 and @Bob_Carpenter can probably help shape this into usable code.
int which_min(real [] y ){
int ans = sort_indices_asc(y)[1];
return(ans);
}
real linear_interpolation(real x, real[] x_pred, real[] y_pred){
int K = size(x_pred);
real deltas [K];
real ans;
int i;
if(size(y_pred) != K) reject("x_pred and y_pred aren't of the same size");
for(k in 1:K) deltas[k] = fabs(x_pred[k] - x);
i = which_min(deltas);
if(i != 1){
real x1 = x_pred[i];
real x2 = x_pred[i + 1];
real y1 = y_pred[i];
real y2 = y_pred[i + 1];
ans = y1 + (y2-y1) * (x-x1)/(x2-x1);
}else{
ans = y_pred[i];
}
return(ans);
}
8 Likes
Hey @maxbiostat Thanks for the answer!
I have tried to modify the code to work with a vector input instead
int which_min(vector y ){
int ans = sort_indices_asc(y)[1];
return(ans);
}
vector interpolate(vector x_pred, vector y_pred,vector x){
int K = rows(x_pred);
vector[K] deltas;
int Kx = rows(x);
vector[Kx] ans;
int i;
if(rows(y_pred) != K) reject("x_pred and y_pred aren't of the same size");
for(k in 1:K){
deltas[k] = fabs(x_pred[k] - x[k]);
}
i = which_min(deltas);
if(i != 1){
real x1 = x_pred[i];
real x2 = x_pred[i + 1];
real y1 = y_pred[i];
real y2 = y_pred[i + 1];
ans = y1 + (y2-y1) * (x-x1)/(x2-x1);
}else{
ans[i] = y_pred[i];
}
return(ans);
}
it seems to run without an error, but i am not entirely sure if i formulated it correctly. Does it look like it makes sense?
EDIT: Just learned about to_array_1d() i think this does what i want.
2 Likes
Since this a statistical model, why not just define it as a spline or something?
@saudiwin i have been looking into different ways of doing 1-d interpolation, so i am open to using splines. I am using the interpolation to obtain values inbetween an evenly spaced array i define and do operations on and the interpolated results are then used as input for my likelihood(s).
Ok. Here’s slightly refactored vector-based function inspired by the PR#18 in stan-docs repo. It implements 3 different weight functions, as suggested by @nhuurre . This is linear interpolation, but I also have quadratic envelope method, which is even smoother, but it sacrifices the two outer bins.
functions{
real linear_interpolation_v(real x, vector x_pred, vector y_pred){
int K = rows(x_pred);
vector[K] deltas = x - x_pred;
real ans;
real t;
real w;
int i;
if(x<x_pred[1] || x>x_pred[K]) reject("x is outside of the x_pred grid!");
if(rows(y_pred) != K) reject("x_pred and y_pred aren't of the same size");
//this is which.min()
i = sort_indices_asc(fabs(deltas))[1];
if(deltas[i]<=0) i -= 1;
ans = y_pred[i];
real x1 = x_pred[i];
real x2 = x_pred[i + 1];
real y1 = y_pred[i];
real y2 = y_pred[i + 1];
ans = y1 + (y2-y1) * (x-x1)/(x2-x1);
t = (x-x1)/(x2-x1);
w = 1-t;
//w = 1/(1 + exp(1/(1-t) - 1/t));
//w = 1 - 3*pow(t,2) + 2*pow(t,3);
ans = w*y1 + (1-w)*y2;
return ans;
}
} // end functions block
Test it with the following R code
library(tidyverse)
set.seed(42) #set the correct seed
d <- tibble(
xs = 0:5,
ys=sample(10:15, 6, replace = TRUE),
type="known")
tibble(
xs=seq(0,5, by=0.001),
ys=sapply(xs, linear_interpolation_v, d$xs, d$ys),
type="predicted"
) %>%
bind_rows(d) %>%
ggplot()+
geom_point(aes(xs, ys, color=type))+
labs(title="w = ...")
3 Likes