My dataset looks like a “hockey stick”: there is a noise floor, and above some stimulus level, called the threshold, the data rises out of the noise floor and increases linearly with stimulus level. The basic model is:
if x > threshold:
    y = x * slope + intercept
else:
    y = noise
This seems to work fine; however, it’s very slow with large datasets. I suspect that’s because I haven’t been able to vectorize the approach: the mean is built in an element-wise loop. Any suggestions? The current model is below, followed by the vectorized form I was hoping for.
data {
  int<lower=1> n;
  vector[n] x;
  vector[n] y;
}
parameters {
  real<lower=0> slope;
  real<lower=-1, upper=1> noise;
  real intercept;
  real<lower=0> sigma;  // scale of the student_t must be positive
  real<lower=0> nu;     // degrees of freedom must be positive
}
transformed parameters {
  // The threshold is where the line crosses the noise floor,
  // i.e. where noise = threshold * slope + intercept.
  real threshold;
  threshold = (noise - intercept) / slope;
}
model {
  vector[n] mu;
  noise ~ normal(0, 0.001);
  slope ~ normal(0.0002, 0.001);
  intercept ~ normal(-0.01, 0.01);
  sigma ~ normal(0, 0.001);
  nu ~ gamma(2, 0.1);
  // Element-wise loop: this is the part I have not managed to vectorize.
  for (i in 1:n) {
    mu[i] = fmax(noise, x[i] * slope + intercept);
  }
  y ~ student_t(nu, mu, sigma);
}
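For reference, this is the vectorized model block I was hoping for (data, parameters, and transformed parameters unchanged). My understanding is that recent Stan releases (2.26 or later) vectorize fmax over mixed scalar/vector arguments, so the scalar noise would broadcast across the vector x * slope + intercept, but I haven’t confirmed that this compiles on my version, or that the loop is actually the bottleneck rather than the autodiff itself:
model {
  // fmax(real, vector) should broadcast the scalar noise floor
  // across the linear predictor, replacing the loop (Stan 2.26+).
  vector[n] mu = fmax(noise, x * slope + intercept);
  noise ~ normal(0, 0.001);
  slope ~ normal(0.0002, 0.001);
  intercept ~ normal(-0.01, 0.01);
  sigma ~ normal(0, 0.001);
  nu ~ gamma(2, 0.1);
  y ~ student_t(nu, mu, sigma);
}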