Just glancing at the paper, I think you’re right on the scaling. I just stole this from @flaxter’s repo linked at the top of the file, so I think we probably both gotta patch things. The parameters would probably just absorb this scaling, so it wouldn’t matter too much, but I think you’re right.
For the fhat, the things to stare at are Theorem 1 in the paper + Algorithm 1 + eq. https://wikimedia.org/api/rest_v1/media/math/render/svg/2ab83236f578bfbe15cf763fe25ab338c7af8aa5 on https://en.wikipedia.org/wiki/Reproducing_kernel_Hilbert_space#Integral_operators_and_Mercer.27s_theorem.
So the thing is we want to build a Mercer expansion of our kernel (the Wikipedia thing). Instead of building the kernel itself, we build a square root of it, which the paper calls a feature map (Algorithm 1: we’re building z, where z * z’ = k, instead of building k). When you do things like this, the code looks like we’re fitting stuff with basis functions instead of doing any GP thing (there are no Choleskys to compute, since we directly construct the sqrt). The last thing is we have implicitly decided that whatever kernel we’re going to work with is stationary, i.e. its true covariance only depends on the difference of its two inputs and looks like k(x - z) (here z is the second input point, not the feature map), since we’re writing it as a Fourier expansion (Theorem 1).
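If it helps to see the "square root of the kernel" idea in code, here’s a rough NumPy sketch. I’m assuming a squared-exponential kernel and the usual random-cosine feature construction, which may not match the paper’s Algorithm 1 or @flaxter’s code line for line. The names (`rff_features`, `rbf_kernel`, `lengthscale`, `n_features`) are made up for illustration.

```python
import numpy as np

def rff_features(x, n_features, lengthscale, rng):
    """Random cosine features z with z @ z.T approximating the kernel matrix.

    x has shape (n, d). Assumes a squared-exponential kernel, whose
    spectral density is Gaussian with standard deviation 1 / lengthscale.
    """
    d = x.shape[1]
    # Frequencies sampled from the kernel's spectral density
    w = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    # Random phases, uniform on [0, 2*pi)
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    # The sqrt(2 / n_features) scaling makes z @ z.T an unbiased estimate of k
    return np.sqrt(2.0 / n_features) * np.cos(x @ w + b)

def rbf_kernel(x, lengthscale):
    """Exact squared-exponential kernel matrix, for comparison only."""
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)[:, None]

z = rff_features(x, n_features=20000, lengthscale=1.0, rng=rng)
k_approx = z @ z.T
k_exact = rbf_kernel(x, lengthscale=1.0)
max_err = np.max(np.abs(k_approx - k_exact))

# An approximate GP draw is just a linear combination of the features
# with standard normal weights, so no Cholesky is needed.
beta = rng.normal(size=z.shape[1])
f_draw = z @ beta
```

`max_err` shrinks roughly like one over the square root of `n_features`, and `f_draw` is why the model code ends up looking like a basis-function regression: the weights `beta` are the parameters and the features `z` are fixed once the frequencies are drawn.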
I think this is the way to interpret things at least… Haha
I ended up getting quite a bit out of the Fasshauer/McCourt Matlab book that @flaxter linked way back at the beginning of this thread, too. Took a while though. There’s a lotta stuff.
Btw, I think it’d be fine to make a new thread with this question and just link back to old stuff. This one is starting to get really long so it’d be hard for newcomers to the forums to make heads or tails of what is happening in it.