I implemented generalized linear models on GPU. Some discussion of GPU GLMs, along with speedup graphs, is in issue 1184. Then the idea came up that the GPU implementation could use `float` calculations instead of `double` to be even faster. As a prototype I implemented it with `float`s. Here are the speedups compared to the `double` implementation (K is the number of attributes and N is the number of instances):
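To make the precision switch concrete, here is a minimal CUDA-style sketch of a kernel templated on the scalar type, so the same code compiles for `double` and `float`. All names (`glm_logpost_terms`, `X`, `beta`) are hypothetical and the Gaussian likelihood is chosen only for brevity; this is not the actual implementation:

```cuda
#include <cuda_runtime.h>

// Templated on the scalar type T so the same kernel compiles for both
// precisions; T = float halves global-memory traffic and runs on the
// GPU's much faster single-precision units.
template <typename T>
__global__ void glm_logpost_terms(const T* __restrict__ X,     // N x K design matrix, row-major
                                  const T* __restrict__ beta,  // K coefficients
                                  const T* __restrict__ y,     // N responses
                                  T* __restrict__ out,         // N per-instance terms
                                  int N, int K) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= N) return;
  T eta = T(0);
  for (int k = 0; k < K; ++k)          // linear predictor eta_i = x_i . beta
    eta += X[(size_t)i * K + k] * beta[k];
  T r = y[i] - eta;                    // Gaussian GLM term, used only as an example
  out[i] = T(-0.5) * r * r;            // per-instance terms, summed in a reduction
}
```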
An important thing to consider is also the numerical error due to reduced precision, so I compared both GPU implementations with the `double` CPU implementation. The next graph shows the maximum relative error among the logposterior and all its derivatives. I generated three test cases for each size; `y` is generated between 0 and 100, all other inputs between -1 and 1.
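For concreteness, here is a sketch of how such a comparison could be set up; `max_rel_error` and `unif` are hypothetical helpers, not the actual benchmark code:

```cuda
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <vector>

// Maximum relative error of the GPU results (logposterior followed by all
// derivatives) against the double-precision CPU reference.
double max_rel_error(const std::vector<double>& cpu_ref,
                     const std::vector<float>&  gpu_res) {
  double worst = 0.0;
  for (std::size_t i = 0; i < cpu_ref.size(); ++i) {
    double denom = std::max(std::abs(cpu_ref[i]), 1e-300);  // guard against 0/0
    worst = std::max(worst, std::abs(cpu_ref[i] - gpu_res[i]) / denom);
  }
  return worst;
}

// Test inputs as described above: y uniform on [0, 100],
// everything else uniform on [-1, 1].
double unif(double lo, double hi) {
  return lo + (hi - lo) * (std::rand() / (double)RAND_MAX);
}
```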
Now the question is: do we want GPU implementations of GLMs to use `float` instead of `double`?
EDIT: added a version with Kahan summation.
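For reference, Kahan (compensated) summation carries a correction term for the low-order bits lost in each addition, which largely restores `float` accumulation accuracy at the cost of a few extra operations per element. A minimal device-side sketch (hypothetical helper, not the code from the branch):

```cuda
// Compensated summation: c accumulates the low-order bits that each
// addition would otherwise lose.
template <typename T>
__device__ T kahan_sum(const T* vals, int n) {
  T sum = T(0);
  T c   = T(0);            // running compensation
  for (int i = 0; i < n; ++i) {
    T y = vals[i] - c;     // apply the compensation from the previous step
    T t = sum + y;         // big + small: low-order bits of y are lost here...
    c = (t - sum) - y;     // ...and recovered here
    sum = t;
  }
  return sum;
}
```

One caveat: compilers that reassociate floating-point expressions can optimize the compensation away, so fast-math style flags need care around code like this.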