isSpace is too slow

Yitzchak Gale gale at sefer.org
Mon May 21 04:22:22 EDT 2007


Duncan Coutts wrote:
> iswspace... We could short-cut that for ascii characters.

Also, '\t', '\n', '\v', '\f', and '\r' are contiguous (code
points 9 through 13), so one range test covers all five:

isSpace :: Char -> Bool
isSpace c =    c == ' '
            || c <= '\r' && c >= '\t'   -- '\t', '\n', '\v', '\f', '\r'
            || c == '\xa0'              -- non-breaking space
            || c > '\xff' && iswspace (fromIntegral (ord c)) /= 0

That makes 4 comparisons or fewer in the common
cases.
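
For anyone who wants to try this outside the libraries, here is a
minimal self-contained sketch. The names asciiControlSpaces,
slowIsSpace and isSpaceFast are made up for illustration, and
Data.Char.isSpace stands in for the iswspace call. That stand-in is an
assumption for testing purposes, not what a real patch would use.

```haskell
import Data.Char (ord)
import qualified Data.Char as C

-- Code points of the five ASCII control whitespace characters.
-- They should come out as the contiguous run 9..13.
asciiControlSpaces :: [Int]
asciiControlSpaces = map ord "\t\n\v\f\r"

-- Hypothetical stand-in for the slow iswspace path (an assumption):
-- here we simply defer to the library's own isSpace.
slowIsSpace :: Char -> Bool
slowIsSpace = C.isSpace

-- Same shape as the definition above. (&&) binds tighter than (||),
-- so the range test needs no parentheses.
isSpaceFast :: Char -> Bool
isSpaceFast c =    c == ' '
                || c <= '\r' && c >= '\t'
                || c == '\xa0'
                || c > '\xff' && slowIsSpace c
```

An ordinary ASCII letter fails all four leading tests and never
reaches slowIsSpace, which is where the "4 or fewer" comes from.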

If we assume that Latin-1 characters greater than '\xa0'
(which are not ASCII) occur much less often than ASCII
ones, we can put the last two tests behind a single check
against '\xa0' and cut the common case down to 3 or fewer.
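
One possible reading of that short-cut, as a sketch only: guard the
last two tests with a single comparison against '\xa0', so anything
below it is settled after three tests. The name isSpaceShortcut is
hypothetical, and Data.Char.isSpace again stands in for the iswspace
call (an assumption).

```haskell
import qualified Data.Char as C

-- Characters below '\xa0' are decided after at most three comparisons.
-- Characters above '\xa0' pay for one extra test, which is the trade
-- being made on the assumption that they are rare.
isSpaceShortcut :: Char -> Bool
isSpaceShortcut c =    c == ' '
                    || c <= '\r' && c >= '\t'
                    || c >= '\xa0' && (c == '\xa0' || c > '\xff' && slowPath c)
  where
    -- Hypothetical stand-in for the slow iswspace path.
    slowPath = C.isSpace
```

With this ordering a letter like 'a' costs three comparisons, while a
Latin-1 letter such as '\xe9' costs five.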

Note that the first space character above 255 is '\x1680'
(according to isSpace).
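
A quick way to check that claim, as a sketch: scan upward from '\x100'
and take the first character that isSpace accepts. The name
firstWideSpace is made up here, and the result depends on the Unicode
tables behind whichever isSpace is in use.

```haskell
import Data.Char (isSpace)
import Data.List (find)

-- First character above '\xff' that isSpace accepts, if any.
-- The list is lazy, so the scan stops at the first hit.
firstWideSpace :: Maybe Char
firstWideSpace = find isSpace ['\x100' ..]
```

On a base library whose isSpace follows the Unicode space separator
category, this should evaluate to Just '\x1680'.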

-Yitzchak


More information about the Glasgow-haskell-users mailing list