[Haskell-cafe] Very fast searching of byte strings

ChrisK haskell at list.mightyreason.com
Fri Aug 17 10:55:46 EDT 2007


Post the the library mailing list at [1] is the Boyer-Moore algorithm
implemented for strict and lazy bytestrings (and combinations thereof).  It
finds all the overlapping instances of the pattern inside the target.

I have performance tuned it.  But the performance for searching a strict
bytestring is better then for a lazy bytestring (even if they only had a single
strict chunk), which almost certainly means I was not clever enough to get GHC
to produce the optimal code.

There is much more description in the module's haddock header.

Hopefully Don or other ByteString experts/maintainers can tweak this even further.

Also at [1] is a Knuth-Morris-Pratt module which find non-overlapping
instances and is slightly slower on my benchmarks.

Happy Searching,
  Chris Kuklewicz

[1] http://www.haskell.org/pipermail/libraries/2007-August/007987.html



More information about the Haskell-Cafe mailing list