Announcing Very Fast Searching of ByteStrings

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Fri Aug 17 10:19:50 EDT 2007


On Fri, 2007-08-17 at 14:49 +0100, ChrisK wrote:
> Attached is the Boyer-Moore algorithm implemented for strict and lazy
> bytestrings (and combinations thereof).  It finds all the overlapping instances
> of the pattern inside the target.

Fantastic.

> I have performance tuned it. 

Even better :-)

> But the performance for searching a strict bytestring is better then
> for a lazy bytestring (even if they only had a single strict chunk),
> which almost certainly means I was not clever enough to get GHC to
> produce the optimal code.

Lazy bytestrings have one more indirection to get at the actual data.
I'm planning to eliminate this extra indirection at some point so it'll
be interesting to see if that help searching significantly.

> There is much more description in the module's haddock header.
> 
> Hopefully Don or other ByteString experts/maintainers can tweak this even further.

Yes, I'll look at getting this into the bytestring package (though not
for about a week as I'm away).

Duncan



More information about the Libraries mailing list