[Haskell-cafe] ANN: bytestring-trie 0.1.0

Sterling Clover s.clover at gmail.com
Sun Dec 21 13:17:56 EST 2008


Thanks for working on this! A nice efficient bytestring-trie is the  
sort of data structure we should have had in Haskell for some time  
now, and I'm sure I'll be giving it a great deal of use.

Regards,
Sterl.

On Dec 20, 2008, at 1:06 AM, wren ng thornton wrote:

> --------------------------------------------
> -- Announcing: bytestring-trie 0.1.0 (beta)
> --------------------------------------------
>
> An efficient finite map from (byte)strings to values.
>
> The implementation is based on big-endian patricia trees, like  
> Data.IntMap. We first trie on the Word8 elements of a  
> Data.ByteString, sharing string prefixes where possible, and then  
> trie on the big-endian bit representation of those elements.  
> Patricia trees have efficient algorithms for union and other  
> merging operations, but they're also quick for lookups and insertions.
>
>
> --------------------------------------------
> -- Future Extensions
> --------------------------------------------
>
> * I've done spot checking to make sure it works, but haven't  
> constructed an extensive test suite. Does anyone know of a good  
> data set already out there for testing corner cases? I'm sure other  
> dictionary writers have come up with one.
>
> * A Word8 is much smaller than the architecture's natural Word  
> size. Therefore it'd be more efficient for the trie on bits to read  
> off a whole Word at a time. This work is in progress, but I need  
> help from people with 64-bit and big-endian machines in order to  
> verify the code works on those architectures.
>
> * Using ByteStrings allows for trieing on any packed byte  
> representation of a value, but they're not Strings. Integration  
> with Data.ByteString.Char8, utf8-string, and utf8-light should happen.
>
> * The current implementation also only accepts strict ByteStrings.  
> When chopping up strings to share prefixes we share the smaller  
> string. For very long strings when many deletions are expected,  
> this can still hold onto more memory than necessary. Accepting lazy  
> ByteStrings or adding heuristics for when to copy prefixes instead  
> of sharing will fix this.
>
>
> --------------------------------------------
> -- Links
> --------------------------------------------
>
> Homepage:
>     http://code.haskell.org/~wren/
>
> Hackage:
> http://hackage.haskell.org/cgi-bin/hackage-scripts/package/ 
> bytestring-trie
>
> Darcs:
>     http://code.haskell.org/~wren/bytestring-trie/
>
> Haddock (Darcs version):
> http://code.haskell.org/~wren/bytestring-trie/dist/doc/html/ 
> bytestring-trie/
>
> -- 
> Live well,
> ~wren
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe



More information about the Haskell-Cafe mailing list