[Haskell-cafe] Re: String Hashing

apfelmus apfelmus at quantentunnel.de
Tue Jun 19 04:28:20 EDT 2007


Thomas Conway wrote:
> On 6/19/07, apfelmus <apfelmus at quantentunnel.de> wrote:
>>  Trie it is,
>>  not balanced tree.
>>  A logarithm in this
>>  would be new to me. :)
> 
> True enough, my braino.

So, accessing a key in a trie is O(key size in bits), not much different
from a hash table.

>> As a side node, Mr. Exp says: 64 is large enough for the size needs of
>> any logarithm.
> 
> Que?

A pun(y) formulation of the fact that (log n <= 64) for (almost) any
finite map situation you'll ever encounter because 2^64 = 16 Exabyte.

>> In any case, you could be able to set up an infinite trie and have lazy
>> evaluation allocate space as needed:
>>
>>  type Trie a = Branch (TVar a) (Trie a) (Trie a)
> 
> Tree-like structure's are quite hostile to highly concurrent manipulation.
> 
> It helps to introduce TVar indirections at each level:
> 
> data Trie a = Branch (TVar a) (TVar (Trie a)) (TVar (Trie a))
> 
> Then you can update a subtree without having to modify the spine of the
> tree.

What I wanted to point out is that the spine simply doesn't need to be
modified at all. In other words, you have an infinite tree residing in
memory and insertion/deletion happens by updating the TVars that hold
the values (and probably should be of type  TVar (Maybe a)  then).

Of course, there can't be infinite trees in memory :) but lazy
evaluation makes it appear as if. In fact, every branch will be modified
at most once before the first read and subsequent accesses are
read-only. I don't know whether or how this is implemented in concurrent
GHC at the moment, but I think that this can safely be implemented with
a lock whereas dead-lock means that the program denotes _|_ anyway.

Regard,
apfelmus



More information about the Haskell-Cafe mailing list