[Haskell-cafe] Space questions about intern and sets

John Meacham john at repetae.net
Fri Jun 3 16:45:54 EDT 2005


On Fri, Jun 03, 2005 at 04:02:09PM +0100, Duncan Coutts wrote:
> On Fri, 2005-06-03 at 10:53 +0200, Gracjan Polak wrote:
> > As intern behaves like id and does not have any side effects, I thought 
> > its interface should be purely functional. But I do not see any way to 
> > do it :( I'll end up with a monad, probably.
> 
> > On a related question: does anybody here have experience/benchmarks/tests 
> > showing whether PackedString is better (uses less memory) than String in 
> > parsing tasks?
> 
> GHC itself uses a rather low level thing it calls FastString which is
> basically a pointer into a character array with a length and a unique
> id. The unique ids are allocated by entering each FastString into a
> global hash table which also provides sharing if the same string is seen
> more than once (like your intern feature).
> 
> It is all very low level and ghc-specific however and probably only
> makes sense in a compiler-like application.
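
To make the scheme described above concrete, here is a rough sketch of an
intern table that hands out unique ids and shares repeated strings. It is
only an illustration: the names Atom, Table, newTable and intern are made
up here and are not the API of GHC's FastString, and it uses a Data.Map
inside an IORef rather than a real hash table over a character array.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')
import qualified Data.Map.Strict as Map

-- Hypothetical names throughout; not GHC's FastString interface.

-- An interned string is represented only by its unique id, so
-- comparing two atoms is a single Int comparison.
data Atom = Atom {-# UNPACK #-} !Int
  deriving (Eq, Ord, Show)

-- The table remembers the next free id and every string seen so far.
data Table = Table
  { nextId :: !Int
  , seen   :: !(Map.Map String Atom)
  }

newTable :: IO (IORef Table)
newTable = newIORef (Table 0 Map.empty)

-- Return the existing atom for a string, or allocate a fresh id.
-- The interface is monadic because the table is mutable state.
intern :: IORef Table -> String -> IO Atom
intern ref s = atomicModifyIORef' ref step
  where
    step t@(Table n m) =
      case Map.lookup s m of
        Just a  -> (t, a)
        Nothing ->
          let a = Atom n
          in (Table (n + 1) (Map.insert s a m), a)
```

Interning the same string twice through one table gives back equal atoms;
distinct strings get distinct atoms.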

jhc has something very similar in its Atom and PackedString modules. The
advantage is that it always stores strings in UTF8, so the type is a
CPR type rather than a union and hence can be optimized much better (in
particular, it can be {-# UNPACK #-}ed). I have not done any formal
comparisons, though. darcs also has its own similar thing, which I believe
is faster but uses FFI calls to C code rather than being pure
ghc-haskell. 
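
As an illustration of the unpacking point, here is a minimal sketch of a
single-constructor packed string used as a strict, unpacked field. The
types Packed and Ident and the helpers fromAscii and byteLength are
hypothetical and are not jhc's actual definitions; the sketch assumes the
bytestring package and only handles ASCII input, where the bytes already
are valid UTF8.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical type: the bytes are always UTF8, so a single
-- constructor is enough and no tag is needed to pick an encoding.
data Packed = Packed {-# UNPACK #-} !B.ByteString
  deriving (Eq, Ord, Show)

-- Because Packed has exactly one constructor, a strict field of that
-- type can be unpacked into the enclosing constructor.
data Ident = Ident
  { identName   :: {-# UNPACK #-} !Packed
  , identUnique :: {-# UNPACK #-} !Int
  }

-- Assumption: the input is ASCII only. Char8.pack truncates every
-- character to 8 bits, so it is not a general UTF8 encoder.
fromAscii :: String -> Packed
fromAscii = Packed . BC.pack

-- Length in bytes, which equals the character count only for ASCII.
byteLength :: Packed -> Int
byteLength (Packed bs) = B.length bs
```

The unpacking itself is an optimization GHC applies when compiling with
optimization enabled; the program behaves the same without it.
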
        John

-- 
John Meacham - ⑆repetae.net⑆john⑈ 

