[Haskell-cafe] is 256M RAM insufficient for a 20 million element Int/Int map?

Thomas Hartman tphyahoo at gmail.com
Sat Oct 18 16:26:36 EDT 2008


{-# LANGUAGE BangPatterns #-}
import qualified Data.Map as M
import Debug.Trace
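import qualified Data.IntMap as IM  -- added import, used only by the tIM sketch below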
{-
I'm trying to run a HAppS web site with a large amount of data: stress
testing happstutorial.com.  Well, 20 million records doesn't sound that
large by today's standards, but anyway that's my goal for now.
I have a standard Data.Map.Map as the base structure for one of my
MACID data tables (jobs), but I noticed something that is probably
causing problems for me.  Even a simple 20-million-entry map with
Int/Int keys and values causes an out-of-memory error for me in GHCi,
on a computer with 256M of RAM.
I'm wondering if there is a data structure that might be more suitable
for large record sets.  Or do you just have to use a database, or some
sort of file-based serialization, once your records are in the
millions?  Or is this some weird subtlety of lazy evaluation, or some
other Haskell gotcha?
-}
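{- Added sketch, not from the original post: for Int keys, Data.IntMap
   (a Patricia trie) usually has a smaller per-entry footprint and
   faster lookups than Data.Map.  Same test as 't' below, using the IM
   import added above; whether it actually fits in 256M is untested. -}
tIM = (IM.! size) . IM.fromList . map (\i -> (i, i)) $ [1..size]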

size = 2 * 10^7

-- out of memory error
t = (M.! size) . myFromList . map (\i->(i,i)) $ [1..size]
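{- Added estimate, not part of the original post (approximate, and it
   assumes a 32-bit GHC): each interior Data.Map node is a Bin
   constructor holding an unpacked size, a boxed key, a boxed value and
   two subtree pointers, so an Int/Int entry costs on the order of
   8 machine words once the boxed Int is counted. -}
approxHeapBytes = size * 8 * 4  -- ~8 words/entry * 4 bytes/word, ~640 MB
{- GHC's copying collector then needs extra headroom on top of that, so
   256M of RAM cannot hold this Map regardless of laziness. -}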

-- Lists are no problem
{-
*Main> :! time ghc -e tL testMap.hs
(20000000,20000000)
3.38user 0.09system 0:03.53elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
-}
tL = (!! (size-1)) . map (\i->(i,i)) $ [1..size]

-- sanity check of M.\\ (map difference): keeps the keys of the first
-- map that are absent from the second, so keys 1..5 remain here
t2 = (M.fromList . map (\i->(i,i)) $ [1..10] )
       M.\\  (M.fromList . map (\i->(i,i)) $ [6..15])


-- does this evaluate all of the list l, or just force it to WHNF?
myFromList (!l) = M.fromList l
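{- Added demo, not from the original post: a bang pattern forces its
   argument only to WHNF, i.e. it exposes the outermost constructor
   ((:) or []) and nothing more; neither the elements nor the rest of
   the spine are evaluated.  'whnfOnly' is a throwaway helper just for
   illustration. -}
whnfOnly :: [(Int, Int)] -> Int
whnfOnly (!l) = 0   -- the bang forces l to WHNF, then we ignore it

whnfCheck1 = whnfOnly undefined                -- throws: no WHNF for bottom
whnfCheck2 = whnfOnly (undefined : undefined)  -- returns 0: the (:) suffices
{- So the bang in myFromList only checks that the list is non-bottom;
   it is M.fromList itself that then walks the whole spine. -}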

