Personal tools

Cookbook/Other data structures

From HaskellWiki

< Cookbook(Difference between revisions)
Jump to: navigation, search
 
(smaller headlines)
Line 1: Line 1:
 
GHC comes with some handy data-structures by default. If you want to use a Map, use [http://hackage.haskell.org/packages/archive/containers/latest/doc/html/Data-Map.html Data.Map]. For sets, you can use Data.Set. A good way to find efficient data-structures is to take a look at the hierarchical libraries, see [http://haskell.org/ghc/docs/latest/html/libraries/index.html Haskell Hierarchical Libraries] and scroll down to 'Data'.
 
GHC comes with some handy data-structures by default. If you want to use a Map, use [http://hackage.haskell.org/packages/archive/containers/latest/doc/html/Data-Map.html Data.Map]. For sets, you can use Data.Set. A good way to find efficient data-structures is to take a look at the hierarchical libraries, see [http://haskell.org/ghc/docs/latest/html/libraries/index.html Haskell Hierarchical Libraries] and scroll down to 'Data'.
   
= Map =
+
== Map ==
   
 
A naive implementation of a map would be using a list of tuples in the form of (key, value). This is used a lot, but has the big disadvantage that most operations take O(n) time.
 
A naive implementation of a map would be using a list of tuples in the form of (key, value). This is used a lot, but has the big disadvantage that most operations take O(n) time.
Line 22: Line 22:
 
Map is often imported <hask>qualified</hask> to avoid name-clashing with the Prelude. See [[Import]] for more information.
 
Map is often imported <hask>qualified</hask> to avoid name-clashing with the Prelude. See [[Import]] for more information.
   
= Set =
+
== Set ==
   
 
TODO
 
TODO
   
= Tree =
+
== Tree ==
   
 
TODO
 
TODO
   
= ByteString =
+
== ByteString ==
   
 
TODO
 
TODO

Revision as of 11:31, 9 July 2009

GHC comes with some handy data-structures by default. If you want to use a Map, use Data.Map. For sets, you can use Data.Set. A good way to find efficient data-structures is to take a look at the hierarchical libraries, see Haskell Hierarchical Libraries and scroll down to 'Data'.

Contents

1 Map

A naive implementation of a map would be using a list of tuples in the form of (key, value). This is used a lot, but has the big disadvantage that most operations take O(n) time.

Using Data.Map we can construct a fast map using this data-structure:

import qualified Data.Map as Map
 
myMap :: Map.Map String Int
myMap = Map.fromList [("alice", 111), ("bob", 333), ("douglas", 42)]

We can then do quick lookups:

bobsPhone :: Maybe Int
bobsPhone = Map.lookup "bob" myMap
Map is often imported
qualified
to avoid name-clashing with the Prelude. See Import for more information.

2 Set

TODO

3 Tree

TODO

4 ByteString

TODO

5 Arrays

Arrays are generally eschewed in Haskell. However, they are useful if you desperately need constant lookup or update or if you have huge amounts of raw data.

Immutable arrays like
Data.Array.IArray.Array i e
offer lookup in constant time but they get copied when you update an element. Use them if they can be filled in one go. The following example groups a list of numbers according to their residual after division by
n
in one go.
bucketByResidual :: Int -> [Int] -> Array Int [Int]
bucketByResidual n xs = accumArray (\xs x -> x:xs) [] (0,n-1) [(x `mod` n, x) | x <- xs]
 
Data.Arra.IArray> bucketByResidual 4 [x*x | x <- [1..10]]
array (0,3) [(0,[100,64,36,16,4]),(1,[81,49,25,9,1]),(2,[]),(3,[])]
 
Data.Arra.IArray> amap reverse it
array (0,3) [(0,[4,16,36,64,100]),(1,[1,9,25,49,81]),(2,[]),(3,[])]

Note that the array can fill itself up in a circular fashion. Useful for dynamic programming. Here is the Edit distance between two strings without array updates.

editDistance :: Eq a => [a] -> [a] -> Int
editDistance xs ys = table ! (m,n)
    where
    (m,n) = (length xs, length ys)
    x     = array (1,m) (zip [1..] xs)
    y     = array (1,n) (zip [1..] ys)
 
    table :: Array (Int,Int) Int
    table = array bnds [(ij, dist ij) | ij <- range bnds]
    bnds  = ((0,0),(m,n))
 
    dist (0,j) = j
    dist (i,0) = i
    dist (i,j) = minimum [table ! (i-1,j) + 1, table ! (i,j-1) + 1,
        if x ! i == y ! j then table ! (i-1,j-1) else 1 + table ! (i-1,j-1)]


Mutable arrays like
Data.Array.IO.IOArray i e
are updated in place, but they have to live in the IO-monad or the ST-monad in order to not destroy referential transparency. There are also diff arrays like
Data.Array.Diff.DiffArray i e
that look like immutable arrays but do updates in place if used in a single threaded way. Here is depth first search with diff arrays that checks whether a directed graph contains a cycle. Note: this example really belongs to Map or Set.
import Control.Monad.State
type Node  = Int
data Color = White | Grey | Black 
 
hasCycle :: Array Node [Node] -> Bool
hasCycle graph = runState (mapDfs $ indices g) initSeen
    where
    initSeen :: DiffArray Node Color
    initSeen  = listArray (bounds graph) (repeat White)
    mapDfs    = fmap or . mapM dfs
    dfs node  = get >>= \seen -> case (seen ! node) of
        Black -> return False
        Grey  -> return True  -- we found a cycle
        White -> do
            modify $  \seen -> seen // [(node,Grey )]
            found  <- mapDfs (graph ! node)
            modify $  \seen -> seen // [(node,Black)]
            return found