[Haskell-cafe] Re: Suspected stupid Haskell Question

Thomas Hartman thomas.hartman at db.com
Wed Oct 17 21:48:28 EDT 2007


Since I'm interested in the stack overflow issue, and getting acquainted 
with quickcheck, I thought I would take this opportunity to compare your 
ordTable with some code Yitzchak Gale posted earlier, against Ham's 
original problem.

As far as I can tell, they're the same. They work on lists up to 100000 
element lists of strings, but on 10^6 size lists I lose patience waiting 
for them to finish. 

Is there a more scientific way of figuring out if one version is better 
than the other by using, say profiling tools?

Or by reasoning about the code?

t.

****************************************

import Data.List
import qualified Data.Map as M
import Control.Arrow
import Test.QuickCheck
import Test.GenTestData
import System.Random

{-
Is there a library function to take a list of Strings and return a list of
ints showing how many times each String occurs in the list.

So for example:

["egg", "egg", "cheese"] would return [2,1] 
-}

testYitzGale n = do
  l <- rgenBndStrRow (10,10) (10^n,10^n)  -- 100000 strings, strings are 
10 chars long, works. craps out on 10^6.
  m <- return $ freqFold l 
  putStrLn $ "map items: " ++ ( show $ M.size m )

testCScherer n = do
  l <- rgenBndStrRow (10,10) (10^n,10^n)  -- same limitations as yitz gale 
code.
  m <- return $ ordTable l 
  putStrLn $ "items: " ++ ( show $ length m )


-- slow for big lists
--freqArr = Prelude.map ( last &&& length ) . group . sort

-- yitz gale code. same as chad scherer code? it's simpler to understand, 
but is it as fast?
freqFold :: [[Char]] -> M.Map [Char] Int
freqFold = foldl' g M.empty
  where g accum x = M.insertWith' (+) x 1 accum
-- c scherer code. insists on ord. far as I can tell, same speed as yitz.
ordTable :: (Ord a) => [a] -> [(a,Int)]
ordTable xs = M.assocs $! foldl' f M.empty xs
    where f m x = let  m' = M.insertWith (+) x 1 m
                       Just v = M.lookup x m'
                  in v `seq` m'


l = ["egg","egg","cheese"]

-- other quickcheck stuff
--prop_unchanged_by_reverse = \l -> ( freqArr (l :: [[Char]]) ) == ( 
freqArr $ reverse l )
--prop_freqArr_eq_freqFold = \l -> ( freqArr (l :: [[Char]]) == (freqFold 
l))
--test1 = quickCheck prop_unchanged_by_reverse
--test2 = quickCheck prop_freqArr_eq_freqFold

--------------- generate test data: 
genBndStrRow (minCols,maxCols) (minStrLen, maxStrLen) = rgen ( genBndLoL 
(minStrLen, maxStrLen) (minCols,maxCols) )

gen gen = do
  sg <- newStdGen
  return $ generate 10000 sg gen

-- generator for a list with length between min and max
genBndList :: Arbitrary a => (Int, Int) -> Gen [a]
genBndList (min,max) = do
  len <- choose (min,max)
  vector len


-- lists of lists
--genBndLoL :: (Int, Int) -> (Int, Int) -> Gen [[a]]
genBndLoL (min1,max1) (min2,max2) = do
  len1 <- choose (min1,max1)
  len2 <- choose (min2,max2)
  vec2 len1 len2

--vec2 :: Arbitrary a => Int -> Int -> Gen [[a]]
vec2 n m = sequence [ vector m | i <- [1..n] ]





Chad Scherrer <chad.scherrer at gmail.com> 
Sent by: haskell-cafe-bounces at haskell.org
10/17/2007 01:35 PM

To
haskell-cafe at haskell.org
cc

Subject
[Haskell-cafe] Re: Suspected stupid Haskell Question






Big_Ham <joymachine2001 <at> hotmail.com> writes:

> 
> 
> Is there a library function to take a list of Strings and return a list 
of
> ints showing how many times each String occurs in the list.
> 
> So for example:
> 
> ["egg", "egg", "cheese"] would return [2,1]
> 
> I couldn't find anything on a search, or anything in the librarys.
> 
> Thanks BH.

Hi BH,

This might be overkill, but it works well for me. And it avoid stack 
overflows I
was originally getting for very large lists. Dean Herrington and I came up 
with
this:

ordTable :: (Ord a) => [a] -> [(a,Int)]
ordTable xs = Map.assocs $! foldl' f Map.empty xs
    where f m x = let  m' = Map.insertWith (+) x 1 m
                       Just v = Map.lookup x m'
                  in v `seq` m'

intTable :: [Int] -> [(Int,Int)]
intTable xs = IntMap.assocs $! foldl' f IntMap.empty xs
    where f m x = let  m' = IntMap.insertWith (+) x 1 m
                       Just v = IntMap.lookup x m'
                  in v `seq` m'

enumTable :: (Enum a) => [a] -> [(a,Int)]
enumTable = map fstToEnum . intTable . map fromEnum
    where fstToEnum (x,y) = (toEnum x, y)

If you like, it's easily wrapped in a Table class.

Chad




_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe at haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe



---

This e-mail may contain confidential and/or privileged information. If you 
are not the intended recipient (or have received this e-mail in error) 
please notify the sender immediately and destroy this e-mail. Any 
unauthorized copying, disclosure or distribution of the material in this 
e-mail is strictly forbidden.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20071017/d9a93c99/attachment-0001.htm


More information about the Haskell-Cafe mailing list