[Haskell-cafe] WideFinder

Sterling Clover s.clover at gmail.com
Fri Nov 9 23:26:07 EST 2007


Um... you do realize that the code is only supposed to match against  
very specific lines in sample data sets that Bray provides, right? If  
your access log doesn't have lines exactly like those (and why would  
it?) then there's no reason to expect a result.

--S

On Nov 9, 2007, at 11:19 PM, Berlin Brown wrote:

> Sterling Clover wrote:
>> I hacked together a version that I'm pretty happy with today.  
>> Started off trying an algorithm with channels and forking, then  
>> realized that in Haskell thanks to referential transparency we can  
>> get parallelism almost for free, and redid it all in  
>> Control.Parallel (below). Unfortunately, I don't have a multicore  
>> processor so I can't put this through any special paces. However,  
>> its compactness and expressively match or beat the simple Ruby,  
>> etc. scripts while it gets (theoretically) most of the parallel  
>> benefits of the enormous and unwieldy Erlang and JOcaml ones.
>>
>> --S
>>
>> module Main where
>> import qualified Data.ByteString.Lazy.Char8 as LB
>> import Data.List (foldl', unfoldr, insertBy)
>> import qualified Data.Map as M
>> import System.Environment (getArgs)
>> import Control.Parallel (par)
>> import Control.Parallel.Strategies (parMap, rwhnf)
>>
>> count :: M.Map LB.ByteString Int -> LB.ByteString -> M.Map  
>> LB.ByteString Int
>> count m line = if LB.pack "/ongoing/When" `LB.isPrefixOf` myLn  
>> then M.insertWith' (+) (LB.drop 14 myLn) 1 m else m
>> where myLn = (LB.takeWhile (/=' ') . LB.dropWhile (/='/') .  
>> LB.dropWhile (/='\"')) line
>>
>> mapUnionPar :: (Ord k, Num a) => [M.Map k a] -> M.Map k a
>> mapUnionPar m = head $ until (null . tail) mapUnionPar' m
>> where a |:| b = par a . par b $ a : b
>> mapUnionPar' (x:x':xs) = (M.unionWith (+) x x' |:| mapUnionPar' xs)
>> mapUnionPar' x = x
>>
>> newPar :: FilePath -> IO (M.Map LB.ByteString Int)
>> newPar = ((mapUnionPar . parMap rwhnf (foldl' count M.empty) .  
>> chunkify . LB.lines) `fmap`) . LB.readFile
>> where chunkify = unfoldr (\x -> if null x then Nothing else Just  
>> (splitAt 512 x))
>>
>> main = mapM_ ((print . fst . foldl' takeTop ([],[]) . M.toList  
>> =<<) . newPar) =<< getArgs
>> where takeTop ac@(bs,low) a = if null low || (snd . head) low <  
>> snd a then (splitAt 10 . insertBy ((. snd) . flip compare . snd)  
>> a) bs else ac
>>
>>
>> On Nov 7, 2007, at 9:06 AM, Bayley, Alistair wrote:
>>
>>>> From: haskell-cafe-bounces at haskell.org
>>>> [mailto:haskell-cafe-bounces at haskell.org] On Behalf Of manu
>>>>
>>>> Haskell is conspicuously absent from the languages used to  
>>>> tackle Tim
>>>
>>>> Bray's Wide Finder problem
>>>>
>>> (http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results? 
>>> updated).
>>>> So far we have Ocaml, Erlang, Python, Ruby, etc...
>>>
>>> Tim Bray mentions that GHC won't build on Solaris, so presumably  
>>> that
>>> problem would need to be solved before Haskell appears in his  
>>> table. I
>>> see that there are Solaris binary packages:
>>> http://www.haskell.org/ghc/download_ghc_661.html#sparcsolaris
>>>
>>> so perhaps he just needs to be pointed to them?
>>>
>>> Alistair
>>> *****************************************************************
>>> Confidentiality Note: The information contained in this message,
>>> and any attachments, may contain confidential and/or privileged
>>> material. It is intended solely for the person(s) or entity to
>>> which it is addressed. Any review, retransmission, dissemination,
>>> or taking of any action in reliance upon this information by
>>> persons or entities other than the intended recipient(s) is
>>> prohibited. If you received this in error, please contact the
>>> sender and delete the material from any computer.
>>> *****************************************************************
>>> _______________________________________________
>>> Haskell-Cafe mailing list
>>> Haskell-Cafe at haskell.org
>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
> You didn't do a whole lot when I tried to run it. I know I am being  
> mean, but that seems to be what Tim Bray is doing. He takes code  
> and if it doesnt work, he isn't spending 3 weeks to figure it out.
>
> So, I would just like to comment. I ran your code against an  
> access.log file and it gave me this:
>
> []
>
> ./a.out access.log
>
>



More information about the Haskell-Cafe mailing list