[Haskell] Speed of ByteString.Lazy

Robby Findler robby at cs.uchicago.edu
Thu Jun 29 14:32:02 EDT 2006


Just out of curiosity, did you try "wc -l"?

Robby

On Jun 29, 2006, at 1:18 PM, Chad Scherrer wrote:

> I have a bunch of data files where each line represents a data  
> point. It's nice to be able to quickly tell how many data points I  
> have. I had been using wc, like this:
>
> % cat *.txt | /usr/bin/time wc
> 2350570 4701140 49149973
> 5.81user 0.03system 0:06.08elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (152major+18minor)pagefaults 0swaps
>
> I only really care about the line count and the time it takes. For  
> larger data sets, I was getting tired of waiting for wc, and I  
> wondered whether ByteString.Lazy could help me do better. So I  
> wrote a 2-liner:
>
> import qualified Data.ByteString.Lazy.Char8 as L
> main = L.getContents >>= print . L.count '\n'
>
> ... and compiled this as "lc". It doesn't get much simpler than  
> that. How does it perform?
>
> % cat *.txt | /usr/bin/time lc
> 2350570
> 0.09user 0.13system 0:00.24elapsed 89%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (199major+211minor)pagefaults 0swaps
>
> Wow. About 64 times as fast in user time for this run (roughly 25
> times in elapsed time), with almost no effort on my part. Granted,
> wc is doing more work, but the word and character counts aren't
> interesting to me in this case anyway. I can't imagine
> (implementation time)*(execution time) being much shorter.
> Thanks, Don!
>
> -- 
>
> Chad Scherrer
>
> "Time flies like an arrow; fruit flies like a banana" -- Groucho Marx
> _______________________________________________
> Haskell mailing list
> Haskell at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell
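
For reference, here is a minimal sketch of the quoted line counter with the counting step factored out and type signatures added. It assumes the bytestring package's Data.ByteString.Lazy.Char8 module; the name countLines is hypothetical and does not appear in the original message. Like "wc -l", it counts newline characters, so a final line with no trailing newline is not counted.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Int (Int64)

-- Hypothetical helper (not in the original message): counts newline
-- characters, which is also what "wc -l" reports.
countLines :: L.ByteString -> Int64
countLines = L.count '\n'

main :: IO ()
main = do
  -- L.getContents reads stdin lazily, chunk by chunk, so the whole
  -- input does not have to be held in memory at once.
  input <- L.getContents
  print (countLines input)
```

Compiled as "lc", it would be used the same way as in the quoted run, e.g. "cat *.txt | ./lc". A fair timing comparison would be against "wc -l" rather than plain "wc", since both then do the same amount of work.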


