[Haskell-cafe] Speedy parsing

Tillmann Rendel rendel at rbg.informatik.tu-darmstadt.de
Thu Jul 19 20:47:52 EDT 2007


Re, Joseph (IT) wrote:
> At this point I'm out of ideas, so I was hoping someone could identify
> something stupid I've done (I'm still novice of FP in general, let alone
> for high performance) or direct me to a guide,website,paper,library, or
> some other form of help.

Two ideas about your approaches:

(1) Try to avoid explicit recursion by using standard library
functions instead. It's easier (once you have learned the library) and
may be faster (since the library may be written in an easy-to-optimize
style).

(2) Try lazy ByteStrings; they should be faster.

   http://www.cse.unsw.edu.au/~dons/fps.html
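
A small sketch of both ideas together, not taken from your code: the
names countRec, countLib and keyCounts are made up for illustration.
countRec walks the list by hand, countLib does the same job with
Map.fromListWith, and keyCounts feeds it from a lazy ByteString.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B
import qualified Data.Map as Map

-- Idea (1), before: explicit recursion with an accumulator.
countRec :: [B.ByteString] -> Map.Map B.ByteString Int
countRec = go Map.empty
  where
    go acc []       = acc
    go acc (k : ks) = go (Map.insertWith (+) k 1 acc) ks

-- Idea (1), after: the same result from a library function.
countLib :: [B.ByteString] -> Map.Map B.ByteString Int
countLib = Map.fromListWith (+) . map (\k -> (k, 1))

-- Idea (2): take the input as a lazy ByteString, split it into lines,
-- and count the text before the first '=' on each non-empty line.
-- (Hypothetical input format: one key=value pair per line.)
keyCounts :: B.ByteString -> Map.Map B.ByteString Int
keyCounts = countLib
          . map (B.takeWhile (/= '='))
          . filter (not . B.null)
          . B.lines
```

In GHCi, Map.toList (keyCounts (B.pack "a=1\nb=2\na=3\n")) should list
"a" with count 2 and "b" with count 1. Whether this is actually faster
for your data is something only a measurement can tell.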

As an example, sorting the key=value fields within each line of a csv
file by key. csv parses the format, uncsv produces it. These functions
can't handle '=' in the key or ',' in the key or value. treesort sorts
by inserting the pairs into a map and reading them back in ascending
key order:

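> import qualified Data.ByteString.Lazy.Char8 as B
> import qualified Data.Map as Map
> import Control.Arrow (second)
> 
> -- Split the input into lines, lines into fields, fields into (key, value).
> csv = (map $ map $ second B.tail . B.break (== '=')) .
>       (map $ B.split ',') .
>       (B.split '\n')
> 
> -- Inverse of csv: rejoin pairs with '=', fields with ',', lines with '\n'.
> uncsv = (B.intercalate $ B.pack "\n") .
>         (map $ B.intercalate $ B.pack ",") .
>         (map $ map $ \(key, val) -> B.concat [key, B.pack "=", val])
> 
> -- Sort by key: build a Map, then read it back in ascending key order.
> treesort = Map.toAscList . Map.fromList
> 
> -- Read stdin, sort the fields of each line by key, write stdout.
> main = B.interact $ uncsv . map treesort . csv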
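
To see what the pipeline does to a concrete input, here is a pure
restatement with type signatures that can be tried in GHCi. It is only
a sketch: the names Field, parseLine, renderLine and sortFields are
mine, and it uses B.drop 1 where the code above uses B.tail, so a field
without '=' gets an empty value instead of raising an error.

```haskell
import Control.Arrow (second)
import qualified Data.ByteString.Lazy.Char8 as B
import qualified Data.Map as Map

type Field = (B.ByteString, B.ByteString)

-- One line of "k1=v1,k2=v2" becomes a list of (key, value) pairs.
parseLine :: B.ByteString -> [Field]
parseLine = map (second (B.drop 1) . B.break (== '=')) . B.split ','

-- Inverse of parseLine.
renderLine :: [Field] -> B.ByteString
renderLine = B.intercalate (B.pack ",")
           . map (\(key, val) -> B.concat [key, B.pack "=", val])

-- Sort the fields of every line by key, keeping the lines in order.
sortFields :: B.ByteString -> B.ByteString
sortFields = B.intercalate (B.pack "\n")
           . map (renderLine . Map.toAscList . Map.fromList . parseLine)
           . B.split '\n'
```

For example, sortFields (B.pack "b=2,a=1\nz=26,y=25") should give
"a=1,b=2\ny=25,z=26". Since the sort goes through Map.fromList, a key
that occurs twice on one line keeps only its last value.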

   Tillmann

