[Haskell-beginners] IPerf in Haskell doesn't perform

Bob Ippolito bob at redivi.com
Thu Nov 7 22:55:17 UTC 2013


I took your code and made a few style cleanups so that hlint wouldn't
complain and so that it would be easier to read. Then I read through the
code and found where your code had a space leak (repeated in the client and
server code). I haven't put any effort into benchmarking, profiling, etc.,
just cleaning it up and fixing the algorithmic issues that I noticed.

Here's the current version: https://gist.github.com/etrepum/7362646
You can see the revisions here:
https://gist.github.com/etrepum/7362646/revisions

The space leak is a strictness issue that is common for Haskell beginners
to do. Before bothering to read any more of this message, I highly
recommend reading this section of Parallel and Concurrent Programming in
Haskell. It has great coverage on how Haskell's evaluation works:
http://chimera.labs.oreilly.com/books/1230000000929/ch02.html#sec_par-eval-whnf

Here's a minimal-ish example of what your code ends up doing:

sum :: [Int] -> Int
sum = sum' 0

sum' :: Int -> [Int] -> Int
sum' acc (x:xs) = sum' (acc + x) xs
sum' acc _ = acc

Why does this code have a space leak? Because `acc` is never forced. When
sum' recurses, the first argument is a thunk `x + acc` rather than the
value of that computation. This is a space leak because the value of `x +
acc` can be stored in constant space (an Int) but the nested thunks takes
up linear space (a thunk that references `x` and `acc` which is a thunk
itself except for the initial 0). The stack overflow happens because linear
stack space and computation is required to evaluate this thunk, which your
code does at the end when it prints the value. There are many ways to fix
this, one way without resorting to BangPatterns would be to use `seq` to
ensure that acc is evaluated before it is used.

sum :: [Int] -> Int
sum = sum' 0

sum' :: Int -> [Int] -> Int
sum' acc _ | seq acc False = undefined
sum' acc (x:xs) = sum' (acc + x) xs
sum' acc _ = acc

The BangPatterns method used in my cleanup is basically syntax sugar for
the above, but it is not currently part of the Haskell standard so a
LANGUAGE pragma is required to use it.

-bob


On Thu, Nov 7, 2013 at 7:40 AM, Thomas Bach <thbach at students.uni-mainz.de>wrote:

> Hi list,
>
> I had to do a little benchmarking of a network connection between two
> servers and I thought “Well, Haskell seemed pretty awesome, let's use it
> to accomplish the task.” So I spent a couple of days with the Haskell
> library and finally got the following (working!) code (see below). Now
> my questions:
>
> 1) Although being a beautiful language, my code seems to be ugly. Any
> hints to improve this? My problem is (I think) that I simply cannot get
> rid of the IO Monad and the only way I know how to deal with this is by
> using lots of 'do's which doesn't help at all to get rid of the monads.
> (Probably the imperative background problem…) :)
>
> 2) The program I wrote doesn't perform. It sended around 33 MBytes in 10
> seconds over a connection where iperf (the one packaged by Ubuntu,
> written in C) was able send 650 MBytes. Where can I improve performance?
> I'm especially interested in the low hanging fruits, which are easy to
> understand as a Haskell novice. ;)
>
> 3) When I run this program, I get the following Error (both, on the
> client, as well as on the server):
>
> $./iperf -c localhost
> "Running as client connecting to localhost"
> Stack space overflow: current size 8388608 bytes.
> Use `+RTS -Ksize -RTS' to increase it.
>
> enlarging the stack space avoids it, but how can I avoid it in the code?
> I guess that the stack space overflow comes from the recursive call to
> receiveData on the server and sendIntermediateAndFinalData in the
> client. I thought that in the way I implemented it, these functions are
> tailor recursive and therefor not prone to stack overflows!?! Anyway, I
> didn't check that, so maybe the error is somewhere else…
>
> Regards,
>         Thomas Bach.
>
> PS: This is the code pasted “As Quotation”. I couldn't find another way
> to get the code into Thunderbird w/o messing up line breaks and the
> like. I additionally added the code as an attachment. I hope this is ok
> – time to get a new MUA!
>
>
> > import Data.Time (DiffTime, utctDayTime, getCurrentTime)
> > import Foreign (ForeignPtr, Int64, Ptr, mallocForeignPtr, withForeignPtr)
> > import Foreign.Storable (sizeOf, peek, poke)
> > import GHC.IO.Handle (Handle, BufferMode(NoBuffering), hClose, hGetBuf,
> hSetBuffering, hPutBuf)
> > import Network (Socket, HostName, PortID(PortNumber), PortNumber,
> accept, connectTo, listenOn, withSocketsDo)
> > import System.Console.GetOpt (ArgOrder(Permute), ArgDescr(..),
> OptDescr(..), getOpt, usageInfo)
> > import System.Environment (getArgs)
> >
> > data Flag = Client  String | Server
> >           deriving Show
> >
> > -- Command line arguments for server and client mode.
> >
> > options :: [OptDescr Flag]
> > options =
> >   [ Option ['c'] ["client"] (ReqArg Client "HOST") "Connect to host as
> client."
> >   , Option ['s'] ["server"] (NoArg Server) "Run as server."
> >   ]
> >
> > perfOpts :: [String] -> IO [Flag]
> > perfOpts argv =
> >   case getOpt Permute options argv of
> >     ([], [], []) -> ioError (userError ("At least one Option is needed."
> ++ usageInfo "" options))
> >     (o, [], []) -> return o
> >     (_, _, errs) -> ioError (userError (concat errs ++ usageInfo ""
> options))
> >
> > -- Some constants
> >
> > port = PortNumber 8456
> > type Trans = Int64
> > numBytes = sizeOf (0 :: Trans)
> > inital = -128 :: Trans
> > intermediate = 0 :: Trans
> > final = 127 :: Trans
> >
> > makePtr = mallocForeignPtr :: IO (ForeignPtr Trans)
> >
> > -- The client connects to the server, initializes the connection
> > -- by sending initial, sends then for the amount of at least
> > -- 10 secs as much intermediate (0's) as possible and finally
> > -- sends final.
> >
> > runClient :: String -> IO ()
> > runClient host = do
> >   print ("Running as client connecting to " ++ host)
> >   hdl <- initClient host port
> >   ptr <- makePtr
> >   fillInitial ptr
> >   withForeignPtr ptr (sendBuf hdl)
> >   fillIntermediate ptr
> >   curTime <- fmap utctDayTime getCurrentTime
> >   (sentBytes, time) <- sendIntermediateAndFinal ptr hdl curTime 10 0
> >   print ("Sent " ++ show sentBytes ++ " Bytes in " ++ show time ++ "
> seconds.")
> >   hClose hdl
> >
> > initClient :: HostName -> PortID -> IO Handle
> > initClient host port = withSocketsDo $ do
> >   hdl <- connectTo host port
> >   hSetBuffering hdl NoBuffering
> >   return hdl
> >
> > sendIntermediateAndFinal :: ForeignPtr Trans -> Handle -> DiffTime ->
> DiffTime -> Int -> IO (Int, DiffTime)
> > sendIntermediateAndFinal ptr hdl start duration sent = do
> >   curTime <- fmap utctDayTime getCurrentTime
> >   if (curTime - start) > duration
> >     then do fillFinal ptr
> >             withForeignPtr ptr (sendBuf hdl)
> >             finishedTime <- fmap utctDayTime getCurrentTime
> >             return (sent + numBytes, finishedTime - start)
> >     else do withForeignPtr ptr (sendBuf hdl)
> >             sendIntermediateAndFinal ptr hdl start duration (sent +
> numBytes)
> >
> > sendBuf :: Handle -> Ptr Trans -> IO ()
> > sendBuf hdl buf = hPutBuf hdl buf numBytes
> >
> > fillPtr :: Trans -> ForeignPtr Trans -> IO ()
> > fillPtr num ptr = withForeignPtr ptr (\p -> poke p num)
> >
> > fillInitial = fillPtr inital
> > fillIntermediate = fillPtr intermediate
> > fillFinal = fillPtr final
> >
> > -- The server simply accepts connections, receives what it can get
> > -- and adds up the received bytes and transmission time.
> >
> > runServer :: IO ()
> > runServer = do
> >   print "Running as server."
> >   initServer port >>= handleConnection
> >
> > initServer :: PortID -> IO Socket
> > initServer port = withSocketsDo $ listenOn port
> >
> > handleConnection :: Socket -> IO ()
> > handleConnection socket = do
> >   (hdl, host, port) <- acceptConnection socket
> >   print ("Connection from " ++ host)
> >   ptr <- makePtr
> >   (num, bytes) <- receive ptr hdl
> >   curTime <- fmap utctDayTime getCurrentTime
> >   if num == inital
> >     then do (received, time) <- receiveData ptr hdl curTime 0
> >             print ("Received " ++ show received ++ " in " ++ show time
> ++ " seconds.")
> >             hClose hdl
> >             handleConnection socket
> >     else do print ("ERR: Expected " ++ show inital ++ " got " ++ show
> num)
> >             hClose hdl
> >             handleConnection socket
> >
> > acceptConnection :: Socket -> IO (Handle, HostName, PortNumber)
> > acceptConnection socket = do
> >   (hdl, host, port) <- accept socket
> >   hSetBuffering hdl NoBuffering
> >   return (hdl, host, port)
> >
> > receiveData :: ForeignPtr Trans -> Handle -> DiffTime -> Int -> IO (Int,
> DiffTime)
> > receiveData ptr hdl started received = do
> >   (num, bytes) <- receive ptr hdl
> >   if num /= final
> >      then receiveData ptr hdl started (received + bytes)
> >     else do curTime <- fmap utctDayTime getCurrentTime
> >             return (received + bytes, curTime - started)
> >
> > receive :: ForeignPtr Trans -> Handle -> IO (Trans, Int)
> > receive ptr hdl = do
> >   withForeignPtr ptr (\p -> hGetBuf hdl p numBytes)
> >   num <- withForeignPtr ptr peek
> >   return (num, numBytes)
> >
> > -- The main routine decides whether to run as server or client.
> >
> > main :: IO ()
> > main = do
> >   opts <- getArgs >>= perfOpts
> >   if length opts /= 1
> >      then ioError (userError ("Too many arguments!" ++ usageInfo ""
> options))
> >     else
> >     case head opts of
> >       Server -> runServer
> >       Client host -> runClient host
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20131107/eb71f168/attachment-0001.html>


More information about the Beginners mailing list