[Haskell-cafe] poor performance when generating random text

Gregory Collins greg at gregorycollins.net
Wed Oct 17 09:36:32 CEST 2012


System.Random is very slow. Try the mwc-random package from Hackage.
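
A minimal sketch of what the switch could look like, assuming the mwc-random
and text packages are installed. The names genString, randomLetter and
writeCorpus are illustrative, not taken from the original program. mwc-random
has no Char instance for uniformR as far as I know, so the sketch draws an Int
and converts it with chr. It also creates one generator and reuses it for
every line instead of calling newStdGen each time. No timings are claimed.

```haskell
import Control.Monad (replicateM, replicateM_)
import Data.Char (chr, ord)
import qualified Data.Text as T
import qualified Data.Text.IO as TI
import System.IO (IOMode (WriteMode), withFile)
import System.Random.MWC (GenIO, createSystemRandom, uniformR)

-- | One random lowercase ASCII letter. uniformR is inclusive on both
-- ends for integral types, so 'a' and 'z' can both occur.
randomLetter :: GenIO -> IO Char
randomLetter gen = do
  i <- uniformR (ord 'a', ord 'z') gen
  return (chr i)

-- | One random line of 50 to 450 letters, as in the original snippet.
genString :: GenIO -> IO T.Text
genString gen = do
  len <- uniformR (50, 450 :: Int) gen
  cs <- replicateM len (randomLetter gen)
  return (T.pack cs)

-- | Write the given number of random lines to a file, sharing a single
-- generator seeded from the system's entropy source.
writeCorpus :: FilePath -> Int -> IO ()
writeCorpus file size =
  withFile file WriteMode $ \h -> do
    gen <- createSystemRandom
    replicateM_ size (genString gen >>= TI.hPutStrLn h)

main :: IO ()
main = do
  putStrLn "generating text corpus"
  writeCorpus "test.txt" 100000
```

Building each line through an intermediate String keeps the sketch short; if
that turns out to matter in a profile, the line could be built some other way.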

On Wed, Oct 17, 2012 at 9:07 AM, Dmitry Vyal <akamaus at gmail.com> wrote:

> Hello everyone,
>
> I've written a snippet which generates a file full of random strings. When
> compiled with -O2 on ghc-7.6, the generation speed is about 2 MB per second,
> which is on par with interpreted PHP. I find that rather disappointing.
> Maybe I've missed something trivial? Any suggestions and explanations are
> welcome. :)
>
> % cat ext_sort.hs
> import qualified Data.Text as T
> import System.Random
> import Control.Exception
> import Control.Monad
>
> import System.IO
> import qualified Data.Text.IO as TI
>
> gen_string g = let (len, g') = randomR (50, 450) g
>                in T.unfoldrN len rand_text (len, g')
>  where rand_text (0,_) = Nothing
>        rand_text (k,g) = let (c, g') = randomR ('a','z') g
>                          in Just (c, ((k-1), g'))
>
> write_corpus file = bracket (openFile file WriteMode) hClose $ \h -> do
>   let size = 100000
>   sequence $ replicate size $ do
>     g <- newStdGen
>     let text = gen_string g
>     TI.hPutStrLn h text
>
> main = do
>   putStrLn "generating text corpus"
>   write_corpus "test.txt"
>
>
>
> % cat ext_sort.prof
>         Wed Oct 17 10:59 2012 Time and Allocation Profiling Report (Final)
>
>            ext_sort +RTS -p -RTS
>
>         total time  =       32.56 secs   (32558 ticks @ 1000 us, 1 processor)
>         total alloc = 12,742,917,332 bytes  (excludes profiling overheads)
>
> COST CENTRE                MODULE  %time %alloc
>
> gen_string.rand_text.(...) Main     70.7   69.8
> gen_string                 Main     17.6   15.8
> gen_string.rand_text       Main      5.4   13.3
> write_corpus.\             Main      4.3    0.8
>
>
>                                                            individual       inherited
> COST CENTRE                       MODULE  no.     entries  %time %alloc    %time %alloc
>
> MAIN                              MAIN     67           0    0.0    0.0    100.0  100.0
>  main                             Main    135           0    0.0    0.0    100.0  100.0
>   write_corpus                    Main    137           0    0.0    0.0    100.0  100.0
>    write_corpus.\                 Main    138           1    4.3    0.8    100.0  100.0
>     write_corpus.\.text           Main    140      100000    0.0    0.0     95.7   99.2
>      gen_string                   Main    141      100000   17.6   15.8     95.7   99.2
>       gen_string.g'               Main    147      100000    0.0    0.0      0.0    0.0
>       gen_string.rand_text        Main    144    25109743    5.4   13.3     77.5   83.2
>        gen_string.rand_text.g'    Main    148    24909743    0.6    0.0      0.6    0.0
>        gen_string.rand_text.(...) Main    146    25009743   70.7   69.8     70.7   69.8
>        gen_string.rand_text.c     Main    145    25009743    0.8    0.0      0.8    0.0
>       gen_string.len              Main    143      100000    0.0    0.0      0.0    0.0
>       gen_string.(...)            Main    142      100000    0.6    0.3      0.6    0.3
>



-- 
Gregory Collins <greg at gregorycollins.net>