[Haskell-cafe] Fwd: Bug in Parsec.Token

Derek Elkins derek.a.elkins at gmail.com
Mon Aug 2 14:33:22 EDT 2010


This is a forward of a message from March 4th.


---------- Forwarded message ----------
From: Derek Elkins <derek.a.elkins at gmail.com>
Date: Thu, Mar 4, 2010 at 9:43 PM
Subject: Re: Bug in Parsec.Token
To: Don Stewart <dons at galois.com>
Cc: Greg Fitzgerald <garious at gmail.com>, Antoine Latter
<aslatter at gmail.com>, "Sittampalam, Ganesh"
<ganesh.sittampalam at credit-suisse.com>, Ian Lynagh <igloo at earth.li>,
libraries at haskell.org


I'm not subscribed to libraries, so this won't go there.

One of the first benchmarks against Parsec 3.0.0 was John MacFarlane's
here: http://www.haskell.org/pipermail/haskell-cafe/2008-March/040258.html

In it, he found Parsec 3.0.0 about 2x slower for his benchmark.  I
can't recreate his benchmark, but I suspect it is a variant of one he
describes here: http://code.google.com/p/pandoc/wiki/Benchmarks

I decided to do a similar benchmark.  I used Parsec 2.1.0.1, Parsec
3.0.1, and Parsec 3.1.0.  Of particular note, building all three
required -only- changing which library pandoc depended on.  No change
to the source was necessary.  All tests in pandoc's test suite passed
for all versions.

I ran that benchmark with a different input file: this file
[http://wpcal.firetree.net/wp-content/plugins/PHP%20Markdown%201.0.1k/PHP%20Markdown%20Readme.text]
concatenated to itself 32 times to produce a 730KB markdown file.  The
times below are for the last three of four runs.

Parsec 2.1.0.1
derek@derek-laptop:~/temp/pandoc-1.3/dist/build/pandoc$ time ./pandoc-2.1.0.1 --strict t.text > /dev/null
real    0m9.863s
user    0m7.792s
sys     0m0.160s

real    0m9.756s
user    0m7.792s
sys     0m0.132s

real    0m10.123s
user    0m7.976s
sys     0m0.168s

Parsec 3.0.1
derek@derek-laptop:~/temp/pandoc-1.3/dist/build/pandoc$ time ./pandoc-3.0.1 --strict t.text > /dev/null
real    0m22.008s
user    0m17.445s
sys     0m0.324s

real    0m21.789s
user    0m17.433s
sys     0m0.160s

real    0m21.754s
user    0m17.677s
sys     0m0.168s

Parsec 3.1.0
derek@derek-laptop:~/temp/pandoc-1.3/dist/build/pandoc$ time ./pandoc-3.1.0 --strict t.text > /dev/null
real    0m10.708s
user    0m8.201s
sys     0m0.168s

real    0m11.078s
user    0m8.401s
sys     0m0.232s

real    0m10.797s
user    0m8.513s
sys     0m0.224s

These results recreate the approximate 2x slowdown that John
originally mentioned between Parsec 2.1.0.1 and Parsec 3.0.  They also
demonstrate that Parsec 3.1.0 is significantly faster than 3.0.1 but
still a little bit slower than Parsec 2.1.0.1.
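
For readers who want to check the arithmetic behind that summary, here
is a minimal sketch that averages the three reported runs per version
and divides by the Parsec 2.1.0.1 average.  The numbers are copied from
the timings above; the `timings` table and the `slowdown` helper are
illustrative names, not part of pandoc or Parsec.

```python
from statistics import mean

# "real" (wall-clock) and "user" (CPU) seconds, copied from the runs above.
timings = {
    "2.1.0.1": {"real": [9.863, 9.756, 10.123], "user": [7.792, 7.792, 7.976]},
    "3.0.1": {"real": [22.008, 21.789, 21.754], "user": [17.445, 17.433, 17.677]},
    "3.1.0": {"real": [10.708, 11.078, 10.797], "user": [8.201, 8.401, 8.513]},
}


def slowdown(version, kind="real", baseline="2.1.0.1"):
    """Mean time of `version` divided by the mean time of `baseline`."""
    return mean(timings[version][kind]) / mean(timings[baseline][kind])


for version in timings:
    real = slowdown(version, "real")
    user = slowdown(version, "user")
    print(f"Parsec {version}: {real:.2f}x real, {user:.2f}x user")
```

On these figures Parsec 3.0.1 comes out roughly 2.2 times slower than
2.1.0.1, and Parsec 3.1.0 roughly 7 to 10 percent slower, depending on
whether user or real time is compared.  Three runs on one machine are a
small sample, so the ratios are indicative only.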

On Thu, Mar 4, 2010 at 4:39 PM, Don Stewart <dons at galois.com> wrote:
> derek.a.elkins:
>> Who is going to maintain "Parsec 4"?
>>
>> I'm completely against this.  If people absolutely must have exactly
>> Parsec 2's implementation we can simply copy it into Parsec 3, and the
>> "compatibility" layer, in that case, will simply -be- Parsec 2.  I've
>> considered this as a temporary solution for the performance issues
>> just so people could move to Parsec 3 dependencies, but that should
>> not be necessary now, and even then I considered it a much less than
>> ideal solution.
>>
>> If the community wants to freeze on Parsec 2, then I have no problem
>> renaming the package, otherwise I think it is both unnecessary and a
>> waste of effort.
>>
>
> The problem is the ongoing lack of confidence in Parsec 3's performance.
> The new release goes some way to addressing this, but I think this has
> gone unaddressed for too long.
>
> Can someone address the lingering concern with benchmarks against parsec 2?
>
> -- Don
>

