[Haskell-cafe] An interesting paper from Google

Fri Oct 15 17:43:39 EDT 2010

On Fri, Oct 15, 2010 at 09:28:09PM +0100, Andrew Coppin wrote:
>  http://k1024.org/~iusty/papers/icfp10-haskell-reagent.pdf
> 
> I'm sure some of you have seen this already. For those who lack the
> time or inclination to read through the (six) pages of this report,
> here's the summary...

Nice summary, I hope you found the paper interesting!

> We [i.e., the report authors] took a production Python system and
> rewrote bits of it in Haskell, some of which is now in production
> use. We conclude the following:
> 
> - Python lets you just do whatever the hell you want, but Haskell
> demands that you actually have a *plan* before you start churning
> out code and running it. The result is generally cleaner and more
> consistent when you get there.
> 
> - Haskell's much-criticised immutable data is actually an
> *advantage* for backtracking search problems.
> 
> - Haskell wins for thread-safety.
> 
> - ADTs are nicer than exceptions.
> 
> - The parts of Haskell stolen by Python aren't as nice in Python as
> they are in Haskell. [Well, duh.]

I'd say unfortunately, not just duh…

> - We like what GHC provides for profiling.
> 
> - We are dissappointed by what GHC provides for debugging.
> 
> - "String" is too slow. None of the alternatives seem to be widely
> supported. If your library consumes Strings and returns Strings, the
> fact that ByteString exists doesn't help you.
> 
> - Recent changes to GHC broke our code. (Specifically, extensible
> exceptions.) We were quite surprised that such a "stable" and
> "mature" system as GHC would do this to us.
> 
> - Haskell has quite a high barrier to entry. [Again, duh.]
> 
> The paper also contains an interesting section that basically says
> "we tried porting the Python implementing of XYZ into Haskell, but
> there wasn't really any advantage because it's all I/O". In my
> humble opinion, "it's all I/O" is a common beginner's mistake.
> Reading between the lines, it sounds like they wrote the whole thing
> in the IO monad, and then decided it looked just like the existing
> Python code so there wasn't much point in continuing.

Not quite (not all was in the I/O monad). It doesn't make sense to
rewrite 40K of lines from language A into language B just for fun. But
the advantages were not as strong as for the balancing algorithms to
justify any potential conversion. They were strong, just not strong
enough.

> I guess it's
> one of the bad habits that imperative programmers get into. With a
> little more experience, you eventually figure out that you can limit
> the stuff actually in the IO monad to a surprisingly small section,
> and so almost everything else in pure code, no matter how much the
> problem looks like it's "all I/O". But anyway, I'm only guessing
> from what I can actually see with my own eyes in the report itself.

That's not how I would describe it (if I had to write it in a single
paragraph).

Basically, if you take a random, numerical/algorithmic problem, and you
write it in FP/Haskell, it's easy to show to most non-FP programmers why
Haskell wins on many accounts. But if you take a heavy I/O problem
(networking code, etc.), while Haskell is as good as Python, it is less
easy to show the strengths of the language. Yes, all the nice bits are
still there, but when you marshall data between network and your
internal structures the type system is less useful than when you just
write algorithms that process the internal data. Similar with the other
nice parts.

Now, if I were to start from scratch… :)

> I'm surprised about the profiler. They seem really, really impressed
> with it. Which is interesting to me, since I can never seen to get
> anything sensible out of it. It always seems to claim that my
> program is spending 80% of its runtime executing zipWith or
> something equally absurd.

I'm surprised that you're surprised :) The profiler is indeed awesome,
and in general I can manage to get one factor of magnitude speedup on my
initial algorithms, if not more.

Even if it just tells me that zipWith is the slow part, that's enough.
I'd even say it's a very good hint where to start.

> I'm unsurprised which their
> dissappointment with debugging. I'm still quite surprised that
> there's no tool anywhere which will trivially print out the
> reduction sequence for executing an expression. You'd think this
> would be laughably easy, and yet nobody has done it yet.

Indeed.

Thanks again for the summary :)

regards,
iustin