[Haskell-cafe] debugging memory corruption

Sun Dec 2 10:02:25 CET 2012

Thanks for the response.

On Sat, Dec 1, 2012 at 5:23 PM, Alexander Kjeldaas
<alexander.kjeldaas at gmail.com> wrote:
>
> What I've mostly done in similar circumstances (jni)
>
> 1. Create an interface (virtual functions or template) for the FFI in C++
> that covers everything you use. Then create one test implementation and one
> real implementation. The test implementation must allocate resources
> whenever the real FFI does so. Doing memory allocation works. This makes it
> possible to test all your FFI in C++ using valgrind.

If I understand correctly, this sounds like what I was talking about,
i.e. stub out the C++ side and drive it from Haskell to try to
reproduce the bug.  That way I don't need windows popping up, and I
don't have to simulate things at the level of mouse clicks.  The
danger is that it turns out to be a lot of work to implement but still
somehow doesn't reproduce the problem.  That could happen if the bug
is in the C++ but only turns up during manual manipulation.
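
For concreteness, here is a minimal sketch of what such a stub might
look like.  The `Ui` interface and its three calls are hypothetical
and stand in for whatever the real FFI surface is.  The point is only
that the test implementation opens no windows but still allocates and
frees memory wherever the real one would, so a memory checker has
something to watch:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface covering every call made across the FFI.
class Ui {
public:
    virtual ~Ui() = default;
    // Create a view and return its handle.
    virtual int create_view(const std::string &title) = 0;
    // Copy `len` bytes of data into the view.
    virtual void set_data(int view, const char *data, std::size_t len) = 0;
    // Release everything owned by the view.
    virtual void destroy_view(int view) = 0;
};

// Test implementation: no GUI, but real heap traffic on every call.
class StubUi : public Ui {
public:
    int create_view(const std::string &title) override {
        auto v = std::make_unique<View>();
        v->title = title;
        views_.push_back(std::move(v));
        return static_cast<int>(views_.size()) - 1;
    }

    void set_data(int view, const char *data, std::size_t len) override {
        lookup(view).data.assign(data, data + len);  // real heap copy
    }

    void destroy_view(int view) override {
        lookup(view);          // catches double destroy and bad handles
        views_[view].reset();  // frees the view's memory
    }

    // Number of views created and not yet destroyed.
    std::size_t live_views() const {
        std::size_t n = 0;
        for (const auto &v : views_) {
            if (v) ++n;
        }
        return n;
    }

private:
    struct View {
        std::string title;
        std::vector<char> data;
    };

    View &lookup(int view) {
        assert(view >= 0 && static_cast<std::size_t>(view) < views_.size());
        assert(views_[view] != nullptr);
        return *views_[view];
    }

    std::vector<std::unique_ptr<View>> views_;
};
```

A real implementation would derive from the same `Ui` class and
forward to the GUI toolkit, and the functions exported to Haskell
would go through a `Ui` pointer so either one can be plugged in.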

Or maybe you're talking about the other way around: stub out the
Haskell, replace it with C++, and run that under valgrind?  That
seems unlikely to help, because if the bug is in the Haskell FFI
code, then rewriting it all in C++ just replaces it with possibly
also buggy C++ code.

It seems to me that valgrind just plain doesn't work for Haskell,
maybe because the GHC runtime uses its own allocator?  So if the bug
is in the Haskell, I can't find it with valgrind.  If the bug is in
the C++, well, I already have a pure C++ version (one that talks to
the C++ interface in a very simplistic way), and it runs under
valgrind without turning up any out-of-bounds errors.

> 2. Add tracing support to the real implementation and replay support to the
> test implementation.

I'm not sure this would work, since the whole problem is that the bug
is nondeterministic.  I feel like the only way to get it to come out
is to do a bunch of random stuff for a period of time.  It's likely
that whether it happens or not depends on the memory layout for that
particular run, and as far as I know you can't make that consistent.
Or can you?
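
One part that can be pinned down is the "bunch of random stuff"
itself.  Below is a minimal sketch, assuming a hypothetical set of
three operations, of generating the random sequence from a seed so
that a run that crashes can be replayed with the same calls in the
same order.  It uses the raw output of `std::mt19937`, whose sequence
is fixed by the C++ standard, rather than a distribution class, whose
output can differ between standard libraries.  This only makes the
sequence of calls reproducible.  It does nothing about the memory
layout, which may still differ from run to run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical operations the driver can perform against the interface.
enum class Op : std::uint8_t { Create, SetData, Destroy };

struct Step {
    Op op;
    std::uint32_t arg;  // e.g. which view to act on
};

// Build a trace of `n` pseudo-random steps; the same seed always
// yields the same trace.
std::vector<Step> generate_trace(std::uint32_t seed, std::size_t n) {
    std::mt19937 rng(seed);
    std::vector<Step> trace;
    trace.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        Op op = static_cast<Op>(rng() % 3);
        std::uint32_t arg = static_cast<std::uint32_t>(rng() % 16);
        trace.push_back(Step{op, arg});
    }
    return trace;
}

// Feed each step of a trace, in order, to `apply`, which performs
// the actual call.
template <class F>
void replay(const std::vector<Step> &trace, F &&apply) {
    for (const Step &s : trace) {
        apply(s);
    }
}
```

Logging the seed at startup would be enough to rerun a failing
session.  If the failure does not recur with the same seed, that
itself suggests it depends on layout or timing rather than on the
particular sequence of calls.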

> 3. Upload to Hackage.

Is the suggestion that people who love debugging hard problems will
swarm out of the woodwork and help me find the problem?  I should be
so lucky :)