debugging segfaults; any suggestions?

Adam Megacz megacz at cs.berkeley.edu
Thu Apr 21 22:11:01 CEST 2011


Simon Peyton-Jones <simonpj at microsoft.com> writes:
> Do you mean that GHC itself is seg-faulting, or that a program it
> compiles is seg-faulting?

The segfault is coming from GHC itself, after I added some extra code to
it.  That extra code is what's using unsafeCoerce in a [purportedly]
disciplined way.

> Could you binary-chop your way to the offending code, by wrapping
> some, but not all, sub-terms in the (pseudo)-trace call?  Ideally to
> the point where removing any single wrap will make it break.

This sounds really promising, especially because I took the time to
learn my way around the OCaml code that generates the Haskell code.  But
it's not immediately clear why the NOINLINE-trace wrapper works; it
might be that there's no correlation between which subterms must be
wrapped to avoid the segfault and which subterm is using unsafeCoerce
improperly.
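
For concreteness, here is a minimal sketch of what such a wrapper and a
partially wrapped term might look like.  The names `wrap` and `example`
are hypothetical, and the wrapper is assumed to be a plain identity
function; the actual wrapper used earlier in the thread may differ.  The
one thing that is certain is that NOINLINE stops GHC from inlining the
wrapper at its call sites, which may be why wrapping a term changes how
the surrounding code gets optimized.

```haskell
-- Hypothetical stand-in for the NOINLINE-trace wrapper; assumed here to
-- be an identity function that GHC is forbidden to inline.
{-# NOINLINE wrap #-}
wrap :: String -> a -> a
wrap _label x = x

-- Binary chop: wrap some subterms and leave others bare, then narrow
-- down which wraps are needed to make the segfault go away.
example :: Int -> Int
example n = wrap "sum" (wrap "lhs" (n * 2) + (n + 1))
```

The label argument is unused in this sketch; it is only there so each
wrap site can be told apart in the generated code.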

Is there some way I can compile the program with strict (like ML)
evaluation?  Then I could have the wrappers maintain a secret "call
stack" (using unsafePerformIO); the wrapper would push/pop the line
number, and when the segfault happens I could just look at that stack.
The program is a total functional program, so I know for sure that its
termination does not rely on lazy evaluation.
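
A rough sketch of the call-stack wrapper idea, under some assumptions:
the names `callStack` and `traced` are made up for illustration, and the
line number is also echoed to stderr, since the in-memory stack dies
with the process unless it is inspected from a debugger or core dump.
Note that `evaluate` only forces the wrapped term to weak head normal
form, so under lazy evaluation nested thunks can still be forced later,
outside the push/pop bracket.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO (hFlush, hPutStrLn, stderr)
import System.IO.Unsafe (unsafePerformIO)

-- Global stack of source line numbers (hypothetical name).  NOINLINE
-- keeps GHC from duplicating the IORef by inlining its definition.
{-# NOINLINE callStack #-}
callStack :: IORef [Int]
callStack = unsafePerformIO (newIORef [])

-- Hypothetical wrapper: push the line number, force the wrapped term
-- to weak head normal form, then pop the line number again.
{-# NOINLINE traced #-}
traced :: Int -> a -> a
traced line x = unsafePerformIO $ do
  modifyIORef' callStack (line :)
  -- Echo the entry so the last line seen survives a crash.
  hPutStrLn stderr ("enter line " ++ show line)
  hFlush stderr
  r <- evaluate x
  modifyIORef' callStack (drop 1)
  return r
```

With this in place, the last "enter line" message printed before the
crash points at the innermost wrapped term that was being forced.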


[less encouraging stuff below]

> Speaking of which, try a very large allocation area, so that no GC
> happens at all.

Yeah, I did that (the "terminal beep on garbage collection" is way
cool!).  Segfaults even when no GC ever happens :(

> There are some RTS flags to control when thread-switching takes place;
> you could try reducing that frequency too.

Is ghc itself multi-threaded?  I didn't try any of this because I
assumed that no multithreading would be happening.


> Are you ever coercing between unboxed types, like from Int# to Int?

Well, as far as I know I'm not using unboxed types, in the following
sense: the hash-mark # doesn't appear in my source code anywhere.


> There are other RTS sanity-checking

Yeah, I tried "+RTS -Ds -RTS", no luck.


> Try switching off strictness analysis (-fno-strictness).

Yeah, tried that too, and -fno-cse, and -fno-full-laziness.

  - a



