debugging segfaults; any suggestions?
megacz at cs.berkeley.edu
Thu Apr 21 22:11:01 CEST 2011
Simon Peyton-Jones <simonpj at microsoft.com> writes:
> Do you mean that GHC itself is seg-faulting, or that a program it
> compiles is seg-faulting?
It's coming from GHC itself, after I have added in some extra code.
That extra code is what's using unsafeCoerce in a [purportedly]
> Could you binary-chop your way to the offending code, by wrapping
> some, but not all, sub-terms in the (pseudo)-trace call? Ideally to
> the point where removing any single wrap will make it break.
This sounds really promising, especially because I took the time to
learn my way around the OCaml code that generates the Haskell code. But
it's not immediately clear why the NOINLINE-trace wrapper works; it
might be that there's no correlation between which subterms must be
wrapped to avoid the segfault and which subterm is using unsafeCoerce
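For concreteness, here is a minimal sketch of what such a wrapper and a partially wrapped term might look like. The exact wrapper used earlier in this thread is not reproduced here; `pseudoTrace`, `f`, `g`, and `example` are hypothetical names, and the assumption is that the wrapper is just an identity function the optimizer is not allowed to inline.

```haskell
-- Hypothetical stand-in for the (pseudo)-trace wrapper: an identity
-- function marked NOINLINE so GHC keeps the call as an opaque boundary
-- around the wrapped sub-term instead of inlining it away.
{-# NOINLINE pseudoTrace #-}
pseudoTrace :: String -> a -> a
pseudoTrace _label x = x

-- Placeholder sub-terms standing in for pieces of the generated code.
f :: Int -> Int
f = (+ 1)

g :: Int -> Int
g = (* 2)

-- Binary chopping: wrap some sub-terms and leave others bare, then
-- narrow down which wraps are needed to avoid the crash.
example :: Int -> Int
example n = pseudoTrace "outer" (f (g n))  -- the inner call to g is left unwrapped
```

Note that NOINLINE only stops inlining; GHC may still use other information about the wrapper, so this sketch does not by itself explain why wrapping changes the behaviour.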
Is there some way I can compile the program with strict (like ML)
evaluation? Then I could have the wrappers maintain a secret "call
stack" (using unsafePerformIO); the wrapper would push/pop the line
number, and when the segfault happens I could just look at that stack.
The program is a total functional program, so I know for sure that its
termination does not rely on lazy evaluation.
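A rough sketch of the "secret call stack" idea, under stated assumptions: `callStack` and `traced` are hypothetical names, and forcing with `evaluate` only reaches weak head normal form, so this approximates strict evaluation rather than providing it. Because a segfault kills the process before Haskell code can read the stack back, the sketch also logs each push to stderr; the last line printed before the crash would then identify the innermost wrapper that was entered.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO (hPutStrLn, stderr)
import System.IO.Unsafe (unsafePerformIO)

-- One global stack of line numbers. NOINLINE ensures a single shared
-- IORef rather than a fresh one at each use site.
{-# NOINLINE callStack #-}
callStack :: IORef [Int]
callStack = unsafePerformIO (newIORef [])

-- Hypothetical wrapper: push the line number, force the wrapped term
-- to weak head normal form, then pop. Thunks nested inside the result
-- can still escape and be evaluated later, outside this stack frame.
{-# NOINLINE traced #-}
traced :: Int -> a -> a
traced line x = unsafePerformIO $ do
  modifyIORef' callStack (line :)
  stack <- readIORef callStack
  hPutStrLn stderr ("enter " ++ show stack)  -- still visible after a crash
  r <- evaluate x
  modifyIORef' callStack (drop 1)
  return r
```

The code generator would then emit something like `traced 42 (subterm)` for the sub-term originating on line 42.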
[less encouraging stuff below]
> Speaking of which, try a very large allocation area, so that no GC
> happens at all.
Yeah, I did that (the "terminal beep on garbage collection" is way
cool!). Segfaults even when no GC ever happens :(
> There are some RTS flags to control when thread-switching takes place;
> you could try reducing that frequency too.
Is GHC itself multi-threaded? I didn't try any of this because I
assumed that no multithreading would be happening.
> Are you ever coercing between unboxed types, like from Int# to Int?
Well, as far as I know I'm not using unboxed types, in the following
sense: the hash-mark # doesn't appear in my source code anywhere.
> There are other RTS sanity-checking
Yeah, I tried "+RTS -Ds -RTS", no luck.
> Try switching off strictness analysis (-fno-strictness).
Yeah, tried that too, and -fno-cse, and -fno-full-laziness.