[C2hs] making c2hs understand line pragmas

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Sat Jan 21 18:56:48 EST 2006


I've got a patch to c2hs to make it do something with C-style line
pragmas in .chs files, e.g.:

# 1 "gtk/Graphics/UI/Gtk/TreeList/TreeStore.chs.pp"

These are produced by the C preprocessor. Currently c2hs chokes on these
and so people have to use the -P option to cpp to suppress them. They
are actually rather useful if the code in fact does need preprocessing
(as most of gtk2hs's .chs files do) because they point to the original
file name and source locations. For example it means that ghc's errors
will report accurate locations in the .chs.pp file rather than reporting
locations in the .chs file.

c2hs already produces accurate Haskell line pragmas {-# LINE ... #-} in
the .hs files it produces for exactly that reason. I want to extend that
to the case that the .chs file itself has had a preprocessor used on it.

One other reason to preserve the original file name is that haddock can
now include links to the source files, and it uses the line pragmas to
find the original source file. It doesn't do much good however if
haddock links to a non-existent .chs file when the real original file
was .chs.pp.

So here's an example; the Gtk2Hs docs with source code links to our
darcs repository:

For example the link at the top of this page:
points to:

which is right. Of course, without the patch it'd point to the
non-existent file Widget.chs.

The way my patch works is to make the lexer recognise the line
directives and update the current position. However to get the line
directives emitted correctly we also have to insert a special token into
the token stream. When it comes to finally printing the token stream,
this special token puts the printer into a state in which it will add a
Haskell {-# LINE ... #-} pragma before the next Haskell source fragment.
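
To make that concrete, here is a rough, self-contained sketch of the idea.
It is not the actual patch: the types Pos and Token and the functions
parseLineDirective, lexChs and render are made up for illustration, the
"lexer" works a whole line at a time, and only the '# <line> "<file>"'
form of directive is handled.

```haskell
-- Hypothetical, simplified stand-ins for c2hs's position and token types.
data Pos = Pos { posFile :: FilePath, posLine :: Int }
  deriving (Eq, Show)

data Token
  = TokHaskell Pos String  -- a verbatim line of Haskell source
  | TokLine Pos            -- marker left behind by a cpp line directive
  deriving (Eq, Show)

-- Parse a directive of the form:  # 1 "foo.chs.pp"
-- Anything after the file name (such as cpp's trailing flags) is ignored.
parseLineDirective :: String -> Maybe Pos
parseLineDirective ('#':rest) =
  case reads rest :: [(Int, String)] of
    [(n, rest')] ->
      case reads rest' :: [(String, String)] of
        [(file, _)] -> Just (Pos file n)
        _           -> Nothing
    _ -> Nothing
parseLineDirective _ = Nothing

-- "Lexer": a directive resets the current position and leaves a TokLine
-- marker in the token stream; every other line becomes a TokHaskell.
lexChs :: FilePath -> String -> [Token]
lexChs file = go (Pos file 1) . lines
  where
    go _ [] = []
    go pos (l:ls) =
      case parseLineDirective l of
        Just pos' -> TokLine pos' : go pos' ls
        Nothing   -> TokHaskell pos l : go (pos { posLine = posLine pos + 1 }) ls

-- "Printer": a TokLine puts the printer into a pending state, and the next
-- Haskell fragment is then preceded by a LINE pragma for its position.
render :: [Token] -> String
render = unlines . go False
  where
    go _ [] = []
    go _ (TokLine _ : ts) = go True ts
    go pending (TokHaskell pos s : ts)
      | pending   = linePragma pos : s : go False ts
      | otherwise = s : go False ts

linePragma :: Pos -> String
linePragma pos =
  "{-# LINE " ++ show (posLine pos) ++ " " ++ show (posFile pos) ++ " #-}"
```

Deferring the pragma until the next Haskell fragment, rather than printing
it as soon as the marker is seen, means that several directives in a row
collapse into a single pragma.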

The only thing that's wrong is that c2hs doesn't recognise cpp
directives as the first line in a .chs file. You can see why this is so
from the code below:

cpp :: CHSLexer
cpp = directive
      where
        directive =
          string "\n#" +> alt ('\t':inlineSet)`star` epsilon
          `lexmeta`
             \(_:_:dir) pos s ->        -- strip off the "\n#"
               case dir of

                ... etc

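It requires a cpp directive to start with a newline followed by a '#',
so a directive on the very first line of the file, which has no newline
before it, is never matched.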

I'm not sufficiently familiar with the style of c2hs's chs lexer to
figure out how to fix this. Perhaps it can be done by checking if we're
at the beginning of a line in a different way. Perhaps it can be done by
checking the current column rather than looking for a '\n' character.
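
To illustrate the difference between the two checks, here is a small
standalone sketch. It does not use the c2hs lexer combinators at all; the
functions newlineHashStarts and columnHashStarts are hypothetical and just
scan a plain String, returning the line numbers on which a directive would
be recognised.

```haskell
-- Current behaviour, roughly: a directive is a '#' that directly follows a
-- '\n', so a '#' at the very start of the input is missed.
newlineHashStarts :: String -> [Int]
newlineHashStarts = go 1
  where
    go _ [] = []
    go line ('\n':'#':cs) = (line + 1) : go (line + 1) cs
    go line ('\n':cs)     = go (line + 1) cs
    go line (_:cs)        = go line cs

-- Alternative: track the current column and treat a '#' in column 1 as the
-- start of a directive, which also covers the first line of the file.
columnHashStarts :: String -> [Int]
columnHashStarts = go 1 1
  where
    go :: Int -> Int -> String -> [Int]
    go _ _ [] = []
    go line col (c:cs)
      | c == '#' && col == 1 = line : go line (col + 1) cs
      | c == '\n'            = go (line + 1) 1 cs
      | otherwise            = go line (col + 1) cs
```

On an input whose first line is a directive, the first function skips that
line while the second one reports it; a '#' in the middle of a line is
ignored by both.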

So in summary:

Manuel, if you have no complaints about how this change works, I'll
commit it in the next few days.

And my question is how to fix this issue with recognising cpp
directives on the first line.

