GHC-specific alias analysis for the LLVM backend (and the build system)
austin seipp as at hacks.yi.org
Fri Oct 7 12:37:09 CEST 2011
Sorry, just got back to this! I've updated my tree in that github
branch to refactor the build stuff into a macro. It needs more work
but it should make it much easier to add even more experimental and
On Thu, Oct 6, 2011 at 7:00 AM, Simon Marlow <marlowsd at gmail.com> wrote:
> On 04/10/2011 10:31, austin seipp wrote:
>> Okay, so I took a stab at the build system and getting the analysis
>> plugin at least building, and the results are in this commit:
>> I'm working on pushing a version that moves most of the logic in this
>> commit directly into a make macro called rules/llvm-plugin.mk so if we
>> ever wanted to write future plugins there isn't as much duplicated
>> logic. It makes the actual llvm/aa-plugin/ghc.mk file 2 lines of code.
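A rough sketch of what such a macro might look like follows. It is not the code from the commit: the plugin name `ghc-aa`, the `LLVM_*` variable names, and the exact compiler flags are assumptions made for illustration. `CXX` is make's built-in C++ compiler variable, and `--cxxflags` and `--ldflags` are standard llvm-config options.

```make
# rules/llvm-plugin.mk (hypothetical sketch)
# $1 = source directory, $2 = plugin name; expects $1/$2.cpp to exist.
LLVM_CONFIG   ?= llvm-config
LLVM_CXXFLAGS  = $(shell $(LLVM_CONFIG) --cxxflags)
LLVM_LDFLAGS   = $(shell $(LLVM_CONFIG) --ldflags)

define llvm-plugin
# Build the shared object from the single .cpp file.
$1/$2.so : $1/$2.cpp
	$$(CXX) -shared -fPIC $$(LLVM_CXXFLAGS) $$< -o $$@ $$(LLVM_LDFLAGS)

# Hook the plugin into the per-directory "all_<dir>" target.
all_$1 : $1/$2.so

# Ask the install rules to copy the plugin into the library directory.
INSTALL_LIBS += $1/$2.so
endef

# llvm/aa-plugin/ghc.mk then only needs the instantiation
# (plugin name "ghc-aa" is made up):
$(eval $(call llvm-plugin,llvm/aa-plugin,ghc-aa))
```

With this in place, `make all_llvm/aa-plugin` would build only the plugin, and a second plugin would need just one more `$(eval $(call ...))` line.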
>> I'm also interested in looking into LLVM passes etc for GHC-produced
>> code. My git branch somehow ended up corrupt though (my machine
>> crashed in the middle of working on it, but generally this shouldn't
>> be a problem unless it occurs during FS sync,) but I've backed up my
>> work and I'm cloning a new repository while looking...
>> Overall I found by sticking to
>> http://hackage.haskell.org/trac/ghc/wiki/Building/Architecture that
>> *most* of the basic stuff was actually not incredibly hard to
>> understand and modify, but I'm at a loss as to how to add new suffix
>> rules for .cpp files so far. Also some of that page is out of date
>> (e.g. in
>> there are no more PHONY targets, instead the full name merely includes
>> the directory path to build.) I'll take a look at updating it later.
> Please do update it, and feel free to run changes past the mailing list
> first if you like.
OK, I'll take a stab at it later once I get time and have finished
some more stuff up.
>> Some notes on the commit and food for thought, in no particular order,
>> if this is to be shipped to people. Also things that should be
>> * It depends on stage1 (or phase=1) finishing. Ideally it doesn't need
>> to do this, at least not at the moment. It would be nice if I could
>> say 'make all_llvm/aa-plugin' anywhere in the tree (like advertised,
>> and it does work!) and skip that build since it doesn't depend on GHC
>> being built at all.
> I don't quite follow this - why do you have a dependency on stage1, and why
> can't you remove it?
Sorry, I phrased that improperly. I should have asked a question: how
can I make the LLVM plugins *not* depend on the stage1 build? :) I
didn't quite see how to do this just from grunging around in other
places - I merely made the all_llvm/aa-plugin depend on the resulting
plugin file, thus causing it to be built, but I didn't know this
would invoke a stage1 dependency somehow.
>> * I use llvm-config to get the proper include dirs and library paths
>> as well as linker directives for the analysis pass. So configure.ac
>> now checks for it. Overall this is pretty nice because it does all the
>> hard work of knowing what we actually need to specify, and you can
>> just point the build system to a custom llvm-config if you ever want
>> to build with some other version etc. But it's needed to build the
>> * Following in these footsteps, should the compiler use llvm-config
>> determined by configure.ac to find the proper `opt` and `llc`
>> utilities to invoke? Right now it just assumes they're either A) in
>> the path or B) you specify them with -pgmlo and -pgmlc respectively.
>> If this is the case then maybe those last two flags should be removed
>> and consolidated with a flag -pgmllconf or something. Either way if
>> you don't do this and get the executables mismatched, I'll bet a
>> dollar the plugin fails to load anyway (and then the whole compilation
>> driver will fail.)
> We should be consistent about this, one way or another.
OK. Personally I'm of the opinion that if we're already going to use
`llvm-config` since we need the link lines etc, we should go ahead and
use it to find the proper bindir containing `opt` and `llc` too.
Options to the individual tools are fine, but really there's no reason
I can see you would ever want to specify two totally separate llc/opt
tools that weren't built together, since there could be bitcode
changes and any other number of things that could crop up.
Individual plugins also probably won't work against versions of LLVM
that aren't the same, but personally if you're going to point
llvm-config to a newer version, you're gonna have to recompile the
plugin anyway, which is kind of unavoidable.
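As an illustration of the idea above, the tool paths could be derived from a single llvm-config like this. The variable and target names are assumptions; only `--bindir` and `--version` are real llvm-config options.

```make
# Hypothetical sketch: derive opt and llc from one llvm-config so the
# tools always come from the same LLVM installation.
LLVM_CONFIG  ?= llvm-config

LLVM_BINDIR  := $(shell $(LLVM_CONFIG) --bindir)
LLVM_VERSION := $(shell $(LLVM_CONFIG) --version)

OPT := $(LLVM_BINDIR)/opt
LLC := $(LLVM_BINDIR)/llc

# Print what was found, for debugging the configuration.
.PHONY: show-llvm-tools
show-llvm-tools:
	@echo "LLVM $(LLVM_VERSION)"
	@echo "opt = $(OPT)"
	@echo "llc = $(LLC)"
```

Pointing the build at another LLVM is then a matter of `make LLVM_CONFIG=/path/to/llvm-config show-llvm-tools`, and the plugin, `opt` and `llc` cannot drift apart.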
>> * If you wanted to ship this to people in future versions of GHC, I'm
>> pretty sure you're going to need to move/replicate the build rule to
>> work at install time too, because ideally you want to build the
>> eventual shared object file using *their* libraries and installed
>> version of LLVM and its tools, so it can be used safely by their
>> toolchain. Unless we want to ship a version of the .so on every
>> platform built against LLVM 2.7 - 2.9 (we don't.) This also brings up
>> the point of recompiling the plugin using a custom version of LLVM -
>> perhaps there could be a build rule inside llvm/aa-plugin to recompile
>> using a specific llvm-config utility? This would also make testing the
>> plugin against newer LLVM versions much easier.
> This seems like a serious gotcha. On platforms like Linux where the GHC
> package (.deb or .rpm) can depend on a particular version of LLVM, we would
> be ok here, but it will cause problems for binary distributions. Maybe it
> would be possible to build the plugin at install time, since a binary
> distribution is essentially just a build tree in which we do 'make install'.
Yes, that's what I was thinking earlier. Package maintainers of GHC
will need to find out what LLVM they want to target if they want to
make it a dependency of GHC and take care of that, and that's their
problem, but overall for binary distributions I think that relegating
the plugin build to install time is probably the best and most
straightforward way of shipping it. GHC can just bail on passing the
plugin to `opt` or whatnot if it's not there.
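The fallback could look roughly like the sketch below. In GHC itself this decision would live in the compiler driver rather than in make; the make version only illustrates the logic. The plugin path and variable names are assumptions. `-load` is the `opt` option for loading a pass plugin, and any flag needed to enable the pass depends on how the plugin registers itself, so it is left out here.

```make
# Hypothetical sketch: only pass -load to opt when the plugin exists.
OPT         ?= opt
LLVM_PLUGIN ?= llvm/aa-plugin/ghc-aa.so

# $(wildcard ...) expands to the empty string when the file is missing,
# so PLUGIN_ARGS is empty and opt runs without the custom pass.
PLUGIN_ARGS := $(if $(wildcard $(LLVM_PLUGIN)),-load $(LLVM_PLUGIN))

# Optimise a bitcode file, with the plugin loaded if it is available.
%.opt.bc : %.bc
	$(OPT) $(PLUGIN_ARGS) -O2 $< -o $@
```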
>> * Because of the implementation the patch only supports an analysis
>> pass stuffed inside a single .cpp file like the one Max wrote. This
>> should probably be lifted and I figure would be easy once we have .cpp
>> suffixes in the build system I guess. See below
>> * The patch probably has cruft/duplication, but I think it should make
>> sense. I'd appreciate directions on a better way to do things like not
>> building if there was no llvm-config found by configure.ac (other than
>> setting the BLDDEP dependency for all_* to the empty string
>> specifically) unless that's fine.
> Normally there would be a configuration setting exported by ./configure,
> something like HasLLVM=[YES|NO], and you use this to decide whether to build
> the component or not. Also it's useful for the summary at the end of the
> ./configure output to say which components will be built as a result of the
OK, I can make this change and clean it up a little. Right now as I'm
sure you saw it's specifically finding llvm-config, but a HasLLVM
variable would generally be more useful.
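A minimal sketch of how the build could key off such a setting, assuming ./configure substitutes `HasLLVM` into the generated make configuration. `PLUGIN_DIRS` and the `show-plugins` target are made-up names.

```make
# Hypothetical sketch. In the real tree HasLLVM would come from
# ./configure (e.g. a line "HasLLVM = @HasLLVM@" in a .in file);
# default to NO so the fragment also works on its own.
HasLLVM ?= NO

PLUGIN_DIRS :=

# Only add the plugin directory when LLVM was found.
ifeq "$(HasLLVM)" "YES"
PLUGIN_DIRS += llvm/aa-plugin
endif

# Report the decision, similar in spirit to the ./configure summary.
.PHONY: show-plugins
show-plugins:
	@echo "HasLLVM=$(HasLLVM); plugin dirs: $(PLUGIN_DIRS)"
```

Running `make HasLLVM=YES show-plugins` lists the plugin directory, while the default leaves the list empty, so nothing needs a special-cased empty dependency.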
>> * I'm using a g++ floating off in space for creating the plugin. It
>> doesn't look like configure.ac detects the C++ compiler?
> Can't you use gcc?
I just tested this on Linux and it seems I can. ISTR being bitten by
using 'gcc' when compiling C++ in the past, so I tend to just "not do
that," but maybe I'm simply crazy and/or misinformed.
>> * I'm confused by adding suffixes. I looked at rules/c-suffix-rules.mk
>> and I'm at a loss for example as to how to structure rules for .cpp
>> files. I think it should be easy to add cases here, but I'm wondering
>> about the significance of things like:
>> $1/$2/build/%.$$($3_osuf) : $1/%.c $$(LAX_DEPS_FOLLOW)
>> $$($1_$2_HC_DEP) | $$$$(dir $$$$@)/.
>> "$$($1_$2_HC)" $$($1_$2_$3_GHC_CC_OPTS) -c $$< -o $$@
>> Now besides the fact this is where we begin getting into quad-dollar
>> signs, why is there a separate 'build' directory as part of the
>> target? Most of the macros take both the source directory and the
>> 'dist' directory which I believed was where object-files were put - is
>> the 'build' component typically supposed to represent a 'way'?
> By convention object files go into <dir>/dist/build. There are other things
> besides object files under <dir>/dist, e.g. docs go in <dir>/dist/doc. The
> $$$$(dir $$$$@) stuff is an idiom to do with creating the directories for
> the output files automatically:
> Object files for different "ways" do not currently go into separate
> directories, they are given different suffixes instead - that's what the
> $$($3_osuf) means: the object suffix for way $3.
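Outside of a `define`, where only two dollar signs are needed, the directory-creation idiom can be sketched as below. The `dist/build` layout follows the convention described above; the rest is an illustrative assumption, not a copy of the GHC rules.

```make
# Enable a second expansion of prerequisites, so that $$(dir $$@)
# is evaluated once the target name is known.
.SECONDEXPANSION:

# Any target spelled "<dir>/." is satisfied by creating <dir>.
%/. :
	mkdir -p $@

# Everything after "|" is an order-only prerequisite: the directory
# must exist before compiling, but its timestamp never forces a rebuild.
dist/build/%.o : %.c | $$(dir $$@)/.
	$(CC) -c $< -o $@
```

Inside a macro that is instantiated with `$(eval $(call ...))`, each dollar sign has to survive two extra rounds of expansion, which is where the `$$$$(dir $$$$@)` spelling comes from.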
OK, that makes a bit more sense. I guess I need to play around some
more on this since right now the build system just throws the .so
right next to the .cpp under llvm/aa-plugin, so presumably I'm doing
Another question since I'm at this point: where could I see how to
install the resulting object file in a specific directory with the
rest of GHC's libraries? Right now I'm saying:
INSTALL_LIBS += foo.so
in the llvm-plugin macro, so foo.so will get installed to <ghc
libdir>/foo.so - this isn't really a problem, but for clarity it'd be
nice to install it under something like <ghc libdir>/llvm/foo.so along
with any other plugins we may add. A small thing I know, but I'm
>> * Overall I found adding a new directory and getting most of the work
>> done pretty easy, actually! It's just starting to get hard to read now
>> that I'm at the point of adding macros... Would it be of interest to
>> anyone else to write something up about adding a new directory/etc? If
>> I don't write one up the above commit is about as simple as it gets I
>> think. Like I said some of the build system commentary pages could be
> I think it would be great to add a wiki page summarising the process of
> adding a new directory to the build, listing the files you have to modify,
> that sort of thing.
OK, I'll give that a stab as well.
>> On Mon, Oct 3, 2011 at 5:09 AM, Simon Marlow <marlowsd at gmail.com> wrote:
>>> On 30/09/2011 09:54, Max Bolingbroke wrote:
>>>> Hi GHCers,
>>>> As those of you who use the LLVM backend know, it often doesn't
>>>> optimise as aggressively as you would like. The reason for this is
>>>> often to do with aliasing. I've written a blog post outlining the
>>>> problem and a solution, in the form of a GHC-specific alias analyser
>>>> that tells LLVM that the heap does not alias with the stack:
>>>> I think it might be desirable to have this pass in GHC, so we can use
>>>> it whenever the user compiles with -fllvm. Perhaps someone could help
>>>> me make the build system do what I want, though? Basically, if the
>>>> LLVM backend is enabled I need to be able to compile a single C++ file
>>>> to a .dylib/.so/.dll, linking it against the LLVM .dylib. This
>>>> shared object must also be installed onto the user's system by "make
>>>> install", and the compiler must know the fully-qualified path to that
>>>> .dylib so that I can have the driver pipeline invoke LLVM's "opt"
>>>> tool with the path from which to dynamically load the custom pass.
>>> This shouldn't be too hard, but you'll need to write some Makefile code
>>> to make it happen - we don't have anything like this already in the build
>>>> My previous experiences with the build system have not been good, so
>>>> I'm a bit lost as to where to start with all this!
>>> I'd just make a subdirectory, put your .cpp file in it, and create a
>>> Makefile with rules to compile the file and install it. We should
>>> have a standard suffix rule for compiling .cpp files, so that things like
>>> SRC_CPLUSPLUS_OPTS can be used. Maybe you'll need some autoconf stuff to
>>> find the appropriate way to link against the LLVM libraries.
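A standalone pattern rule of the kind suggested here might look like the sketch below. It is a simplified assumption, not the rule that ended up in the tree: the real build system would generate it per directory and per way, as rules/c-suffix-rules.mk does for .c files.

```make
# Hypothetical sketch of a .cpp pattern rule.
# SRC_CPLUSPLUS_OPTS lets a directory add its own C++ flags;
# CXX is make's built-in C++ compiler variable.
SRC_CPLUSPLUS_OPTS ?= -O2 -fPIC

%.o : %.cpp
	$(CXX) $(SRC_CPLUSPLUS_OPTS) -c $< -o $@
```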