<div dir="ltr">ok,</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Sep 12, 2013 at 10:55 PM, Geoffrey Mainland <span dir="ltr"><<a href="mailto:mainland@apeiron.net" target="_blank">mainland@apeiron.net</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The plan is as I wrote below:<br>
<div class="im"><br>
7.8 will only support passing 128-bit SIMD vectors in registers on x86-64.<br>
Other vector sizes, and all vectors on x86-32, will be passed on the<br>
stack.<br>
<br>
</div>There is not enough time for anything else at this point.<br>
<br>
Geoff<br>
<div class="im"><br>
On 09/12/2013 10:40 PM, Carter Schonwald wrote:<br>
> let me know before the weekend starts.... so i can make time to help<br>
> if need be (unless Austin gives breathing room on merge window for<br>
> such a thing)<br>
><br>
><br>
> On Thu, Sep 12, 2013 at 3:03 PM, Carter Schonwald<br>
</div><div class="im">> <<a href="mailto:carter.schonwald@gmail.com">carter.schonwald@gmail.com</a>> wrote:<br>
><br>
> emphasis on "very very clear warning"<br>
><br>
><br>
> On Thu, Sep 12, 2013 at 3:00 PM, Carter Schonwald<br>
</div>> <<a href="mailto:carter.schonwald@gmail.com">carter.schonwald@gmail.com</a>><br>
<div class="im">> wrote:<br>
><br>
> after a bit more reflection: as long as we provide a clear<br>
> warning that 7.8 may at some point no longer work with llvm<br>
> 3.4, i'm down for the change. We just need to make it very<br>
> very clear, that it may stop working. (and have AVX support<br>
> via passing on the stack with <= 3.3)<br>
><br>
> before i go and upstream that patch, could we benchmark how<br>
> multivector perf fares with patched llvm? i don't have the<br>
> right hardware for doing the benchmarks you did in your paper...<br>
><br>
> sorry for being a bit over the top yesterday, i'm just<br>
> juggling a lot right now :)<br>
><br>
> -Carter<br>
><br>
><br>
> On Thu, Sep 12, 2013 at 2:47 PM, Carter Schonwald<br>
> <<a href="mailto:carter.schonwald@gmail.com">carter.schonwald@gmail.com</a>> wrote:<br>
</div><div class="im">
><br>
> oh, i didn't realize you had already done the work! (bah,<br>
> i'm sorry, i feel terrible)<br>
><br>
> I thought i had communicated ~ a month ago that I was<br>
> worried about release engineering interaction with making<br>
> it impossible to then make subsequent changes more<br>
> thoughtfully because of the LLVM release cycle. This<br>
> concern of mine ballooned a bit after helping triage a huge<br>
> number of problems people were hitting with the Clang<br>
> transition on mac thats underway.<br>
><br>
> Its actually very easy to package up an llvm with that<br>
> patch, much simpler than "build GHC from source". In fact,<br>
> on OS X, the simplest way to install LLVM by default<br>
> essentially does a build from source.<br>
><br>
> Geoff, it'd at least be worth running the benchmarks to<br>
> measure the work! (and as I said, i'm happy to help)<br>
><br>
><br>
> On Thu, Sep 12, 2013 at 2:30 PM, Geoffrey Mainland<br>
</div><div><div class="h5">> <<a href="mailto:mainland@apeiron.net">mainland@apeiron.net</a>> wrote:<br>
><br>
> If users have to do a custom llvm build, we might as well ask them to<br>
> build ghc from source too.<br>
><br>
> Unless I misunderstood ticket #8033, you were originally quite gung-ho<br>
> about changing the LLVM calling conventions to support passing SIMD<br>
> vectors of all widths in registers on both x86-32 and -64, getting these<br>
> patches into LLVM 3.4, and making sure that GHC 7.8 would support all<br>
> this. I spent several days making sure this could happen from the GHC<br>
> side. Now that the plan has changed, I will back out that work, and 7.8<br>
> will only support passing 128-bit SIMD vectors in registers on x86-64.<br>
> Other vector sizes, and all vectors on x86-32, will be passed on the stack.<br>
><br>
> Geoff<br>
><br>
> On 9/12/13 1:32 PM, Carter Schonwald wrote:<br>
> > to repeat:<br>
> ><br>
> > I think no one would have objected to having a clearly marked,<br>
> > experimental -fllvmExpermentalAVX flag that requires building LLVM<br>
> > with a specified patch, as a way to showcase your multivector work!<br>
> ><br>
> > that would evade all of my objections (provided avx is still exposed<br>
> > with normal -fllvm, but spilled to stack rather than registers), and<br>
> > i'd actually argue in favor of such.<br>
> ><br>
> > Especially since it would not impose any release cycle constraints on<br>
> > a subsequent, systematic exploration for using XMM / YMM / ZMM in the<br>
> > calling convention going forward.<br>
> ><br>
> > @Geoff, Simons, Johan, and others: does anyone object to that approach?<br>
> ><br>
> > applying such a calling convention patch to llvm is really quite<br>
> > straightforward, and the build process is pretty zippy after that too.<br>
> ><br>
> > cheers<br>
> > -Carter<br>
> ><br>
> ><br>
> > On Thu, Sep 12, 2013 at 2:34 AM, Carter Schonwald<br>
> > <<a href="mailto:carter.schonwald@gmail.com">carter.schonwald@gmail.com</a>> wrote:<br>
</div></div><div><div class="h5">
> ><br>
> > that said it does occur to me that there is an alternative<br>
> > solution that may be acceptable for everyone!<br>
> ><br>
> > what about providing a pseudo compatible way called<br>
> > -fllvm-experimentalAVX (or something), and simply require that for<br>
> > it to be used, the user has an llvm Patched with the YMM simd in<br>
> > register fun call support? internally that could just be an llvm<br>
> > way that trips the logic that puts the first few AVX values in<br>
> > those YMM1-6 slots if they are the first args, so only the stack<br>
> > spilling logic needs be changed?<br>
> ><br>
> > (ie it wouldn't be tied to an llvm version, but rather this pseudo<br>
> > way flag)<br>
> ><br>
> > does that make sense?<br>
> ><br>
> > either way, i'd really like having avx even if its always spilled<br>
> > to stack at funcalls with standard LLVMs!<br>
> ><br>
> > cheers<br>
> > -carter<br>
> ><br>
> ><br>
> ><br>
> ><br>
> > On Thu, Sep 12, 2013 at 2:28 AM, Carter Schonwald<br>
> > <<a href="mailto:carter.schonwald@gmail.com">carter.schonwald@gmail.com</a>> wrote:<br>
</div></div><div class="HOEnZb"><div class="h5">
> ><br>
> > Geoff,<br>
> ><br>
> > a prosaic reason why there *might* be a<br>
> fundamentally breaking<br>
> > change would be the following idea nathan<br>
> howell suggested to<br>
> > me this afternoon: change the Sp and SPLim<br>
> register so that<br>
> > the X86/x86_64 target can use the CPU's Push<br>
> and (maybe) Pop<br>
> > instructions for the stack manipulations,<br>
> rather than MOV and<br>
> > fam. see<br>
> <a href="http://ghc.haskell.org/trac/ghc/ticket/8272" target="_blank">http://ghc.haskell.org/trac/ghc/ticket/8272</a> (which<br>
> > is just what i've said). Thats one change<br>
> thats pretty simple<br>
> > but deep, but likely worth exploring.<br>
> ><br>
> ><br>
> > i'm saying any ABI change for GHC 7.10,<br>
> would likely entail<br>
> > patching LLVM 3.4, because thats the only<br>
> LLVM version likely<br>
> > to come out between now and whenever we get<br>
> 7.10 out (assuming<br>
> > 7.10 lands within the next 8-12 months,<br>
> which is reasonable<br>
> > since we've got noticeably more (amazing)<br>
> people helping out<br>
> > lately). Thus, any change there entails<br>
> either asking the llvm<br>
> > folks to support >1 GHC convention per<br>
> architecture, or<br>
> > replace the current one! I'd rather do the<br>
> latter than the<br>
> > former, when it comes to asking other people<br>
> to maintain it :)<br>
> > (and llvm engineers do in fact help out<br>
> maintaining that code)<br>
> ><br>
> ><br>
> > have you run a Nofib, or even benchmarks<br>
> restricted to your<br>
> > multivector code, for the current calling<br>
> convention<br>
> > (including the spilling AVX vectors to the<br>
> stack thats the<br>
> > current plan i gather) VS passing in<br>
> registers with an LLVM<br>
> > built using the patches i worked out ~ 2<br>
> months ago? it'd be<br>
> > really easy to build that custom llvm, then<br>
> run the<br>
> > benchmarks! (i'm happy to help, and<br>
> ultimately, benchmarks<br>
> > will reveal if its worth while or not! And<br>
> if the main goal is<br>
> > for your talk, its still valid even if its<br>
> not in the merge<br>
> > window over the next 4 days).<br>
> ><br>
> > I really think its not obvious what the<br>
> "best" abi<br>
> > change would be! It really will require<br>
> coming up with a list<br>
> > of variants, implementing them, and running<br>
> nofib with each<br>
> > variant, which i lack the compute/human time<br>
> resources to do<br>
> > this week. Modern hardware is complex enough<br>
> that for<br>
> > something like an ABI change, the only<br>
> healthy attitude can be<br>
> > "lets benchmark it!".<br>
> ><br>
> > i'd really like any change in calling<br>
> convention to also<br>
> > improve perf on codes that aren't explicitly<br>
> simd! (and a<br>
> > conservative simd only change,<br>
> blocks/conflicts with that<br>
> > augmentation going forward, and not just for<br>
> the stack pointer<br>
> > example i mention early)<br>
> ><br>
> > Not just scalar floats in simd registers ,<br>
> but perhaps also<br>
> > words/ints !<br>
> ><br>
> > (though that latter bit might be pretty<br>
> ambitious and subtle,<br>
> > i'll need to investigate that a bit to see<br>
> how feasible it may<br>
> > be).<br>
> > SIMD has great support for ints/words, and<br>
> any partial abi<br>
> > change on the llvm backend now would make it<br>
> hard to support<br>
> > that later well (or at least, thats what it<br>
> looks like to me).<br>
> > actually effectively using simd for scalar ints and words<br>
> > should be doable, but might force us to be a bit more<br>
> > thoughtful on how GHC internally distinguishes ints used for<br>
> > address arithmetic, vs ints used as data. (interestingly, i'm<br>
> > not sure if any current extant x86 calling convention does that!)<br>
> ><br>
> ><br>
> > That single change would make 7.10<br>
> require a completely<br>
> > different llvm and native code gen<br>
> convention from our current<br>
> > one, plus touch all of the code gen on x86<br>
> architectures.<br>
> ><br>
> ><br>
> > basically: we're lucky that everyone builds haskell code from<br>
> > source, so ABI compat across GHC versions is a non issue. BUT,<br>
> > any ABI changes should be backed by benchmarks (at least when<br>
> > the change is performance motivated). Likewise, because we use<br>
> > LLVM as an external dep for the -fllvm backend, we really need<br>
> > to keep in mind how their release cycle interacts with our release<br>
> > cycle, because people use haskell and ghc! which as many like<br>
> > to say, is both a boon and a pain ;).<br>
> ><br>
> > Having people hit ghc acting broken with an llvm that was<br>
> > "supported before" is a risky support problem to deal with.<br>
> > having an LLVM head variant support a modified ABI, and then<br>
> > later needing to break it for 7.10 (for one of the possible<br>
> > exploratory reasons above) would lead to a support headache I<br>
> > don't wish on anyone.<br>
> ><br>
> > pardon the verbose answer, but thats my<br>
> offhand take<br>
> ><br>
> > cheers<br>
> > -Carter<br>
> ><br>
> ><br>
> > On Wed, Sep 11, 2013 at 10:10 PM, Geoffrey<br>
> Mainland<br>
> > <<a href="mailto:mainland@apeiron.net">mainland@apeiron.net</a>> wrote:<br>
</div></div><div class="HOEnZb"><div class="h5">
> ><br>
> > We support compiling some code with -fllvm and some not in the same<br>
> > executable. Otherwise how could users of the Haskell Platform link their<br>
> > -fllvm-compiled code with native-codegen-compiled<br>
> > libraries like base, etc.?<br>
> ><br>
> > In other words, the LLVM and native back ends use the same calling<br>
> > convention. With my SIMD work, they still use the same calling<br>
> > conventions, but the native codegen can never generate code that uses<br>
> > SIMD instructions.<br>
> ><br>
> > Geoff<br>
> ><br>
> > On 09/11/2013 10:03 PM, Johan Tibell wrote:<br>
> > > OK. But that doesn't create a problem<br>
> for the code we<br>
> > output with the<br>
> > > LLVM backend, no? Or do we support<br>
> compiling some code<br>
> > with -fllvm and<br>
> > > some not in the same executable?<br>
> > ><br>
> > ><br>
> > > On Wed, Sep 11, 2013 at 6:56 PM,<br>
> Geoffrey Mainland<br>
> > > <<a href="mailto:mainland@apeiron.net">mainland@apeiron.net</a>> wrote:<br>
</div></div><div class="HOEnZb"><div class="h5">
> > ><br>
> > > We definitely have interop between<br>
> the native<br>
> > codegen and the LLVM<br>
> > > back<br>
> > > end now. Otherwise anyone who<br>
> wanted to use the LLVM<br>
> > back end<br>
> > > would have<br>
> > > to build GHC themselves. Interop<br>
> means that users<br>
> > can install the<br>
> > > Haskell Platform and still use<br>
> -fllvm when it makes<br>
> > a performance<br>
> > > difference.<br>
> > ><br>
> > > Geoff<br>
> > ><br>
> > > On 09/11/2013 07:59 PM, Johan<br>
> Tibell wrote:<br>
> > > > Do nothing different than you're<br>
> doing for 7.8, we<br>
> > can sort it out<br>
> > > > later. Just put a comment on the<br>
> primops saying<br>
> > they're<br>
> > > LLVM-only. See<br>
> > > > e.g.<br>
> > > ><br>
> > > ><br>
> > > ><br>
> > ><br>
> ><br>
> <a href="https://github.com/ghc/ghc/blob/master/compiler/prelude/primops.txt.pp#L181" target="_blank">https://github.com/ghc/ghc/blob/master/compiler/prelude/primops.txt.pp#L181</a><br>
> > > ><br>
> > > > for an example how to add docs<br>
> to primops.<br>
> > > ><br>
> > > > I don't think we need interop<br>
> between the native<br>
> > and the LLVM<br>
> > > > backends. We don't have that now<br>
> do we (i.e. they<br>
> > use different<br>
> > > > calling conventions).<br>
> > > ><br>
> > > ><br>
> > > ><br>
> > > > On Wed, Sep 11, 2013 at 4:51 PM,<br>
> Geoffrey Mainland<br>
> > > > <<a href="mailto:mainland@apeiron.net">mainland@apeiron.net</a>> wrote:<br>
> > > ><br>
> > > > On 09/11/2013 07:44 PM,<br>
> Johan Tibell wrote:<br>
> > > > > On Wed, Sep 11, 2013 at<br>
> 4:40 PM, Geoffrey<br>
> > Mainland<br>
> > > > <<a href="mailto:mainland@apeiron.net">mainland@apeiron.net</a>> wrote:<br>
> > > > > > Do you mean we need a<br>
> reasonable emulation<br>
> > of the SIMD<br>
> > > primops for<br>
> > > > > > the native codegen?<br>
> > > > ><br>
> > > > > Yes. Reasonable in the<br>
> sense that it<br>
> > computes the right<br>
> > > result.<br>
> > > > I can<br>
> > > > > see that some code might<br>
> still want to<br>
> > #ifdef (if the<br>
> > > fallback isn't<br>
> > > > > fast enough).<br>
> > > ><br>
> > > > Two implications of this requirement:<br>
> > > ><br>
> > > > 1) There will not be SIMD in 7.8. I just don't have the time. In fact,<br>
> > > > what SIMD support is there already will have to be removed if we cannot<br>
> > > > live with LLVM-only SIMD primops.<br>
> > > ><br>
> > > > 2) If we also require interop between the LLVM back-end and the native<br>
> > > > codegen, then we cannot pass any SIMD vectors in registers---they all<br>
> > > > must be passed on the stack.<br>
> > > ><br>
> > > > My plan, as discussed with Simon PJ, is to not support SIMD primops at<br>
> > > > all with the native codegen. If there is a strong feeling that this *is<br>
> > > > not* the way to go, then I need to know ASAP.<br>
> > > ><br>
> > > > Geoff<br>
> > > ><br>
> > > ><br>
> > > ><br>
> > ><br>
> > ><br>
> ><br>
> ><br>
> ><br>
> ><br>
><br>
><br>
><br>
><br>
><br>
<br>
</div></div></blockquote></div><br></div>