From jpm at cs.uu.nl Fri Aug 1 02:36:37 2008
From: jpm at cs.uu.nl (José Pedro Magalhães)
Date: Fri Aug 1 02:36:32 2008
Subject: [Hs-Generics] Re: Owning SYB
In-Reply-To: <52f14b210807312328y39ff1ee3m725f479853d06b28@mail.gmail.com>
References: <00ca01c8eb33$ed05d120$75097ad5@cr3lt>
 <638ABD0A29C8884A91BC5FB5C349B1C32AE76B0E25@EA-EXMSG-C334.europe.corp.microsoft.com>
 <017701c8f0db$fc721c60$3d058351@cr3lt>
 <404396ef0807281450h435faeddv310f1ff984a8b0d7@mail.gmail.com>
 <035901c8f1db$34673510$71338351@cr3lt>
 <52f14b210807312328y39ff1ee3m725f479853d06b28@mail.gmail.com>
Message-ID: <52f14b210807312336t7dd9f8f6p33b6118124b6fa5b@mail.gmail.com>

Hello all,

As Johan mentioned, here in Utrecht we are working on libraries for
generic programming. We want to make it easier for people to use generic
libraries, so we are packaging EMGM [1] and a library for generic
programming for mutually recursive datatypes [2]. We intend to release
these on Hackage soon (summer vacations are delaying us a bit), along
with useful generic applications (a zipper and a generic rewriting
framework).

Maintaining SYB fits well in this idea, and if no other natural
maintainers volunteer, I (with some support from the other people at
Utrecht) am happy to take it on. I probably won't do heavy development on
the library, but including patches and providing support is fine. We're
also planning to maintain EMGM here in Utrecht, although we didn't
develop that ourselves.

Recently, (at least) Claus and Oleg have been posting interesting
suggestions of improvements/modifications to SYB. Those should be further
analyzed and discussed, and finally introduced (or not) in the library.
The generic map for SYB, for instance, evolved from the "impossible to
implement", through the "unsafe implementation", until the latest gmap2
as described by Oleg [3]. If further tests show this function behaves as
expected, then it's clearly a good candidate for extending SYB.
We should also rethink if other things previously deemed impossible
remain so.

Maintaining SYB, alongside the other generic libraries, will require
things such as:
* Releasing packages in Hackage, properly documented with Haddock;
* Updating such packages as necessary for new releases of GHC;
* Writing examples of how to use the libraries (from a user perspective);
* Writing testsuites, which are important for checking backwards
  compatibility of any changes;
* Having an updated webpage linking to the library sources, documentation,
  possibly a bug tracker, etc.
These are all things we plan to do for the libraries.

Additionally, we could think of improving syb-with-class [4] in parallel
with regular SYB. This is something to ask its maintainer.

Cheers,
Pedro

[1] http://books.google.com/books?id=OyY3ioMJRAsC&pg=PA199&sig=ACfU3U1nczeRAIjN9mc_vYnL1LnYAs70NA
[2] http://www.cs.uu.nl/research/techreps/UU-CS-2008-019.html
[3] http://www.haskell.org/pipermail/generics/2008-July/000362.html
[4] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/syb-with-class

From wnoise at ofb.net Fri Aug 1 05:00:41 2008
From: wnoise at ofb.net (Aaron Denney)
Date: Fri Aug 1 05:00:54 2008
Subject: Unordered map
References: <404396ef0807291105p20adf730jdd776d3b82a97867@mail.gmail.com>
 <488F7CE5.7090300@gmail.com>
Message-ID:

On 2008-07-29, Andrew Wagner wrote:
> Well, it wouldn't need to literally implement Ord. It would just need
> to do the same operations as Data.Map does, except without the
> efficiency you get from Data.Map because you can assume an Ord
> instance. One way to do it is to simply internally store the data as
> an ordered [(a,k)], for example. Not efficient, but you get the same
> interface as Map.

Yes, but then you need to check that you have the right one, which means
you need the context "Eq a". If you're going to make users write an
equality function, making them write an ordering adds little effort, and
a reasonable amount of gain. Usually.

--
Aaron Denney
-><-

From waldmann at imn.htwk-leipzig.de Fri Aug 1 06:18:30 2008
From: waldmann at imn.htwk-leipzig.de (Johannes Waldmann)
Date: Fri Aug 1 06:19:33 2008
Subject: Unordered map
In-Reply-To:
References: <404396ef0807291105p20adf730jdd776d3b82a97867@mail.gmail.com>
 <488F7CE5.7090300@gmail.com>
Message-ID: <4892E2F6.1080002@imn.htwk-leipzig.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Denney wrote:

> If you're going to make users write an
> equality function, making them write an ordering adds little effort, and
> a reasonable amount of gain. Usually.

Then why is there a distinction between e.g.
Map and SortedMap (and Set and SortedSet) in the Java libraries?

Yes yes I know Haskell is not Java etc. but they must have given
this some thought. (Of course them making everything an instance of
Eq and Hash is a design error but that's not the point here.)

The practical problem with Haskell's type class mechanism
is that all instances (of Eq, Ord) are global,
so if one library (Data.Map) requires them,
then you're stuck with these instances for all of your program.
Of course the same thing holds for Java's "implements Comparable<>"
but they have local types. Well, Haskell has newtype,
but that's (module-)global.

Best regards, J.W.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iEYEARECAAYFAkiS4vYACgkQDqiTJ5Q4dm+GAACfZ6sedaApVtDaErb/1A0IL650
e80AoKnIQQfvdOhBUgtct7WqkVxtg4ps
=AV1r
-----END PGP SIGNATURE-----

From wnoise at ofb.net Fri Aug 1 07:13:34 2008
From: wnoise at ofb.net (Aaron Denney)
Date: Fri Aug 1 07:13:36 2008
Subject: Unordered map
References: <404396ef0807291105p20adf730jdd776d3b82a97867@mail.gmail.com>
 <488F7CE5.7090300@gmail.com>
 <4892E2F6.1080002@imn.htwk-leipzig.de>
Message-ID:

On 2008-08-01, Johannes Waldmann wrote:
> Aaron Denney wrote:
>
>> If you're going to make users write an
>> equality function, making them write an ordering adds little effort, and
>> a reasonable amount of gain. Usually.
>
> Then why is there a distinction between e.g.
> Map and SortedMap (and Set and SortedSet) in the Java libraries?
>
> Yes yes I know Haskell is not Java etc. but they must have given
> this some thought. (Of course them making everything an instance of
> Eq and Hash is a design error but that's not the point here.)

You would have to ask the Java designers.

> The practical problem with Haskell's type class mechanism
> is that all instances (of Eq, Ord) are global,

This is a feature, not a bug. It helps ensure that manipulations on maps
will always use compatible functions. If, instead, you constructed maps
by passing in a comparison function, what happens when you merge two
maps? Which function gets used? Normally you would be able to re-use the
structure of each map in combining them. But if the functions they used
are different, then they have to be re-sorted according to the kept
function. Or, you have an obscure, difficult to track down bug, rather
than an error at compile time.

> so if one library (Data.Map) requires them,
> then you're stuck with these instances for all of your program.
> Of course the same thing holds for Java's "implements Comparable<>"
> but they have local types.
Well, Haskell has newtype, > but that's (module-)global. They don't have local types. They have inner classes, and objects get passed. Newtypes do work. The newtype deriving extension works even better. If something has multiple ways of comparing, then the context in which they are used should be tagged differently. Haskell types are cheap, both in the compiler, and when typing, because it's nowhere near as verbose as Java. -- Aaron Denney -><- From waldmann at imn.htwk-leipzig.de Fri Aug 1 07:26:54 2008 From: waldmann at imn.htwk-leipzig.de (Johannes Waldmann) Date: Fri Aug 1 07:27:55 2008 Subject: Unordered map In-Reply-To: References: <404396ef0807291105p20adf730jdd776d3b82a97867@mail.gmail.com> <488F7CE5.7090300@gmail.com> <4892E2F6.1080002@imn.htwk-leipzig.de> Message-ID: <4892F2FE.5040508@imn.htwk-leipzig.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > They [Java] don't have local types. > They have inner classes, and objects get passed. The net effect is that I can make an inner class that implements some interface, and in the implementation I can refer to things defined in some enclosing scope (not just in the global scope). Sure, I can only refer to static things, but in Haskell everything is static, no? > Newtypes do work. I agree, to some extent. Using a newtype to simulate the above is like lambda-lifting: you have to add the information on the local things you want to use. The Java compiler does that for you. Best regards, J.W. 
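
For readers of the archive, the newtype approach discussed above can be sketched as follows. This is only a minimal illustration, not code from the thread: the type CaseInsensitive and the helper normalise are hypothetical names, and case-insensitive comparison is just one assumed example of "a second way of comparing" the same underlying type.

```haskell
module Main where

import Data.Char (toLower)
import qualified Data.Map as Map
import Data.Ord (comparing)

-- Hypothetical tag type: same representation as String, but compared
-- without regard to case. The global Ord String instance is untouched.
newtype CaseInsensitive = CaseInsensitive String
  deriving Show

-- Key used for both equality and ordering, so the two instances agree.
normalise :: CaseInsensitive -> String
normalise (CaseInsensitive s) = map toLower s

instance Eq CaseInsensitive where
  a == b = normalise a == normalise b

instance Ord CaseInsensitive where
  compare = comparing normalise

main :: IO ()
main = do
  -- Plain String keys use the global, case-sensitive instance.
  let plain = Map.fromList [("Foo", 1 :: Int), ("foo", 2)]
  -- Wrapped keys pick up the case-insensitive instance instead.
  let tagged =
        Map.fromList
          [(CaseInsensitive "Foo", 1 :: Int), (CaseInsensitive "foo", 2)]
  print (Map.size plain)   -- 2
  print (Map.size tagged)  -- 1
```

Because the two maps have different key types, an attempt to merge them with Map.union is rejected by the type checker, which is the compile-time error (rather than a runtime surprise) that Aaron describes.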
-----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iEYEARECAAYFAkiS8v4ACgkQDqiTJ5Q4dm+rvwCeN+R7aRh4EwBXIKlf3Mhc9nc5 wRAAoKc+SlkCkEaybN6jIBHDSTu8J/yC =8Ndx -----END PGP SIGNATURE----- From jmaessen at alum.mit.edu Fri Aug 1 08:19:58 2008 From: jmaessen at alum.mit.edu (Jan-Willem Maessen) Date: Fri Aug 1 08:19:55 2008 Subject: Unordered map In-Reply-To: <4892E2F6.1080002@imn.htwk-leipzig.de> References: <404396ef0807291105p20adf730jdd776d3b82a97867@mail.gmail.com> <488F7CE5.7090300@gmail.com> <4892E2F6.1080002@imn.htwk-leipzig.de> Message-ID: On Aug 1, 2008, at 6:18 AM, Johannes Waldmann wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Aaron Denney wrote: > >> If you're going to make users write an >> equality function, making them write an ordering adds little >> effort, and >> a reasonable amount of gain. Usually. > > Then why is there a distinction between e.g. > Map and SortedMap (and Set and SortedSet) in the Java libraries? > > Yes yes I know Haskell is not Java etc. but they must have given > this some thought. (Of course them making everything an instance of > Eq and Hash is a design error but that's not the point here.) Au contraire, it's *exactly* the point! Java uses the hash code to implement collections that only require equality and hashing, but no ordering. Haskell, as a functional language, instead prefers equality and ordering---because trees admit efficient pure update, whereas hash tables generally do not. Two different languages, two different approaches to implementing collections---one more imperative, the other more functional. Of course, if one is simply looking for INefficient collections that require only (Eq a), this probably doesn't matter. But in that case it's hard to do much better than [(a,b)] and lookup (it's possible to do better, using e.g. 
unboxed tuple arrays and seekrit mutation, but probably only worth it for stuff that's big enough that we should have been using a proper map anyway). -Jan-Willem Maessen From claus.reinke at talk21.com Fri Aug 1 10:02:02 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Fri Aug 1 10:02:06 2008 Subject: [Hs-Generics] Re: Owning SYB References: <00ca01c8eb33$ed05d120$75097ad5@cr3lt><638ABD0A29C8884A91BC5FB5C349B1C32AE76B0E25@EA-EXMSG-C334.europe.corp.microsoft.com><017701c8f0db$fc721c60$3d058351@cr3lt><404396ef0807281450h435faeddv310f1ff984a8b0d7@mail.gmail.com><035901c8f1db$34673510$71338351@cr3lt><52f14b210807312328y39ff1ee3m725f479853d06b28@mail.gmail.com> <52f14b210807312336t7dd9f8f6p33b6118124b6fa5b@mail.gmail.com> Message-ID: <02a201c8f3df$2da1e3a0$e6327ad5@cr3lt> Hi Pedro, and thanks for volunteering! I include a summary of where I'm at, for your information (and that of other interested readers;-) > Recently, (at least) Claus and Oleg have been posting interesting > suggestions of improvements/modifications to SYB. Those should be further > analyzed and discussed, and finally introduced (or not) in the library. The > generic map for SYB, for instance, evolved from the "impossible to > implement", through the "unsafe implementation", until the latest gmap2 as > described by Oleg [3]. If further tests show this function behaves as > expected, then it's clearly a good candidate for extending SYB. We should > also rethink if other things previously deemed impossible remain so. Further discussion welcome, of course!-) And it isn't just about getting fmap/traverse/.. from Data/Typeable, but also about reorganizing imports of Data instances, and providing alternatives of everywhere/everything that avoid traversals of irrelevant subterms. 
A current snapshot of my code can be found here: http://www.cs.kent.ac.uk/~cr3/tmp/syb/syb-utils-0.0.2008.7.31.tar.gz It currently contains: a) a suggested split for the Instances module, with alternatives to Data.Generics that either export only Standard instances, or no instances at all. On top of the existing Syb in base, these are somewhat tricky to use, because of the implicit re-exports of Data.Generics.Instances from other libraries http://www.haskell.org/pipermail/generics/2008-July/000371.html and the long-standing GHC "session accumulates instances" bug http://hackage.haskell.org/trac/ghc/ticket/2182 http://www.haskell.org/ghc/docs/latest/html/users_guide/bugs.html#bugs-ghc relevant modules: Data.Generics.Alt, Data.Generics.NoInstances, Data.Generics.Instances.Standard, Data.Generics.Instances.Dubious, examples\Examples.hs status: I think this change should go into base (perhaps renaming Data.Generics.Alt to Data.Generics.Standard, and deprecating Data.Generics and Data.Generics.Instances), for all the reasons discussed here recently, and because that GHC bug makes it near impossible to provide this change as an addon to base. Existing importers of Data.Generics or Data.Generics.Instances in core and extralibs should be redirected to one of the new modules. b) variants of everywhere/everything/mkQ/.. that retain the type domain of generically extended queries, build substructure type maps of types to be traversed, and use a slight generalisation of the Uniplate PlateData trick to avoid traversals of irrelevant substructures (usually but not always including, and not limited to, Strings) relevant modules: Data.Generics.GPS, examples\GPSbenchmark.hs, examples\CompanyDatatypes.hs status: This could be provided as an addon package. Performance of generic queries and transformation on the usual Paradise benchmark improve as expected; there is some overhead, which is visible in an alternative benchmark where there are no irrelevant substructures. 
The current code also tries to replace the linear search for
applicable specific-type-queries with Map lookup, but here
the overhead seems to outweigh the benefits, so I'll probably
disable this code in future.

One issue that I still need to address is nested types: just
as Data.Generics.PlateData, Data.Generics.GPS currently
fails for these (no finite representation of substructure types);
current plan is to recognize nested types and to fall back to
unoptimized traversals.

c) generic default instance methods for Control.Monad.Functor
and Data.Traversable

relevant modules:
Data.Generics.Utils,
examples\Examples.hs

status:
under discussion. could become part of that add-on package,
or move into base.

d) some Data/Typeable instances and utilities for GHC Api

relevant modules:
GHC.Syb.Instances,
GHC.Syb.Instances0,
GHC.Syb.Utils

status:
needs more testing, will probably be rendered obsolete when
GHC Api starts providing those instances itself.

> Maintaining SYB, alongside with the other generic libraries, will require
> things such as:
> * Releasing packages in Hackage, properly documented with Haddock;
> * Updating such packages as necessary for new releases of GHC;
> * Writing examples of how to use the libraries (from a user perspective);
> * Writing testsuites, which are important for checking backwards
> compatibility of any changes;
> * Having an updated webpage linking to the library sources, documentation,
> possibly a bug tracker, etc.
> These are all things we plan to do for the libraries.
>
> Additionally, we could think of improving syb-with-class [4] in parallel
> with regular SYB. This is something to ask to its maintainer.
>
>
> Cheers,
> Pedro
>
> [1]
> http://books.google.com/books?id=OyY3ioMJRAsC&pg=PA199&sig=ACfU3U1nczeRAIjN9mc_vYnL1LnYAs70NA
> [2] http://www.cs.uu.nl/research/techreps/UU-CS-2008-019.html
> [3] http://www.haskell.org/pipermail/generics/2008-July/000362.html
> [4]
> http://hackage.haskell.org/cgi-bin/hackage-scripts/package/syb-with-class
> --------------------------------------------------------------------------------
> _______________________________________________
> Generics mailing list
> Generics@haskell.org
> http://www.haskell.org/mailman/listinfo/generics
>

From claus.reinke at talk21.com Fri Aug 1 10:08:38 2008
From: claus.reinke at talk21.com (Claus Reinke)
Date: Fri Aug 1 10:08:41 2008
Subject: [Hs-Generics] Re: Owning SYB
References: <00ca01c8eb33$ed05d120$75097ad5@cr3lt><638ABD0A29C8884A91BC5FB5C349B1C32AE76B0E25@EA-EXMSG-C334.europe.corp.microsoft.com><017701c8f0db$fc721c60$3d058351@cr3lt><404396ef0807281450h435faeddv310f1ff984a8b0d7@mail.gmail.com><035901c8f1db$34673510$71338351@cr3lt><52f14b210807312328y39ff1ee3m725f479853d06b28@mail.gmail.com><52f14b210807312336t7dd9f8f6p33b6118124b6fa5b@mail.gmail.com>
 <02a201c8f3df$2da1e3a0$e6327ad5@cr3lt>
Message-ID: <02a801c8f3e0$198946f0$e6327ad5@cr3lt>

oops, keybinding malfunction - that message went out unfinished.
Seems to have most parts, though, so I leave it at that.

One other thing I meant to ask was about procedure, given that
Syb is currently in base and hence under the library modification
process. How is this going to combine with an active maintainer
and some parts on hackage?

Claus

From alfonso.acosta at gmail.com Fri Aug 1 10:13:35 2008
From: alfonso.acosta at gmail.com (Alfonso Acosta)
Date: Fri Aug 1 10:13:31 2008
Subject: runCommand/waitForProcess don't respect text printing order when stdout is redirected
Message-ID: <6a7c66fc0808010713q1df4d9a3j95f492f563634376@mail.gmail.com>

Hi all,

I'm running ghc 6.8.2 in OSX and Linux.

The following program behaves as expected when run on a terminal.

==
module Main where

import System.Process

-- Print "foo", run a child process that prints "echo", print "bar";
-- repeat three times.
main = sequence $ replicate 3 command
 where command = do putStrLn "foo"
                    waitForProcess =<< runCommand "echo echo"
                    putStrLn "bar"
==

$ ghc --make Main.hs -o main
$ ./main
foo
echo
bar
foo
echo
bar
foo
echo
bar

However, when stdout is redirected to a file, the order is no longer
respected:

$ ./main > output
$ cat output
echo
echo
echo
foo
bar
foo
bar
foo
bar

Am I missing something or should I file a bug report?

Thanks in advance,

Fons

From ross at soi.city.ac.uk Fri Aug 1 10:16:04 2008
From: ross at soi.city.ac.uk (ross@soi.city.ac.uk)
Date: Fri Aug 1 10:16:03 2008
Subject: proposal #2461: add Traversable generalizations of mapAccumL and mapAccumR
In-Reply-To: <1217533519-sup-8322@ausone.local>
References: <20080722161248.GA5027@soi.city.ac.uk>
 <49a77b7a0807221849u2e40a238wb285e70d2cddaff3@mail.gmail.com>
 <49a77b7a0807272005t41443a9fhed2111d598d68b61@mail.gmail.com>
 <98E9D03A-EACC-46CB-BEC0-7D92D2868079@strictlypositive.org>
 <1217533519-sup-8322@ausone.local>
Message-ID: <1217600164.48931aa440563@fred.soi.city.ac.uk>

Quoting Nicolas Pouillard :

> I've also understood "<**> = flip <*>" when reading the docs :(

Clearly the docs need improving. Does anyone have any comments on the
proposal itself?

From alfonso.acosta at gmail.com Fri Aug 1 10:20:45 2008
From: alfonso.acosta at gmail.com (Alfonso Acosta)
Date: Fri Aug 1 10:20:42 2008
Subject: runCommand/waitForProcess don't respect text printing order when stdout is redirected
In-Reply-To: <6a7c66fc0808010713q1df4d9a3j95f492f563634376@mail.gmail.com>
References: <6a7c66fc0808010713q1df4d9a3j95f492f563634376@mail.gmail.com>
Message-ID: <6a7c66fc0808010720s7ffaa9aeta1eefa98fd7f9314@mail.gmail.com>

In case it helps, I have just confirmed the same behaviour under Windows.

From Alistair.Bayley at invesco.com Fri Aug 1 10:25:31 2008
From: Alistair.Bayley at invesco.com (Bayley, Alistair)
Date: Fri Aug 1 10:25:45 2008
Subject: runCommand/waitForProcess don't respect text printing order when stdout is redirected
In-Reply-To: <6a7c66fc0808010713q1df4d9a3j95f492f563634376@mail.gmail.com>
References: <6a7c66fc0808010713q1df4d9a3j95f492f563634376@mail.gmail.com>
Message-ID: <125EACD0CAE4D24ABDB4D148C4593DA9049E94F8@GBLONXMB02.corp.amvescap.net>

> From: libraries-bounces@haskell.org
> [mailto:libraries-bounces@haskell.org] On Behalf Of Alfonso Acosta
>
> The following program behaves as expected when run on a terminal.
>
> However, when stdout is redirected to a file, the order is no
> longer respected:

Possibly buffering? I think the terminal has line buffering by default,
whereas files are usually block-buffered. Try changing the buffering to
line:

hSetBuffering stdout LineBuffering

Alistair

From alfonso.acosta at gmail.com Fri Aug 1 10:32:36 2008
From: alfonso.acosta at gmail.com (Alfonso Acosta)
Date: Fri Aug 1 10:32:31 2008
Subject: runCommand/waitForProcess don't respect text printing order when stdout is redirected
In-Reply-To: <125EACD0CAE4D24ABDB4D148C4593DA9049E94F8@GBLONXMB02.corp.amvescap.net>
References: <6a7c66fc0808010713q1df4d9a3j95f492f563634376@mail.gmail.com>
 <125EACD0CAE4D24ABDB4D148C4593DA9049E94F8@GBLONXMB02.corp.amvescap.net>
Message-ID: <6a7c66fc0808010732t3d35a0afxcb56a3590799205b@mail.gmail.com>

On Fri, Aug 1, 2008 at 4:25 PM, Bayley, Alistair wrote:
> Possibly buffering? I think the terminal has line buffering by default,
> whereas files are usually block-buffered. Try changing the buffering to
> line:
>
> hSetBuffering stdout LineBuffering

Thanks! That worked. Sorry for the noise. I assumed that files, just
like terminals, were line-buffered.

From waldmann at imn.htwk-leipzig.de Fri Aug 1 11:51:53 2008
From: waldmann at imn.htwk-leipzig.de (Johannes Waldmann)
Date: Fri Aug 1 11:52:56 2008
Subject: runCommand/waitForProcess don't respect text printing order when stdout is redirected
In-Reply-To: <6a7c66fc0808010732t3d35a0afxcb56a3590799205b@mail.gmail.com>
References: <6a7c66fc0808010713q1df4d9a3j95f492f563634376@mail.gmail.com>
 <125EACD0CAE4D24ABDB4D148C4593DA9049E94F8@GBLONXMB02.corp.amvescap.net>
 <6a7c66fc0808010732t3d35a0afxcb56a3590799205b@mail.gmail.com>
Message-ID: <48933119.3020903@imn.htwk-leipzig.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>> hSetBuffering stdout LineBuffering

I run into this from time to time as well.
I think the GHC runtime's behaviour is confusing here,
as it tries to be too clever. I don't think it should
even try to find out whether it's writing to terminal or file.

Best regards, J.W.
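
For reference, the fix suggested in this thread can be sketched as below. This is a minimal variant of the original test program, assuming the process package is available; the explicit type signatures and the use of replicateM_ are additions for illustration, not part of the posted code.

```haskell
module Main where

import Control.Monad (replicateM_)
import System.Exit (ExitCode (..))
import System.IO (BufferMode (LineBuffering), hSetBuffering, stdout)
import System.Process (runCommand, waitForProcess)

-- One round of the test: our own output, then a child process that
-- writes to the same stdout, then our own output again.
command :: IO ExitCode
command = do
  putStrLn "foo"
  code <- waitForProcess =<< runCommand "echo echo"
  putStrLn "bar"
  return code

main :: IO ()
main = do
  -- Flush after every newline even when stdout is a file or a pipe,
  -- so "foo" is written out before the child process writes "echo".
  hSetBuffering stdout LineBuffering
  replicateM_ 3 command
```

An alternative with the same effect, under the same assumptions, is to keep the default buffering and call hFlush stdout immediately before runCommand.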
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iEYEARECAAYFAkiTMRkACgkQDqiTJ5Q4dm8MRACfev267AhVPuxkDtCNP6Wz0G/n
CLMAn1RqRS2qRtlDnnzneF2M9kr2QoiR
=pi+1
-----END PGP SIGNATURE-----

From waldmann at imn.htwk-leipzig.de Fri Aug 1 12:11:53 2008
From: waldmann at imn.htwk-leipzig.de (Johannes Waldmann)
Date: Fri Aug 1 12:13:26 2008
Subject: Unordered map
In-Reply-To:
References: <404396ef0807291105p20adf730jdd776d3b82a97867@mail.gmail.com>
 <488F7CE5.7090300@gmail.com>
 <4892E2F6.1080002@imn.htwk-leipzig.de>
Message-ID: <489335C9.2000709@imn.htwk-leipzig.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> [...] Two different languages, two different
> approaches to implementing collections---one more imperative, the other
> more functional.

It's "easier" to implement Hash than Ord,
because for Ord, you need transitivity,
while a hash function just needs to be a function.
(Difficult in Java, impossible to get wrong in Haskell.)
If you take a bad hash function,
you're "only" hurting performance, not correctness.

In fact I sometimes make a wrapper type, something like

Wrap { hash = hash x, contents = x } deriving ( Eq, Ord )

(hoping for the left-to-right comparison)
and then use Data.Map.

Best regards, J.W.
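
The wrapper idea above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the author's code: hashString, wrap, insertW and lookupW are hypothetical helpers, and the hash function is an arbitrary toy. Note that the derived Ord instance still needs an Ord instance for the contents; the cached hash only makes most comparisons cheap, since derived instances compare fields left to right.

```haskell
module Main where

import Data.Char (ord)
import Data.List (foldl')
import qualified Data.Map as Map

-- The cached hash is the first field, so the derived Ord instance
-- compares hashes first and only looks at the contents on a tie.
data Wrap a = Wrap { hash :: !Int, contents :: a }
  deriving (Eq, Ord, Show)

-- Toy string hash (an assumption for this sketch). Any function works;
-- a poor one only costs performance, never correctness.
hashString :: String -> Int
hashString = foldl' (\h c -> (31 * h + ord c) `mod` 1000003) 7

wrap :: String -> Wrap String
wrap s = Wrap { hash = hashString s, contents = s }

type Table v = Map.Map (Wrap String) v

insertW :: String -> v -> Table v -> Table v
insertW k = Map.insert (wrap k)

lookupW :: String -> Table v -> Maybe v
lookupW k = Map.lookup (wrap k)

main :: IO ()
main = do
  let t = insertW "foo" (1 :: Int) (insertW "bar" 2 Map.empty)
  print (lookupW "foo" t)  -- Just 1
  print (lookupW "baz" t)  -- Nothing
```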
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iEYEARECAAYFAkiTNckACgkQDqiTJ5Q4dm+FXgCeNLHcnFHCB+Bq9xJyv+qY1UE+
P+4Anja7agfdrTpDHcN9GAT3hkzav7jc
=Jv1z
-----END PGP SIGNATURE-----

From isaacdupree at charter.net Fri Aug 1 13:16:09 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Fri Aug 1 13:16:03 2008
Subject: proposal #2461: add Traversable generalizations of mapAccumL and mapAccumR
In-Reply-To: <20080722161248.GA5027@soi.city.ac.uk>
References: <20080722161248.GA5027@soi.city.ac.uk>
Message-ID: <489344D9.6070203@charter.net>

Ross Paterson wrote:
> The proposal is to add the following functions to Data.Traversable,
> mapAccumL :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
> mapAccumR :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)

It is useful as an educational effort so that people can see
how/which functions can be naturally generalized -- or if
converting code from using Lists. (Probably it's useful in
its own right, but I haven't used Traversable recently
enough to be able to report on that.) I often thought that
even Data.List.mapAccum[LR] is just on the edge of deserving
to be its own function separate from foldl/foldr anyway...

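
To illustrate what the proposed generalization allows, here is a small sketch using mapAccumL and mapAccumR as they are exported by Data.Traversable in current versions of base. The helper names runningTotal and indexFromRight are hypothetical, chosen only for this example.

```haskell
module Main where

import qualified Data.Map as Map
import Data.Traversable (mapAccumL, mapAccumR)

-- Thread a running total through any Traversable container, replacing
-- each element with the total accumulated so far (inclusive).
runningTotal :: Traversable t => t Int -> (Int, t Int)
runningTotal = mapAccumL (\acc x -> let acc' = acc + x in (acc', acc')) 0

-- Number the elements starting from the right-hand end instead.
indexFromRight :: Traversable t => t a -> (Int, t (Int, a))
indexFromRight = mapAccumR (\n x -> (n + 1, (n, x))) 0

main :: IO ()
main = do
  print (runningTotal [1, 2, 3, 4])
  -- (10,[1,3,6,10])
  print (runningTotal (Map.fromList [("a", 5), ("b", 7)]))
  -- (12,fromList [("a",5),("b",12)])
  print (indexFromRight "abc")
  -- (3,[(2,'a'),(1,'b'),(0,'c')])
```

The same two definitions work unchanged on lists, maps, Maybe, or any other Traversable, which is the point of lifting the functions out of Data.List.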
-Isaac From nicolas.pouillard at gmail.com Fri Aug 1 13:40:19 2008 From: nicolas.pouillard at gmail.com (Nicolas Pouillard) Date: Fri Aug 1 13:40:50 2008 Subject: proposal #2461: add Traversable generalizations?of?mapAccumL?and mapAccumR In-Reply-To: <1217600164.48931aa440563@fred.soi.city.ac.uk> References: <20080722161248.GA5027@soi.city.ac.uk> <49a77b7a0807221849u2e40a238wb285e70d2cddaff3@mail.gmail.com> <49a77b7a0807272005t41443a9fhed2111d598d68b61@mail.gmail.com> <98E9D03A-EACC-46CB-BEC0-7D92D2868079@strictlypositive.org> <1217533519-sup-8322@ausone.local> <1217600164.48931aa440563@fred.soi.city.ac.uk> Message-ID: <1217612350-sup-7563@ausone.local> Excerpts from Ross Paterson's message of Fri Aug 01 16:16:04 +0200 2008: > Quoting Nicolas Pouillard : > > I've also understood "<**> = flip <*>" when reading the docs :( > > Clearly the docs need improving. Does anyone have any comments on the > proposal itself? > I'm for the mapAccum{L,R} proposal, and the Backward, Applicative instance. -- Nicolas Pouillard aka Ertai -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 194 bytes Desc: not available Url : http://www.haskell.org/pipermail/libraries/attachments/20080801/db1a2710/signature.bin From david.maciver at gmail.com Fri Aug 1 18:14:31 2008 From: david.maciver at gmail.com (David MacIver) Date: Fri Aug 1 18:14:27 2008 Subject: Unordered map In-Reply-To: References: <404396ef0807291105p20adf730jdd776d3b82a97867@mail.gmail.com> <488F7CE5.7090300@gmail.com> <4892E2F6.1080002@imn.htwk-leipzig.de> Message-ID: On Fri, Aug 1, 2008 at 1:19 PM, Jan-Willem Maessen wrote: > > On Aug 1, 2008, at 6:18 AM, Johannes Waldmann wrote: > >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Aaron Denney wrote: >> >>> If you're going to make users write an >>> equality function, making them write an ordering adds little effort, and >>> a reasonable amount of gain. Usually. 
>>
>> Then why is there a distinction between e.g.
>> Map and SortedMap (and Set and SortedSet) in the Java libraries?
>>
>> Yes yes I know Haskell is not Java etc. but they must have given
>> this some thought. (Of course them making everything an instance of
>> Eq and Hash is a design error but that's not the point here.)
>
> Au contraire, it's *exactly* the point! Java uses the hash code to
> implement collections that only require equality and hashing, but no
> ordering. Haskell, as a functional language, instead prefers equality and
> ordering---because trees admit efficient pure update, whereas hash tables
> generally do not. Two different languages, two different approaches to

You know, this isn't actually true. You can implement an immutable HashMap as

  newtype HashMap a b = HashMap (IntMap [(a, b)])

where you basically store association lists of things with the same hash code as the values of the IntMap. This has fairly good merge and update properties, and performs quite reasonably. I haven't tried this with the Haskell implementations, but I've benchmarked my Scala implementation of the same idea and, while not as fast as a normal array-based HashMap, it seems to be a significant performance improvement over the red-black-tree-based maps I've compared it against and supports pretty comparable amounts of sharing.

From david.maciver at gmail.com is followed in the archive by:

From duncan.coutts at worc.ox.ac.uk Sun Aug 3 15:11:02 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Sun Aug 3 15:10:45 2008
Subject: Where should cabal install to by default?
Message-ID: <1217790662.7661.66.camel@localhost>

Hi folks,

cabal-install is supposed to make things simpler for users, especially new users. It's partly successful at that currently. Many users do not know ghc-pkg even exists for example (not that it was a design goal but it seems to be a consequence). On the other hand we're currently failing users by not making programs they install work by default.
Currently cabal-install installs all binaries to ~/.cabal/bin by default. It does per-user installs by default and uses ~/.cabal as its prefix.

Obviously ~/.cabal/bin is not on the $PATH, so users who install, say, yi or whatever find that typing yi at the prompt does not do anything even though they just installed it. That's a failure on our part. It should work, and if we cannot make it work by default then we need to tell users what they need to do to make it work.

So I'd like to discuss what defaults we should use.

For reference, these are the features that are implemented right now that we have to play with:
  * We can do per-user or global installs (affects which package db we use but it also changes the default prefix)
  * We can set any --prefix we like
  * We can use versioned binaries (ie adding -$version suffixes)
  * We can add symlinks to binaries in some other directory
  * We can use a command like sudo to do the install phase

We can control all these features in the ~/.cabal/config file.

When cabal-install is first run it creates a default ~/.cabal/config file. So the question is what default it should set and how we report failure cases to the user. We do not have to use the same defaults on every platform.

If we can get away with it I think it's much nicer not to make it interactive.

Here's a couple suggestions:

For unix systems, do per-user installs to --prefix=~/.cabal but if ~/bin exists then add symlinks there.

Or perhaps if ~/bin is not a convention on that unix platform (eg OSX) then do global installs by default to /usr/local and use sudo for the install phase (we know that OS X comes with sudo whereas it may or may not on other unix systems).

Duncan

From isaacdupree at charter.net Sun Aug 3 15:44:25 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Sun Aug 3 15:44:24 2008
Subject: Where should cabal install to by default?
In-Reply-To: <1217790662.7661.66.camel@localhost>
References: <1217790662.7661.66.camel@localhost>
Message-ID: <48960A99.7000200@charter.net>

Duncan Coutts wrote:
> ...
> For unix systems, do per-user installs to --prefix=~/.cabal but if ~/bin
> exists then add symlinks there.

just a reminder for anyone: if a directory is not already in $PATH then it's annoying to change $PATH; older shell sessions won't be updated automatically with the new $PATH, at least. Makes it a little harder to give the user advice on what to do, in the cases in which we have to resort to merely giving advice.

-Isaac

From duncan.coutts at worc.ox.ac.uk Sun Aug 3 16:19:56 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Sun Aug 3 16:19:33 2008
Subject: Where should cabal install to by default?
In-Reply-To: <48960A99.7000200@charter.net>
References: <1217790662.7661.66.camel@localhost> <48960A99.7000200@charter.net>
Message-ID: <1217794796.7661.73.camel@localhost>

On Sun, 2008-08-03 at 15:44 -0400, Isaac Dupree wrote:
> Duncan Coutts wrote:
> > ...
> > For unix systems, do per-user installs to --prefix=~/.cabal but if ~/bin
> > exists then add symlinks there.
>
> just a reminder for anyone: if a directory is not already in
> $PATH then it's annoying to change $PATH; older shell
> sessions won't be updated automatically with the new $PATH,
> at least. Makes it a little harder to give the user advice
> on what to do, in the cases in which we have to resort to
> merely giving advice.

And many new users don't know what a path is or how to change one.

I'd like to avoid new users seeing:

  $ cabal install xmonad
  cabal: this didn't do what you expected, to fix it change the foo
  setting in your bar file.

It's not friendly.

Actually at the moment it's worse though:

  $ cabal install xmonad
  [.. lots of build output .. ]
  $ xmonad
  bash: xmonad: command not found
  # user gives up, assuming cabal is borked

Duncan

From johan.tibell at gmail.com Sun Aug 3 19:54:36 2008
From: johan.tibell at gmail.com (Johan Tibell)
Date: Sun Aug 3 19:54:24 2008
Subject: A fancier Get monad or two (a la binary and binary-strict)
In-Reply-To: <4891A956.5000201@list.mightyreason.com>
References: <488FC298.2080205@list.mightyreason.com> <1217402802.12754.342.camel@localhost> <90889fe70807300816i2dc072dcx49e3b3f826492eed@mail.gmail.com> <4890DE5B.5000609@list.mightyreason.com> <90889fe70807310033q536ac08cjaa2f7263b011413@mail.gmail.com> <4891A956.5000201@list.mightyreason.com>
Message-ID: <90889fe70808031654l13318b9bpdd4d2aa7c0883f21@mail.gmail.com>

On Thu, Jul 31, 2008 at 2:00 PM, Chris Kuklewicz wrote:
> [terrific explanation of the problem]

Thanks a lot for the explanation. It's perfectly clear to me now. I changed my implementation along the lines of the one you attached, except that I didn't use Maybe to wrap the error handler in the parse state but instead set the failure handler to `failed' instead of Nothing on commit.

Cheers,

Johan

From chak at cse.unsw.edu.au Sun Aug 3 21:56:19 2008
From: chak at cse.unsw.edu.au (Manuel M T Chakravarty)
Date: Sun Aug 3 21:56:10 2008
Subject: Where should cabal install to by default?
In-Reply-To: <1217790662.7661.66.camel@localhost>
References: <1217790662.7661.66.camel@localhost>
Message-ID:

Duncan Coutts:
> cabal-install is supposed to make things simpler for users, especially
> new users. It's partly successful at that currently. Many users do not
> know ghc-pkg even exists for example (not that it was a design goal
> but
> it seems to be a consequence). On the other hand we're currently
> failing
> users by not making programs they install work by default.
>
> Currently cabal-install installs all binaries to ~/.cabal/bin by
> default. It does per-user installs by default and uses ~/.cabal as its
> prefix.
> > Obviously ~/.cabal/bin is not on the $PATH, so users who install say, > yi or whatever find that typing yi at the prompt does not do anything > even though they just installed it. That's a failure on our part. It > should work, and if we cannot make it work by default then we need to > tell users what they need to do to make it work. > > > So I'd like to discuss what defaults we should use. > > For reference, these are the features that are implemented right now > that we have to play with: > * We can do per-user or global installs (affects which package db > we use but it also changes the default prefix) > * We can set any --prefix we like > * We can use versioned binaries (ie adding -$version suffixes) > * We can add symlinks to binaries in some other directory > * We can use a commend like sudo to do the install phase > > We can control all these features in the ~/.cabal/config file. > > When cabal-install is first run it creates a default ~/.cabal/config > file. So the question is what default it should set and how we report > failure cases to the user. We do not have to use the same defaults on > every platform. > > If we can get away with it I think it's much nicer not to make it > interactive. > > Here's a couple suggestions: > > For unix systems, do per-user installs to --prefix=~/.cabal but if ~/ > bin > exists then add symlinks there. > > Or perhaps if ~/bin is not a convention on that unix platform (eg OSX) > then do global installs by default to /usr/local and use sudo for the > install phase (we know that OS X comes with sudo where as it may or > may > not on other unix systems). Just some random remarks: * Hiding installed files in a . directory is very bad style IMHO. I think that should never happen. Independent of whether you install right into /usr/local/bin or whether you symlink or whatever. You might install files under /usr/local/lib/cabal and then symlink, but probably its nicer to installto /usr/local/lib/- and then symlink. 
* On OS X, its not generally appropriate to install into /usr/local either. Each user has ~/Applications and ~/Library directories that are usually used for per-user installs. Just generally using sudo and put binaries into /usr/local is also bad because not every user will have admin rights on the machine. * ~/.cabal is bad on Mac OS, too. Preferences ought to go into ~/ Library/Preferences/ * Versioning should be the default (and not optional). Manuel From allbery at ece.cmu.edu Sun Aug 3 22:50:29 2008 From: allbery at ece.cmu.edu (Brandon S. Allbery KF8NH) Date: Sun Aug 3 22:50:17 2008 Subject: Where should cabal install to by default? In-Reply-To: References: <1217790662.7661.66.camel@localhost> Message-ID: <0DF2656F-9631-4CD8-B16A-E0C8D88111AB@ece.cmu.edu> On 2008 Aug 3, at 21:56, Manuel M T Chakravarty wrote: > Duncan Coutts: >> Obviously ~/.cabal/bin is not on the $PATH, so users who install say, >> yi or whatever find that typing yi at the prompt does not do anything >> even though they just installed it. That's a failure on our part. It >> should work, and if we cannot make it work by default then we need to >> tell users what they need to do to make it work. Grins and giggles: check the user's $PATH for directories under $HOME and use the first one found? > * Hiding installed files in a . directory is very bad style IMHO. I > think that should never happen. Independent of whether you install > right into /usr/local/bin or whether you symlink Even if I tell it to? Also, you might want to avoid StarOffice (and perhaps OpenOffice still does it these days). > or whatever. You might install files under /usr/local/lib/cabal and > then symlink, but probably its nicer to installto /usr/local/lib/ > - and then symlink. You're missing a key point: *user installs should not require root*. /usr/local is used for global installs, not for per-user installs. > * On OS X, its not generally appropriate to install into /usr/local > either. 
Each user has ~/Applications and ~/Library directories that > are usually used for per-user installs. Just Those are valid only for OSX applications; Unixy stuff goes elsewhere. By convention Fink uses /sw, MacPorts uses /opt/local, and /usr/local is left for packages not owned by either; this makes it a good place to install Cabalized programs (globally). > * ~/.cabal is bad on Mac OS, too. Preferences ought to go into ~/ > Library/Preferences/ Again, not for Unixy stuff, only for full OSX applications and frameworks. Try "ls -a" sometime. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH From rl at cse.unsw.edu.au Mon Aug 4 02:44:54 2008 From: rl at cse.unsw.edu.au (Roman Leshchinskiy) Date: Mon Aug 4 02:44:55 2008 Subject: Where should cabal install to by default? In-Reply-To: <1217794796.7661.73.camel@localhost> References: <1217790662.7661.66.camel@localhost> <48960A99.7000200@charter.net> <1217794796.7661.73.camel@localhost> Message-ID: On 04/08/2008, at 06:19, Duncan Coutts wrote: > And many new users don't know what a path is or how to change one. > > I'd like to avoid new users seeing: > > $ cabal install xmonad > cabal: this didn't do what you expected, to fix it change the foo > setting in your bar file. > > It's not friendly. > > Actually at the moment it's worse though: > $ cabal install xmonad > [.. lots of build output .. ] > $ xmonad > bash: xmonad: command not found > # user gives up, assuming cabal is borked I don't think it's realistic to expect that things will magically work for people who don't know what they are doing. In fact, I can't imagine that there are a lot of people who would use cabal-install and not know about paths. 
IMO, the goal should rather be to make installation simple for those who do know what they are doing and also to prevent users from shooting themselves in the foot as much as possible. That said, I would prefer installs to be global by default and to go into the same directory/tree in which cabal-install itself lives. Also, I don't think cabal-install itself should somehow invoke sudo. IMO, local installs shouldn't have a default location; rather, the user would be required to specify one in the prefs file or when invoking cabal-install. Roman From mail at justinbogner.com Mon Aug 4 03:11:03 2008 From: mail at justinbogner.com (Justin Bogner) Date: Mon Aug 4 03:14:51 2008 Subject: Where should cabal install to by default? In-Reply-To: References: <1217790662.7661.66.camel@localhost> <48960A99.7000200@charter.net> <1217794796.7661.73.camel@localhost> Message-ID: Roman Leshchinskiy wrote: > On 04/08/2008, at 06:19, Duncan Coutts wrote: > >> And many new users don't know what a path is or how to change one. >> >> I'd like to avoid new users seeing: >> >> $ cabal install xmonad >> cabal: this didn't do what you expected, to fix it change the foo >> setting in your bar file. >> >> It's not friendly. >> >> Actually at the moment it's worse though: >> $ cabal install xmonad >> [.. lots of build output .. ] >> $ xmonad >> bash: xmonad: command not found >> # user gives up, assuming cabal is borked > > I don't think it's realistic to expect that things will magically work > for people who don't know what they are doing. In fact, I can't imagine > that there are a lot of people who would use cabal-install and not know > about paths. IMO, the goal should rather be to make installation simple > for those who do know what they are doing and also to prevent users from > shooting themselves in the foot as much as possible. > > That said, I would prefer installs to be global by default and to go > into the same directory/tree in which cabal-install itself lives. 
> Also,
> I don't think cabal-install itself should somehow invoke sudo. IMO,
> local installs shouldn't have a default location; rather, the user would
> be required to specify one in the prefs file or when invoking
> cabal-install.
>
> Roman

I agree with Roman. Programs should not invoke sudo for users (if I need root access, I want to ask for it), and the default should be global rather than guessing a local path. With these two together you get a much-deserved permission error if you simply run the program without configuring it or explicitly using root, which is the same as most other install tools and very sane, in my opinion.

_____
Justin Bogner

From duncan.coutts at worc.ox.ac.uk Mon Aug 4 06:26:48 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Mon Aug 4 06:26:21 2008
Subject: Where should cabal install to by default?
In-Reply-To:
References: <1217790662.7661.66.camel@localhost>
Message-ID: <1217845608.7661.100.camel@localhost>

On Mon, 2008-08-04 at 11:56 +1000, Manuel M T Chakravarty wrote:
> Just some random remarks:

Thanks for the useful feedback on OSX conventions. I was unaware of several of them.

> * Hiding installed files in a . directory is very bad style IMHO. I
> think that should never happen.

Just on OSX or on Unix too? On Linux I'm not sure where else we could put things that are per-user. People seem to object strenuously to programs automatically putting files in any dir other than ~/.progname (or one or two other .files established by specification or convention).

> Independent of whether you install right into /usr/local/bin or
> whether you symlink or whatever. You might install files
> under /usr/local/lib/cabal and then symlink, but probably its nicer to
> installto /usr/local/lib/- and then symlink.

For global installs we use prefix=/usr/local at the moment, so that means /usr/local/lib/- for libs and /usr/local/bin for binaries.
I've just implemented the symlinking feature so we could change the default on global installs to use the version suffix on binaries and symlink back into /usr/local/bin. Or do you mean we should install binaries into /usr/local/lib(exec?)/- and then symlink into /usr/local/bin? That would also be reasonable I suppose, though then the versioned binaries are not on the path.

> * On OS X, its not generally appropriate to install into /usr/local
> either. Each user has ~/Applications and ~/Library directories that
> are usually used for per-user installs. Just generally using sudo and
> put binaries into /usr/local is also bad because not every user will
> have admin rights on the machine.

Oh, great. I didn't know OS X had a standard location for per-user installs. That's excellent. So is there a preferred layout for those dirs? I'm guessing there probably is.

> * ~/.cabal is bad on Mac OS, too. Preferences ought to go into ~/
> Library/Preferences/

Ok. BTW, in that case we should probably fix System.Directory.getAppUserDataDirectory to follow the system convention on OSX. Currently it uses $HOME/.appname on all unix systems (and the Windows convention on Windows). If necessary we may want to split getAppUserDataDirectory into a variant for config and another for data.

Since cabal-install is a program should it still be using ~/Library/Preferences/ or is there a corresponding ~/Applications/Preferences/ ? Where would be appropriate for cabal-install to put its download cache and build logs?

> * Versioning should be the default (and not optional).

Well, it's always configurable (distros that allow only a single version of a program probably would not want it for example), but yes, I think it's a pretty sensible default configuration. We version libs of course, but up 'til now all binaries have been unversioned.
We recently added support for arbitrary program prefixes and suffixes (which can include program $version var) and adding unversioned symlinks into some other dir. It's less clear what we'd do on windows if we want versioned binaries since there are no links. Duncan From apfelmus at quantentunnel.de Mon Aug 4 07:36:34 2008 From: apfelmus at quantentunnel.de (apfelmus) Date: Mon Aug 4 07:36:35 2008 Subject: Where should cabal install to by default? In-Reply-To: <1217845608.7661.100.camel@localhost> References: <1217790662.7661.66.camel@localhost> <1217845608.7661.100.camel@localhost> Message-ID: Duncan Coutts wrote: > Manuel M T Chakravarty wrote: >> >> * On OS X, its not generally appropriate to install into /usr/local >> either. Each user has ~/Applications and ~/Library directories that >> are usually used for per-user installs. Just generally using sudo and >> put binaries into /usr/local is also bad because not every user will >> have admin rights on the machine. > > Oh, great. I didn't know OS X had a standard location for per-user > installs. That's excellent. So is there a preferred layout for those > dirs? I'm guessing there probably is. ~/Applications seems to be new in OS X 10.5, I sure don't have it in 10.4. But it seems sensible nonetheless. (And it's documented here http://developer.apple.com/documentation/MacOSX/Conceptual/BPFileSystem/Articles/Domains.html#//apple_ref/doc/uid/20002281-101038-BCIFCABI ) Note however that the ~/Applications directory is not directly comparable to a UNIX /bin . That's because applications in Mac OS X have binaries, data and all other batteries included in one folder, the "application bundle". For instance, the Thunderbird application is really a folder "Thunderbird.app" and looks something like this ~/Applications/Thunderbird.app/Contents/Info.plist ~/Applications/Thunderbird.app/Contents/MacOS/thunderbird-bin -- binary ~/Applications/Thunderbird.app/Contents/MacOS/ ... 
  ~/Applications/Thunderbird.app/Contents/PkgInfo
  ~/Applications/Thunderbird.app/Contents/Resources/thunderbird.icns -- icons
  ~/Applications/Thunderbird.app/Contents/en.lproj/ -- language specific
  ... etc

Bare-bones UNIX binaries like cabal-install itself and those produced by cabal don't really fit that scheme. That's why Mac OS X has a standard UNIX-style tree in /usr/local . Not sure what to do with per-user cabal, I'd make a fresh UNIX-tree at ~/Applications/Cabal/ or even ~/Applications/Haskell/ so we have

  ~/Applications/Cabal/bin
  ~/Applications/Cabal/lib
  ~/Applications/Cabal/share

etc.

>> * ~/.cabal is bad on Mac OS, too. Preferences ought to go into ~/
>> Library/Preferences/
>
> Ok. BTW, in that case we should probably fix
> System.Directory.getAppUserDataDirectory to follow the system convention
> on OSX. Currently it uses $HOME/.appname on all unix systems (and the
> Windows convention on Windows). If necessary we may want to
> split getAppUserDataDirectory into a variant for config and another for
> data.

System.Directory.getAppUserDataDirectory should return ./Progname.app/Contents/Resources for programs that are packaged as an OS X application bundle. Of course, that doesn't apply to most UNIX software... However, configuration files are to go in ~/Library/Preferences/ which is sufficiently similar to /etc .

> Since cabal-install is a program should it still be
> using ~/Library/Preferences/ or is there a
> corresponding ~/Applications/Preferences/ ? Where would be
> appropriate for cabal-install to put its download cache and build logs?

The ~/Library/ folder is more like /usr/etc and /usr/share than /usr/lib , so it works for both executables and shared libraries.

  ~/Library/Preferences/ for configuration files.
  ~/Library/Application Support/ for application data that you don't need to run the application, like example files or clip art.
  ~/Library/Caches/ for cache files.
  ~/Library/Logs/ for logs.
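
The mapping in the list above could be sketched as a small lookup function. This is only an illustration: the Purpose type and libraryDir function are hypothetical names, not part of System.Directory or cabal-install, and the home directory is passed in explicitly rather than discovered.

```haskell
-- Purposes named in the list above. The type and constructor names are
-- hypothetical and exist only for this sketch.
data Purpose = Preferences | ApplicationSupport | Caches | Logs
  deriving (Show, Eq, Enum, Bounded)

-- Build the per-user directory for a purpose, given the user's home directory.
libraryDir :: FilePath -> Purpose -> FilePath
libraryDir home purpose = home ++ "/Library/" ++ sub
  where
    sub = case purpose of
      Preferences        -> "Preferences"
      ApplicationSupport -> "Application Support"
      Caches             -> "Caches"
      Logs               -> "Logs"

main :: IO ()
main =
  -- "/Users/example" is a made-up home directory for demonstration
  mapM_ (putStrLn . libraryDir "/Users/example") [minBound .. maxBound]
```

An application-specific subdirectory (for instance one named after the program) would typically be appended to the result; that part is left out here because the thread has not settled on a name.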
Some documentation from Apple: http://developer.apple.com/documentation/MacOSX/Conceptual/BPFileSystem/Articles/WhereToPutFiles.html http://developer.apple.com/documentation/MacOSX/Conceptual/BPFileSystem/Articles/LibraryDirectory.html Regards, apfelmus From jpm at cs.uu.nl Mon Aug 4 07:49:26 2008 From: jpm at cs.uu.nl (=?ISO-8859-1?Q?Jos=E9_Pedro_Magalh=E3es?=) Date: Mon Aug 4 07:49:12 2008 Subject: Developing SYB, packaging issues Message-ID: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> Hello all, This message focuses on a problem Claus mentioned before: > One other thing I meant to ask was about procedure, > given that Syb is currently in base and hence under the > library modification process. How is this going to combine > with an active maintainer and some parts on hackage? > > Claus To be able to further develop SYB (see [1] for history), it's probably best to develop it as a separate package, which people can install, upgrade, etc. This would mean the library could be updated independently of GHC, and new GHC releases could then use the most recent version of the package. But how do these things merge? Can/should SYB be moved out of the base package? And, if this happens, can the library being developed as a separate package still use the automatic deriving mechanism? I'm sending this message to libraries@haskell.org too because I guess this problem might have shown up here before. Cheers, Pedro [1] http://search.gmane.org/?query=%22Owning+SYB%22&group=gmane.comp.lang.haskell.libraries&sort=revdate -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.haskell.org/pipermail/libraries/attachments/20080804/f8282e9b/attachment.htm From claus.reinke at talk21.com Mon Aug 4 11:27:15 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Mon Aug 4 11:27:03 2008 Subject: for old-time sakes:-) shouldn't deprecated libraries have DEPRECATED pragma? 
Message-ID: <01a101c8f646$944a2590$05347ad5@cr3lt> The haddocks for System.Time (suggestively located in old-time) state: This library is deprecated, please look at Data.Time in the time package instead. Yet, there is no DEPRECATED pragma for the module, so I actually missed all those other hints, and the improved Data.Time functionality, for a while at least. Claus From isaacdupree at charter.net Mon Aug 4 18:40:25 2008 From: isaacdupree at charter.net (Isaac Dupree) Date: Mon Aug 4 18:40:08 2008 Subject: Where should cabal install to by default? In-Reply-To: <1217845608.7661.100.camel@localhost> References: <1217790662.7661.66.camel@localhost> <1217845608.7661.100.camel@localhost> Message-ID: <48978559.5080803@charter.net> Duncan Coutts wrote: > On Linux I'm not sure where else we could > put things that's per-user. People seem to object strenuously to > programs automatically putting files in any dir other than ~/.progname > (or one or two other .files established by specification or convention). just a thought: If you're into this standard http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html , we could use $XDG_CACHE_HOME/cabal or $XDG_DATA_HOME/cabal by default (depending whether you think what Cabal installs is just a cache, or not) ( the default values to use for those if not defined are $HOME/.cache and $HOME/.local/share ). Various things, not all X-related, use that convention/standard; the search-paths ($XDG_DATA_DIRS, $XDG_CONFIG_DIRS) can come in handy to separate my own configuration from whatever's been thrown in dot-dirs too. 
-Isaac

From jwlato at gmail.com Mon Aug 4 19:29:16 2008
From: jwlato at gmail.com (John Lato)
Date: Mon Aug 4 19:29:01 2008
Subject: Libraries Digest, Vol 60, Issue 4
In-Reply-To: <20080804102624.B27863242AB@www.haskell.org>
References: <20080804102624.B27863242AB@www.haskell.org>
Message-ID: <9979e72e0808041629xa9ad831kf9988fbabf3d0771@mail.gmail.com>

On Mon, Aug 4, 2008 at 5:26 AM, Duncan Coutts wrote:
>
> Oh, great. I didn't know OS X had a standard location for per-user
> installs. That's excellent. So is there a preferred layout for those
> dirs? I'm guessing there probably is.
>
>> * ~/.cabal is bad on Mac OS, too. Preferences ought to go into ~/
>> Library/Preferences/

~/Library/Preferences should only be used if the .plist format is supported. In general, I agree with Brandon that unixy programs don't belong here, and I'm happy with .cabal. However, if you want to use the OSX dirs, I would think that most of the items in .cabal could be placed in ~/Library/Application Support/Cabal/. Unlike most other standard directories, this appears to be completely application-dependent.

>
> Ok. BTW, in that case we should probably fix
> System.Directory.getAppUserDataDirectory to follow the system convention
> on OSX. Currently it uses $HOME/.appname on all unix systems (and the
> Windows convention on Windows). If necessary we may want to
> split getAppUserDataDirectory into a variant for config and another for
> data.
>
> Since cabal-install is a program should it still be
> using ~/Library/Preferences/ or is there a
> corresponding ~/Applications/Preferences/ ? Where would be
> appropriate for cabal-install to put its download cache and build logs?
>

All preferences are in ~/Library/Preferences/, in Apple's .plist format. There is no ~/Applications/Preferences/. Build logs should go in ~/Library/Logs/, either in a subdirectory or directly in this folder. There is a ~/Library/Caches/, but I believe that's reserved for the NSURL cache (accessed through Cocoa).
>
> It's less clear what we'd do on windows if we want versioned binaries
> since there are no links.
>

I have personally ended up just copying the latest version of "appname-n.n.n" to "appname". Granted I've only done that on my own computer, never as part of an install process.

John Lato

From sedillard at ucdavis.edu Mon Aug 4 20:08:54 2008
From: sedillard at ucdavis.edu (Scott Dillard)
Date: Mon Aug 4 20:08:39 2008
Subject: Proposal: add laziness to Data.Map / IntMap
Message-ID:

Hi,

I found myself wanting a lazy version of Data.Map today, and by "lazy" I mean in the node-subtree pointers. I trust the library writers when they put strictness annotations in the fields of the tree nodes, so I'm wondering what the various implications of lazy subtrees are, beyond the obvious speed hit. Does this cause space leaks beyond what lazy value pointers can already cause? Can someone point me to some reading that discusses this?

Anyway, I'm posting to libraries (rather than haskell-cafe) to gauge opinion about a possible rewrite of Data.Map and IntMap to remove strictness annotations (bangs) from the node constructors and move them into the functions (as seqs). "Rewrite" is maybe too strong of a word. "Significant patch" is more like it. It would involve only those functions that construct Bin values directly, which is not that many. Less than a day's work, I think (yes that means I volunteer.) Semantics of everything remains unchanged, but it opens up the possibility for lazy versions of some functions. The most useful result of this would be a lazy map (little m).
Here's Data.Map.mapWithKey

  mapWithKey f Tip = Tip
  mapWithKey f (Bin sx kx x l r) = Bin sx kx (f kx x) (mapWithKey f l) (mapWithKey f r)

De-banged and then restrictified, it would look like this

  mapWithKey f Tip = Tip
  mapWithKey f (Bin sx kx x l r) = seq l' $ seq r' $ Bin sx kx (f kx x) l' r'
    where l' = mapWithKey f l
          r' = mapWithKey f r

Looking at the first version, clearly you see that when constructing a new map you should only have to pay for the subtrees that you actually use. If you perform only a handful of lookups then throw the new tree away, why build the whole thing?

To further motivate, let me explain my use case. I have cyclical data structures (graphs, more or less) that I mutate frequently, so I store them in a Map, indexed by some Ord thing, let's say Int, so I'd have something like Map Int [Int] (but not that exactly, and nothing like Data.Graph). This is great for mutations because I can use backtracking, but for lookups it's a burden on both me and the cpu. So I memoize the thing into something like "data Node a = Node a [Node a]". I can do this memoization using Data.Map.mapWithKey, with the Nodes built therein referring back to the produced Map. But then, what if I only crawl a small portion of this cyclical network of Nodes? Why should I have to pay for the whole thing to be rebuilt? It defeats the purpose of the memoization, which is to amortize the cost of following edges in the mutable graph.

The pro and con as I see it are:

Pro
  - More flexible data structure
Con
  - Code is more verbose (see Data.Tree.AVL)
  - Only a few (but important) functions can be made lazy

To that last point, note that while mapWithKey can be made lazy for both Map and IntMap, only IntMap allows lazy filter and mapMaybe because it doesn't rebalance. But I'm wondering how much of the tree needs to be forced when rebalancing. Should be only O(log n), right?

It also becomes important where the tree is sourced from. The source needs to produce the tree lazily.
The regular definition of fromList (= foldr (uncurry insert) empty) admits no laziness, but maybe successive unions could if the sub-maps were nearly disjoint (a not-uncommon case I think.) Does anyone know if any benchmarking has been done to this end? Finally, I'll stress once more that the semantics of the functions currently exported would be unchanged. This would only allow new lazy versions, named something like mapWithKeyL or unionL. So what do you think? Too much for too little? Scott -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.haskell.org/pipermail/libraries/attachments/20080804/aec2a723/attachment-0001.htm From dons at galois.com Mon Aug 4 20:18:56 2008 From: dons at galois.com (Don Stewart) Date: Mon Aug 4 20:18:40 2008 Subject: Proposal: add laziness to Data.Map / IntMap In-Reply-To: References: Message-ID: <20080805001856.GC14051@scytale.galois.com> sedillard: > Hi, > > I found myself wanting a lazy version of Data.Map today, and by "lazy" I > mean in the node-subtree pointers. I trust the library writers when they > put strictness annotations in the fields of the tree nodes, so I'm > wondering what the various implications of lazy subtrees are, beyond the > obvious speed hit. Does this cause space leaks beyond what lazy value > pointers can already cause? Can someone point me to some reading that > discusses this? > > Anyway, I'm positing to libraries (rather than haskell-cafe) to gauge > opinion about a possible rewrite of Data.Map and IntMap to remove > strictness annotations (bangs) from the node constructors and move them > into the functions (as seqs). "Rewrite" is maybe too strong of a word. > "Significant patch" is more like it. It would involve only those functions > that construct Bin values directly, which is not that many. Less than a > days work, I think (yes that means I volunteer.) 
Semantics of everything > remains unchanged, but it opens up the possibility for lazy versions of > some functions. How about doing it as a separate library, then we can choose either strict or lazy as the case may be? -- Don From ajb at spamcop.net Mon Aug 4 20:50:14 2008 From: ajb at spamcop.net (ajb@spamcop.net) Date: Mon Aug 4 20:49:58 2008 Subject: Proposal: add laziness to Data.Map / IntMap In-Reply-To: References: Message-ID: <20080804205014.xb2bh2maokcw4sww-nwo@webmail.spamcop.net> G'day all. Quoting Scott Dillard : > I found myself wanting a lazy version of Data.Map today, and by "lazy" I > mean in the node-subtree pointers. Right. Just to be clear, to start off with: - It makes no sense for "keys" to be lazy, because they need to be inspected to determine the shape of the data structure. (Except in the case of a singleton map, if you know where in the tree some (key,value) pair goes, then you've already evaluated the key.) - It's better for "values" to be lazy in a general-purpose map-type data structure, because making them strict breaks Functor. So the remaining interesting question is the internal structure pointers. > I trust the library writers when they put > strictness annotations in the fields of the tree nodes, so I'm wondering > what the various implications of lazy subtrees are, beyond the obvious speed > hit. Does this cause space leaks beyond what lazy value pointers can already > cause? Can someone point me to some reading that discusses this? Yes, please read Chris Okasaki's "Purely Functional Data Structures" for a fuller discussion of the tradeoffs of putting laziness in different places in a data structure. Making internal pointers strict vs making them lazy doesn't necessarily buy you much in the way of raw-cycle-counting performance. What it buys you is a complexity _guarantee_, which in Haskell is often more valuable. 
Thinking of a binary search tree for a moment, making the internal
pointers lazy means that insertion is always O(1), but lookup may take
an arbitrary amount of time (though it will be O(log n) amortised). It
also adds a raw-cycle-counting cost to every lookup, even if the tree is
fully evaluated.

This is the opposite of the usual desired performance. Dictionary
implementations tend to assume that lookups are more common than
insertions and deletions, and correspondingly, clients tend to assume
that insertions and deletions are more expensive than lookups.

If these assumptions don't match your code, then indeed, you may be using
the wrong data structure.

> Looking at the first version, clearly you see that when constructing a new
> map you should only have to pay for the sub trees that you actually use. If
> you perform only a handful of lookups then throw the new tree away, why
> build the whole thing?

If you only perform a handful of lookups, I question whether you actually
wanted a binary search tree to begin with. Wouldn't an association list
have done the job just as well? Or compiling to functions?

> To further motivate, let me explain my use case. [...]
> So I memoize the thing into something like
> "data Node a = Node a [Node a]"

Right. Memoising CAFs is an excellent example of one of those very few
places where writing your own dictionary data type can make a lot of
sense. Why? Because there are a lot of things that you don't need from,
say, AVL trees.

For example, you know all the keys in advance, which means that your
memo table won't need to be modified once it's built. You don't even need
insertions, let alone deletions or a Functor instance.

Have you tried just using functions? Something like this:

-- WARNING: Untested code follows. Use at own risk.

type MyMap k v = k -> v
-- k -> Maybe v may also be appropriate.
myMapFromSortedAssocList :: (Ord k) => [(k,v)] -> MyMap k v
myMapFromSortedAssocList kvs
    = buildMap (length kvs) kvs
    where
        errorCase = error "Can't find key"

        -- Feel free to optimise additional base cases if desired.
        buildMap _ [] = \key -> errorCase
        buildMap _ [(k,v)] = \key -> if k == key then v else errorCase
        buildMap l kvs
            = let l2 = l `div` 2
                  (kvs1,(k,v):kvs2) = splitAt l2 kvs
                  mapl = buildMap l2 kvs1
                  mapr = buildMap (l - l2 - 1) kvs2
              in \key -> case compare key k of
                    LT -> mapl key
                    GT -> mapr key
                    EQ -> v

(Exercise for intermediate-level Haskellers: Why is "key" bound by an
explicit lambda?)

Cheers,
Andrew Bromage

From sedillard at ucdavis.edu Mon Aug 4 21:17:57 2008
From: sedillard at ucdavis.edu (Scott Dillard)
Date: Mon Aug 4 21:17:41 2008
Subject: Proposal: add laziness to Data.Map / IntMap
In-Reply-To: <20080805001856.GC14051@scytale.galois.com>
References: <20080805001856.GC14051@scytale.galois.com>
Message-ID:

I think maybe you guys (Don and Andrew) are misunderstanding my proposal.
The lazy/strict tradeoff is a subtle issue, and I'll be sure to re-read
Okasaki's stuff with this in mind, but what I'm talking about here is not a
trade off. It's laziness "for free". Move the strictness annotations out of
the constructors and into the library functions using 'seq'. Laziness is
exposed through _separate_ functions. I'll copy again my proposed versions
of mapWithKey (because I made a typo the first time):

mapWithKey :: (k -> a -> b) -> Map k a -> Map k b
mapWithKey _ Tip = Tip
mapWithKey f (Bin sx kx x l r) =
  seq l' $ seq r' $ Bin sx kx (f kx x) l' r'
  where l' = mapWithKey f l
        r' = mapWithKey f r

mapWithKeyLazy _ Tip = Tip
mapWithKeyLazy f (Bin sx kx x l r) =
  Bin sx kx (f kx x) (mapWithKeyLazy f l) (mapWithKeyLazy f r)

So mapWithKey retains all semantics, including guarantees. So would insert,
and all other functions. You export a second API (maybe in a nested module)
that exposes laziness.
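
[Editorial sketch, not part of the original message.] To make the difference
between the two variants concrete, here is a small self-contained toy. It is
only an illustration: the Tree type below is a hypothetical stand-in with
un-banged subtree fields and no size field, not the real Data.Map
representation, and mapStrict, mapLazy, lookupT and example are names made up
for this sketch.

```haskell
-- Toy search tree whose subtree fields carry no strictness annotations.
data Tree k a = Tip | Bin k a (Tree k a) (Tree k a)

-- Strictness supplied by the function: both mapped subtrees are forced
-- before the new node is returned, as in the seq-based mapWithKey.
mapStrict :: (k -> a -> b) -> Tree k a -> Tree k b
mapStrict _ Tip = Tip
mapStrict f (Bin k x l r) = l' `seq` r' `seq` Bin k (f k x) l' r'
  where
    l' = mapStrict f l
    r' = mapStrict f r

-- Lazy variant: subtrees are mapped only when something walks into them.
mapLazy :: (k -> a -> b) -> Tree k a -> Tree k b
mapLazy _ Tip = Tip
mapLazy f (Bin k x l r) = Bin k (f k x) (mapLazy f l) (mapLazy f r)

lookupT :: Ord k => k -> Tree k a -> Maybe a
lookupT _ Tip = Nothing
lookupT key (Bin k x l r) =
  case compare key k of
    LT -> lookupT key l
    GT -> lookupT key r
    EQ -> Just x

-- The right subtree is deliberately bottom, so that touching it is visible.
example :: Tree Int Int
example = Bin 5 50 (Bin 3 30 Tip Tip) (error "right subtree was forced")

main :: IO ()
main = print (lookupT 3 (mapLazy (\_ v -> v + 1) example))  -- prints Just 31
```

Replacing mapLazy with mapStrict in main raises the error instead, because
the strict version maps the whole tree before the lookup starts.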
Writing another library is kind of silly since a) you want strictness 90% of
the time and b) it shares 90% of the code. If maintainers are willing to deal
with some extra seqs here and there then we can have both libraries in one.

Scott

On Mon, Aug 4, 2008 at 6:18 PM, Don Stewart wrote:

> sedillard:
> > Hi,
> >
> > I found myself wanting a lazy version of Data.Map today, and by "lazy" I
> > mean in the node-subtree pointers. I trust the library writers when they
> > put strictness annotations in the fields of the tree nodes, so I'm
> > wondering what the various implications of lazy subtrees are, beyond the
> > obvious speed hit. Does this cause space leaks beyond what lazy value
> > pointers can already cause? Can someone point me to some reading that
> > discusses this?
> >
> > Anyway, I'm positing to libraries (rather than haskell-cafe) to gauge
> > opinion about a possible rewrite of Data.Map and IntMap to remove
> > strictness annotations (bangs) from the node constructors and move them
> > into the functions (as seqs). "Rewrite" is maybe too strong of a word.
> > "Significant patch" is more like it. It would involve only those functions
> > that construct Bin values directly, which is not that many. Less than a
> > days work, I think (yes that means I volunteer.) Semantics of everything
> > remains unchanged, but it opens up the possibility for lazy versions of
> > some functions.
>
> How about doing it as a separate library, then we can choose either
> strict or lazy as the case may be?
>
> -- Don
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20080804/f5e6e5d3/attachment.htm From bertram.felgenhauer at googlemail.com Mon Aug 4 22:53:20 2008 From: bertram.felgenhauer at googlemail.com (Bertram Felgenhauer) Date: Mon Aug 4 22:53:10 2008 Subject: Proposal: add laziness to Data.Map / IntMap In-Reply-To: References: <20080805001856.GC14051@scytale.galois.com> Message-ID: <20080805025320.GB4185@zombie.inf.tu-dresden.de> Scott Dillard wrote: > I think maybe you guys (Don and Andrew) are misunderstanding my proposal. > The lazy/strict tradeoff is a subtle issue, and I'll be sure to re-read > Okasaki's stuff with this in mind, but what I'm talking about here is not a > trade off. It's laziness "for free". Move the strictness annotations out of > the constructors and into the library functions using 'seq'. Laziness is > exposed through _separate_ functions. It's not for free. When the compiler does a pattern match on the Bin constructor, Bin sz kx x l r, it can no longer assume that l and r are fully evaluated, so it has to add code to evaluate them in case they are not. And in fact, this code will be needed if any of your proposed lazy functions are added later. I have not checked whether this has a measurable performance or code size impact. > So mapWithKey retains all semantics, including guarantees. Semantically the change is safe, agreed. regards, Bertram From jyasskin at gmail.com Tue Aug 5 01:37:35 2008 From: jyasskin at gmail.com (Jeffrey Yasskin) Date: Tue Aug 5 01:37:19 2008 Subject: Where should cabal install to by default? In-Reply-To: <48978559.5080803@charter.net> References: <1217790662.7661.66.camel@localhost> <1217845608.7661.100.camel@localhost> <48978559.5080803@charter.net> Message-ID: <5d44f72f0808042237r10fe1a90uab8306f8ca3c21cd@mail.gmail.com> On Mon, Aug 4, 2008 at 3:40 PM, Isaac Dupree wrote: > Duncan Coutts wrote: >> >> On Linux I'm not sure where else we could >> put things that's per-user. 
People seem to object strenuously to >> programs automatically putting files in any dir other than ~/.progname >> (or one or two other .files established by specification or convention). > > just a thought: > > If you're into this standard > http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html > , we could use $XDG_CACHE_HOME/cabal or $XDG_DATA_HOME/cabal by default > (depending whether you think what Cabal installs is just a cache, or not) ( > the default values to use for those if not defined are $HOME/.cache and > $HOME/.local/share ). > Various things, not all X-related, use that convention/standard; the > search-paths ($XDG_DATA_DIRS, $XDG_CONFIG_DIRS) can come in handy to > separate my own configuration from whatever's been thrown in dot-dirs too. > > -Isaac FWIW, Python recently went through this exercise and produced http://python.org/dev/peps/pep-0370/, which does use the $HOME/.local directory. I don't know how much that should affect Cabal's choice, but it's a bit of prior art if you're interested. Jeffrey From lemming at henning-thielemann.de Tue Aug 5 07:04:35 2008 From: lemming at henning-thielemann.de (Henning Thielemann) Date: Tue Aug 5 07:04:43 2008 Subject: runCommand/waitForProcess don't respect text printing order whenstdout is redirected In-Reply-To: <48933119.3020903@imn.htwk-leipzig.de> References: <6a7c66fc0808010713q1df4d9a3j95f492f563634376@mail.gmail.com> <125EACD0CAE4D24ABDB4D148C4593DA9049E94F8@GBLONXMB02.corp.amvescap.net> <6a7c66fc0808010732t3d35a0afxcb56a3590799205b@mail.gmail.com> <48933119.3020903@imn.htwk-leipzig.de> Message-ID: On Fri, 1 Aug 2008, Johannes Waldmann wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > >> hSetBuffering stdout LineBuffering > > I run into this from time to time as well. > I think the ghc (runtime) runtime behaviour is confusing here, > as it tries to be too clever. I don't think it should > even try to find out whether it's writing to terminal or file. 
As far as I know it's the same behaviour as that of C standard libraries. From lemming at henning-thielemann.de Tue Aug 5 08:19:59 2008 From: lemming at henning-thielemann.de (Henning Thielemann) Date: Tue Aug 5 08:19:42 2008 Subject: Functor instance of ReaderT Message-ID: The Functor instance of ReaderT has the header: Monad m => Functor (ReaderT r m) I thought (Functor m) must be the appropriate constraint. From leather at cs.uu.nl Tue Aug 5 10:46:50 2008 From: leather at cs.uu.nl (Sean Leather) Date: Tue Aug 5 10:46:31 2008 Subject: [Hs-Generics] Developing SYB, packaging issues In-Reply-To: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> References: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> Message-ID: <3c6288ab0808050746p4b5bd85aqe2639318ef3a5585@mail.gmail.com> Hi Pedro, To be able to further develop SYB (see [1] for history), it's probably best > to develop it as a separate package, which people can install, upgrade, etc. > This would mean the library could be updated independently of GHC, and new > GHC releases could then use the most recent version of the package. > Agreed. We talked about this offline, but I wanted to chime in with a +1 here. Just as it would help the GHC developers for a third party to develop and maintain SYB, it would help the developers and users for SYB to be distributed and available as a separate package. > But how do these things merge? Can/should SYB be moved out of the base > package? And, if this happens, can the library being developed as a separate > package still use the automatic deriving mechanism? > I think SYB should be extracted from 'base' into a package. It seems like the only technical thing that might prevent this extraction is the automatic deriving of Typeable and Data. Or does it prevent it? Can anyone clarify this? As a side note, might this work have an impact on the Haskell Platform? Thanks, Sean -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.haskell.org/pipermail/libraries/attachments/20080805/996a868c/attachment.htm From benja.fallenstein at gmail.com Tue Aug 5 10:51:28 2008 From: benja.fallenstein at gmail.com (Benja Fallenstein) Date: Tue Aug 5 10:51:09 2008 Subject: Functor instance of ReaderT In-Reply-To: References: Message-ID: On Tue, Aug 5, 2008 at 2:19 PM, Henning Thielemann wrote: > > The Functor instance of ReaderT has the header: > Monad m => Functor (ReaderT r m) > > I thought (Functor m) must be the appropriate constraint. Note that there *is* an advantage to having Monad as the context: not every monad implements Functor, even though it could. So having Functor as the context isn't *trivially* superior. Taking this into consideration, and taking into consideration that ReaderT is usually thought of as related to monads (e.g., it's in Control.Monad.Reader), I'm not sure that the preferred way of transforming a functor wouldn't be (r ->) :.: f where :.: is functor composition. But then, functor composition apparently isn't actually in the library, which surprised me (it's in the applicative functor paper, but not in Control.Applicative), so maybe changing ReaderT to be usable with functors is more warranted. Or perhaps adding functor composition is the way to go? All the best, - Benja From sedillard at ucdavis.edu Tue Aug 5 10:56:54 2008 From: sedillard at ucdavis.edu (Scott Dillard) Date: Tue Aug 5 10:56:37 2008 Subject: Proposal: add laziness to Data.Map / IntMap Message-ID: Hi Bertram Scott Dillard wrote: > > I think maybe you guys (Don and Andrew) are misunderstanding my proposal. > > The lazy/strict tradeoff is a subtle issue, and I'll be sure to re-read > > Okasaki's stuff with this in mind, but what I'm talking about here is not > a > > trade off. It's laziness "for free". Move the strictness annotations out > of > > the constructors and into the library functions using 'seq'. Laziness is > > exposed through _separate_ functions. > > It's not for free. 
> When the compiler does a pattern match on the Bin
> constructor, Bin sz kx x l r, it can no longer assume that l and r are
> fully evaluated, so it has to add code to evaluate them in case they are
> not. And in fact, this code will be needed if any of your proposed lazy
> functions are added later. I have not checked whether this has a
> measurable performance or code size impact.
>
> > So mapWithKey retains all semantics, including guarantees.
>
> Semantically the change is safe, agreed.

You're right of course (hadn't thought about that) but I'd like to point out
that Adrian Hey's AVL tree library is structured in the way I propose: A
node's subtree fields are not marked strict, instead seq is used to force
construction of the tree. I think it's generally faster than Data.Map, at
least it's claimed to be so in the documentation. The trees are not built
with the same algorithm, but if the use of seq instead of bangs does impose
a significant overhead, then that speaks very highly of the AVL algorithm.

This of course raises the question of why I don't just use that library. I'm
checking it out, but it's quite a bit bigger than Data.Map. The thing I like
about Data.Map is that it's relatively simple and I can see what's going on
very quickly, enough to make modifications (which I'm about to do). Reading
Data.Tree.AVL is a more daunting task.

Also, there's something to be said for "lazy by default." I don't see much
difference between Data.List.map and Data.Map.map (for finite lists anyway)
and so if laziness is the correct default for the former then it should also
be for the latter. The current Data.Map.map is only half strict: it builds
an entire tree and fills it with thunks. Better I think to either build the
whole thing completely or build only what's needed. The current Data.Map.map
is the worst of both, no laziness + heap burn.

So I guess I'll start writing. Anyone have any good benchmarks?
Which naming scheme is less ugly, Data.Map.mapWithKeyL or Data.Map.Lazy.mapWithKey? Any other suggestions would be appreciated. Scott -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.haskell.org/pipermail/libraries/attachments/20080805/5e662d3f/attachment.htm From ganesh.sittampalam at credit-suisse.com Tue Aug 5 11:12:39 2008 From: ganesh.sittampalam at credit-suisse.com (Sittampalam, Ganesh) Date: Tue Aug 5 11:12:57 2008 Subject: Proposal: add laziness to Data.Map / IntMap In-Reply-To: References: Message-ID: <78A3C5650E28124399107F21A1FA419401D3B78C@ELON17P32002A.csfb.cs-group.com> > Which naming scheme is less ugly, > Data.Map.mapWithKeyL or Data.Map.Lazy.mapWithKey? A separate module is much better, as it will allow switching entire modules over just by changing an import, and still allow something close to the mapWithKeyL usage via a qualified import (e.g. "import qualified Data.Map.Lazy as L") Ganesh ============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ============================================================================== From ahey at iee.org Tue Aug 5 13:22:04 2008 From: ahey at iee.org (Adrian Hey) Date: Tue Aug 5 13:21:47 2008 Subject: Proposal: add laziness to Data.Map / IntMap In-Reply-To: References: Message-ID: <48988C3C.1020203@iee.org> Scott Dillard wrote: > You're right of course (hadn't thought about that) but I'd like to point out > that Adrien Hey's AVL tree library is structured in the way I propose: A > node's subtree fields are not marked strict, instead seq is used to force > construction of the tree. I think it's generally faster than Data.Map, at > least it's claimed to be so in the documentation. 
The trees are not built > with the same algorithm, but if the use of seq instead of bangs does impose > a significant overhead, then that speaks very highly to the AVL algorithm. Using explicit seqs rather than strict data types is actually faster, for reasons that are a bit of a mystery to me. I'm not sure what cost Bertram is talking about, but AFAIK ghc uses the same info pointer mechanism for all heap records, including unevaluated thunks (although the info pointers will point to different things of course). But the cost of pattern matching on *evaluated* AVL nodes should be independent of strictness annotation AFAICS. Regards -- Adrian Hey From igloo at earth.li Tue Aug 5 13:40:09 2008 From: igloo at earth.li (Ian Lynagh) Date: Tue Aug 5 13:39:52 2008 Subject: [Hs-Generics] Developing SYB, packaging issues In-Reply-To: <3c6288ab0808050746p4b5bd85aqe2639318ef3a5585@mail.gmail.com> References: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> <3c6288ab0808050746p4b5bd85aqe2639318ef3a5585@mail.gmail.com> Message-ID: <20080805174009.GA1677@matrix.chaos.earth.li> On Tue, Aug 05, 2008 at 04:46:50PM +0200, Sean Leather wrote: > > I think SYB should be extracted from 'base' into a package. I'll be sending a message about this soon. Thanks Ian From haskell at list.mightyreason.com Tue Aug 5 13:48:31 2008 From: haskell at list.mightyreason.com (Chris Kuklewicz) Date: Tue Aug 5 13:48:23 2008 Subject: An even fancier Get monad, last in a series? In-Reply-To: <488FC298.2080205@list.mightyreason.com> References: <488FC298.2080205@list.mightyreason.com> Message-ID: <4898926F.9020501@list.mightyreason.com> And now MyGetW.hs [1] is even more capable: During the parsing one can use a "yieldItem y" command to queue y for delivery to the code running the Get monad. The length of this queue can be examined with "pendingItems" which returns an Int, which will be non-negative. 
To immediately pause parsing and return a Data.Sequence of yielded items to the user one calls "flushItems" which returns the sequence and a lazy value which is the future of the paused parsing. The pending queue of items is also returned every time the parser needs more input, sees an error, or completes. Thus the parser can be a lazy participant in a chain of processing. The queued items are undisturbed by fail/throwError/mzero all lookAhead* functions and any use of callCC and the continuations it returns. Thus "yieldItem" is for life and cannot be undone. The continuation returned by "callCC" was improved internally to not hold onto the old input state of the BinaryParser or the old user state. It does not use these things and should not keep them alive. Frankly, I cannot think of any more features to add. Once Haddock can be used with a released Cabal I may make a more proper release of these files. Cheers, Chris [1] MyGetW.hs and MyGet.hs and MyGetSimplified.hs are still at http://darcs.haskell.org/packages/protocol-buffers/Text/ProtocolBuffers/ From sedillard at ucdavis.edu Tue Aug 5 14:46:50 2008 From: sedillard at ucdavis.edu (Scott Dillard) Date: Tue Aug 5 14:46:32 2008 Subject: Proposal: add laziness to Data.Map / IntMap Message-ID: >Adrian Hey wrote: > > Using explicit seqs rather than strict data types is actually faster, > for reasons that are a bit of a mystery to me. I'm not sure what cost > Bertram is talking about, but AFAIK ghc uses the same info pointer > mechanism for all heap records, including unevaluated thunks (although > the info pointers will point to different things of course). But the > cost of pattern matching on *evaluated* AVL nodes should be independent > of strictness annotation AFAICS. Thanks for chiming in Adrian. Just to get started I removed the strictness annotations from the Data.Map Bin constructor, made no other changes, and ran a silly benchmark (at the end of this email). 
The version without bangs is actually faster than the version currently
shipping. I get about 10.5 sec for the lazy version and 11.5 sec for the
strict version (2.1 GHz Intel Core).

I'll repeat that in bold for people just skimming this thread:

__Removing Strictness Annotations Makes It Go Faster__

The reason, I think, is that the helper functions bin, join and balance
already provide just enough strictness, as they need to inspect the size
field. The strictness analyzer can then do its job. The case for IntMap is
trickier, as there is no implicit strictness in the code so removing the
bangs causes stack overflows. Still working on that one.

Here are the benchmarks. The lazy version also evaluates "keySum dmap"
slightly faster (repeated inserts) and it's a tie for "keySum smap"
(sequential inserts). I admit this benchmark is goofy; if you have a better
one please share.

Scott

import qualified Data.Map as Map
import Data.List as List

n = 1000000

rkeys = [ (i*122789) `mod` 1006471 | i<-[0..] ] :: [Int]
dkeys = map (`div`1000) rkeys :: [Int]
skeys = [0..] :: [Int]

shuffle (a:b:c:d:e:f:g:h:rest) = e:a:h:d:c:b:g:f: shuffle rest

keySum = List.foldl' (+) 0 . Map.keys

rmap = Map.fromList . take n . shuffle $ rkeys `zip` [0..]
dmap = Map.fromList . take n . shuffle $ dkeys `zip` [0..]
smap = Map.fromAscList . take n . shuffle $ skeys `zip` [0..]

mix (x:xs) (y:ys) = x : y : mix xs ys
mix _ [] = []
mix [] _ = []

rkeys2 = [ (i*122789) `mod` 1006471 | i<-[0..] ] :: [Int]
rlooks = [ (i*122819) `mod` 1006471 | i<-[0..] ] :: [Int]

rlook = List.foldl' (\k s -> case Map.lookup k rmap of Nothing -> s; Just x -> s+x) 0
          (take n $ rkeys2 `mix` drop 1000 rlooks)

main = print rlook -- or print (keySum dmap) or whatever

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20080805/af28bfdc/attachment-0001.htm

From jamiiecb at googlemail.com Tue Aug 5 15:03:23 2008
From: jamiiecb at googlemail.com (Jamie Brandon)
Date: Tue Aug 5 15:03:06 2008
Subject: Proposal: add laziness to Data.Map / IntMap
In-Reply-To:
References:
Message-ID: <10ed1a750808051203g426c182dk33c0153bb23408ef@mail.gmail.com>

> This of course raises the question of why I don't just use that library. I'm
> checking it out, but it's quite a bit bigger than Data.Map. The thing I like
> about Data.Map is that it's relatively simple and I can see whats going on
> very quickly, enough to make modifications (which I'm about to do.) Reading
> Data.Tree.AVL is a more daunting task.

I agree. Have a look at code.haskell.org/gmap/api . The
Data.GMap.OrdMap module is a wrapper around AvlTree, using a slightly
more complete interface than Data.Map . It should make an easier
starting point. In fact, I suspect that if you just make a new
AvlTreeL by deleting every occurrence of `seq` and then OrdMapL by
changing the import in OrdMap everything would work seamlessly.

The current version of OrdMap should be fairly safe to use. I'll be
putting a proper package on hackage in a week or two once I've tidied
everything up.

Jamie

From ahey at iee.org Wed Aug 6 03:35:34 2008
From: ahey at iee.org (Adrian Hey)
Date: Wed Aug 6 03:35:15 2008
Subject: Proposal: add laziness to Data.Map / IntMap
In-Reply-To:
References:
Message-ID: <48995446.3030602@iee.org>

Hello Scott,

Scott Dillard wrote:
> Thanks for chiming in Adrian. Just to get started I removed the strictness
> annotations from the Data.Map Bin constructor, made no other changes, and
> ran a silly benchmark (at the end of this email). The version without bangs
> is actually faster than the version currently shipping.
> I get about 10.5 sec for
> the lazy version and 11.5 sec for the strict version (2.1Ghz Intel Core)
>
> I'll repeat that in bold for people just skimming this thread:
>
> __Removing Strictness Annotations Makes It Go Faster__
>

Interesting. This seems to be more or less in agreement with my tests on
AVL insertion. I get something like 10-20% speedup by *not* using
strict constructors (providing I add explicit seqs if needed to give
me the strictness I want).

So it seems strict constructors cause some unnecessary work or inhibit
some optimisation (or something..).

Regards
--
Adrian Hey

From bertram.felgenhauer at googlemail.com Wed Aug 6 03:57:59 2008
From: bertram.felgenhauer at googlemail.com (Bertram Felgenhauer)
Date: Wed Aug 6 03:57:46 2008
Subject: Proposal: add laziness to Data.Map / IntMap
In-Reply-To: <48988C3C.1020203@iee.org>
References: <48988C3C.1020203@iee.org>
Message-ID: <20080806075758.GA4224@zombie.inf.tu-dresden.de>

Adrian Hey wrote:
> Using explicit seqs rather than strict data types is actually faster,
> for reasons that are a bit of a mystery to me. I'm not sure what cost
> Bertram is talking about, but AFAIK ghc uses the same info pointer
> mechanism for all heap records, including unevaluated thunks (although
> the info pointers will point to different things of course). But the
> cost of pattern matching on *evaluated* AVL nodes should be independent
> of strictness annotation AFAICS.

I must admit that for the time being the cost is of a theoretical
nature. But let me explain the idea.

Consider this code:

> module Nat (isOne) where
>
> data Nat = Succ !Nat | Zero
>
> isOne :: Nat -> Bool
> isOne n = case n of
>     Zero    -> False
>     Succ n' -> case n' of
>         Zero   -> True
>         Succ _ -> False

The code of isOne starts by forcing n (looking at n's tag and entering
the closure if it's unevaluated in ghc's case) and then a pattern match
(looking at the tag again).
Now for the second pattern match, we can skip the first step, because we know that n' is fully evaluated, thanks to the strictness annotation in the Succ constructor. However, ghc currently does generate the same (cmm) code for isOne regardless of the strictness annotation, so performance wise you only get to pay the price of the annotation (I expect that some thunks are unnecessarily reevaluated when the constructor is used) without this benefit. Did I miss any reason why this idea can't work? I really expected ghc to do that optimisation - obviously that was wishful thinking on my part. Bertram From chak at cse.unsw.edu.au Wed Aug 6 07:03:07 2008 From: chak at cse.unsw.edu.au (Manuel M T Chakravarty) Date: Wed Aug 6 07:03:04 2008 Subject: Where should cabal install to by default? In-Reply-To: <1217845608.7661.100.camel@localhost> References: <1217790662.7661.66.camel@localhost> <1217845608.7661.100.camel@localhost> Message-ID: <705B648C-7B84-4623-A6B7-1587D2CAFF84@cse.unsw.edu.au> Duncan Coutts: > On Mon, 2008-08-04 at 11:56 +1000, Manuel M T Chakravarty wrote: > >> Just some random remarks: > > Thanks for the useful feedback on OSX conventions. I was unaware about > several of them. Also have a look at http://developer.apple.com/tools/installerpolicy.html In particular, it says If your software is multi-platform, try to accomodate your Mac OS X users by not using /etc, /usr/local, and so on, unless your software is only accessible via the command-line. So, the appropriate install location is also dependent on the type of application. >> * Hiding installed files in a . directory is very bad style IMHO. I >> think that should never happen. > > Just on OSX or on Unix too? On Linux I'm not sure where else we could > put things that's per-user. People seem to object strenuously to > programs automatically putting files in any dir other than ~/.progname > (or one or two other .files established by specification or > convention). 
I think an installer is in a somewhat special situation here. It's not that creating files is a side-effect that the user doesn't care about. Installing files is the primary purpose of an installer. And whenever programs are to automagically appear in the PATH, you either have to modify the contents of one of the directories in the PATH or you have to alter the PATH setting (which is worse). >> Independent of whether you install right into /usr/local/bin or >> whether you symlink or whatever. You might install files >> under /usr/local/lib/cabal and then symlink, but probably its nicer >> to >> installto /usr/local/lib/- and then symlink. > > For global installs we use prefx=/usr/local at the moment, so that > mean /usr/local/lib/- for libs and /usr/local/bin > for binaries. I've just implemented the symlinking feature so we could > change the default on global installs to use the version suffix on > binaries and symlink back into /usr/local/bin. > > Or do you mean we should install binaries > into /usr/local/lib(exec?)/- and then symlink > into /usr/local/bin. That would also be reasonable I suppose, though > then the versioned binaries are not on the path. I'd prefer to only put symlinks into /usr/local/bin or similar. It makes it easier to remove things. You can still put symlinks to versioned binaries there. (That's what the Mac OS installer for GHC does, for example.) >> * On OS X, its not generally appropriate to install into /usr/local >> either. Each user has ~/Applications and ~/Library directories that >> are usually used for per-user installs. Just generally using sudo >> and >> put binaries into /usr/local is also bad because not every user will >> have admin rights on the machine. > > Oh, great. I didn't know OS X had a standard location for per-user > installs. That's excellent. So is there a preferred layout for those > dirs? I'm guessing there probably is. OS X has this concept of bundles. 
The idea is that all installed files that come with an application are
in one directory (instead of scattered across the system). So, in
/Applications and ~/Applications, there is just one dir per app.
Applications that need to store data do that in, again, just one dir
under ~/Library, and they may have another dir with preferences under
~/Library/Preferences.

For example, the version of Emacs I am using, Aquamacs, uses
~/Library/Preferences/Aquamacs\ Emacs instead of .emacs.

>> * ~/.cabal is bad on Mac OS, too.  Preferences ought to go into
>> ~/Library/Preferences/
>
> Ok. BTW, in that case we should probably fix
> System.Directory.getAppUserDataDirectory to follow the system
> convention on OSX. Currently it uses $HOME/.appname on all unix
> systems (and the Windows convention on Windows). If necessary we may
> want to split getAppUserDataDirectory into a variant for config and
> another for data.
>
> Since cabal-install is a program should it still be
> using ~/Library/Preferences/ or is there a
> corresponding ~/Applications/Preferences/ ? Where would be
> appropriate for cabal-install to put its download cache and build logs?

There is no ~/Applications/Preferences, only ~/Library/Preferences.
However, to be fair, some command-line applications with a Unix origin
use '.appname' (eg, ghc).

Manuel

From usenet at mkarcher.dialup.fu-berlin.de  Wed Aug  6 09:17:30 2008
From: usenet at mkarcher.dialup.fu-berlin.de (Michael Karcher)
Date: Wed Aug 6 09:17:18 2008
Subject: An even fancier Get monad, last in a series?
References: <488FC298.2080205@list.mightyreason.com> <4898926F.9020501@list.mightyreason.com>
Message-ID:

Chris Kuklewicz wrote:
> The length of this queue can be examined with "pendingItems" which returns an
> Int, which will be non-negative.

Wouldn't it be more in the spirit of Haskell to use a data type that has
"non-negative" already built in, like Data.Word.Word?
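
To make the trade-off in that question concrete, here is a minimal sketch. Only the name "pendingItems" and the types Int and Data.Word.Word come from the thread; the Queue type and both function names are hypothetical stand-ins, not the API of the library under discussion. One caveat worth keeping in mind: Word rules out negative values in the type, but its subtraction wraps around instead of failing.

```haskell
import Data.Word (Word)

-- Hypothetical queue type, invented for this sketch only.
newtype Queue a = Queue [a]

-- Int-returning variant: "non-negative" is only a documented promise.
pendingItemsInt :: Queue a -> Int
pendingItemsInt (Queue xs) = length xs

-- Word-returning variant: the type itself has no negative values.
pendingItemsWord :: Queue a -> Word
pendingItemsWord (Queue xs) = fromIntegral (length xs)

-- Caveat: Word arithmetic is modular, so subtracting past zero wraps
-- around to the largest Word instead of signalling an error.
wrapped :: Word
wrapped = 0 - 1
```

Loading this in GHCi and comparing `wrapped` with `maxBound :: Word` shows the wrap-around, which is one reason an Int result with a documented invariant can still be a defensible choice.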
Regards, Michael Karcher From isaacdupree at charter.net Wed Aug 6 11:24:58 2008 From: isaacdupree at charter.net (Isaac Dupree) Date: Wed Aug 6 11:24:44 2008 Subject: Where should cabal install to by default? In-Reply-To: <5d44f72f0808042237r10fe1a90uab8306f8ca3c21cd@mail.gmail.com> References: <1217790662.7661.66.camel@localhost> <1217845608.7661.100.camel@localhost> <48978559.5080803@charter.net> <5d44f72f0808042237r10fe1a90uab8306f8ca3c21cd@mail.gmail.com> Message-ID: <4899C24A.20403@charter.net> Jeffrey Yasskin wrote: > FWIW, Python recently went through this exercise and produced > http://python.org/dev/peps/pep-0370/, which does use the $HOME/.local > directory. I don't know how much that should affect Cabal's choice, > but it's a bit of prior art if you're interested. It seems $HOME/.local/bin is shared between all programs, unlike e.g. $HOME/.local/lib/python2.6/ ? Of course this is how /usr/local works already. But I wonder if we should put the actual binaries somewhere else, and just put symlinks in that "bin" directory. What if the pythoners accidentally call some program "happy" and it conflicts with the binary derived from haskell's Hackage? (Of course that would be a problem for other reasons, but it's worse if it requires overwriting unknown binaries with binaries rather than symlinks vs. symlinks.) -Isaac From sedillard at ucdavis.edu Wed Aug 6 11:51:51 2008 From: sedillard at ucdavis.edu (Scott Dillard) Date: Wed Aug 6 11:51:30 2008 Subject: Library submission process for tweaks / bugfixes Message-ID: Hi, The library submission process wiki page says to create a ticket and post it to the list when proposing interface changes, but what is the proper method for submitting bug fixes and tweaks? This post is both an inquiry into the library patching process, and a friendly reminder of some outstanding submissions of mine. 
I'm watching http://darcs.haskell.org/packages/containers and I don't see anything related to the following : 1) There is an egregious and program-breaking typo, already patched and languishing on trac : http://hackage.haskell.org/trac/ghc/ticket/2359 . IMO that needs to be pushed right now. (milestone 6.10? Whats the point of a separate containers repo?) 2) Months ago I submitted a small patch to fix a too-restrictive type signature : http://www.haskell.org/pipermail/libraries/2008-May/009677.html 3) and another to improve performance of findMin / findMax : http://www.haskell.org/pipermail/libraries/2008-May/009687.html 4) and another, with the help of Bertram F, to improve the runtime complexity of fromAscList for IntSet and IntMap : http://www.haskell.org/pipermail/libraries/2008-May/009685.html I was hoping for some kind of binary response, "ok, pushed" or "no thanks." I have more of these kinds of changes that I plan on making. Whom do I email? Should I use all caps or what? :) Clearly its not sufficient to say "propose interface changes on trac, post bugfixes to the list" because either no one with push power is reading the list, or they get their "to do" items from trac only. Which is something of a problem, because a mailing list can't generate "to do" items. On trac someone with access rights flags a ticket, but nothing like that happens on the list. And even if the proper place for submissions is trac, there is no "libraries/containers" component listed, and many library submissions are flagged "Not GHC" which sounds to me like a dismissal. Is containers still "GHC"? As I understand it, containers is a free-standing project and this list is its collective maintainer. (Which I don't really see as a solution.) Scott -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.haskell.org/pipermail/libraries/attachments/20080806/4de557a8/attachment.htm From igloo at earth.li Wed Aug 6 16:37:00 2008 From: igloo at earth.li (Ian Lynagh) Date: Wed Aug 6 16:36:41 2008 Subject: PROPOSAL: More base package breakup Message-ID: <20080806203700.GA18316@matrix.chaos.earth.li> Hi all, This is trac #1338: http://hackage.haskell.org/trac/ghc/ticket/1338#comment:14 http://hackage.haskell.org/trac/ghc/attachment/ticket/1338/packagegraph.png Initial deadline: 21 Aug (2 weeks). The base package is still a large, unwieldy beast, making it hard to develop and debug. If possible, I'd like to cut it down a bit more before the 6.10 release. I won't inline all the details here, as it's a huge amount of text and an image, but basically I'm proposing to: * Create packages: timeout, unique, concurrent, st, system, numeric, generics, version, getopt, debug, printf ghc-exts * Merge what I've listed as "control" into "containers" There's definitely a "foreign" package fighting to get out too, but that needs more work before we can set it free. An important point to note is that Simon Marlow has made a base 3 compatibility library that will come with GHC 6.10, and will provide the same interface as the base library that came with 6.8, so breaking backwards compatibility in base 4 shouldn't cause large problems like the base 2 to base 3 change did. Thanks Ian From leather at cs.uu.nl Wed Aug 6 17:38:05 2008 From: leather at cs.uu.nl (Sean Leather) Date: Wed Aug 6 17:37:44 2008 Subject: PROPOSAL: More base package breakup In-Reply-To: <20080806203700.GA18316@matrix.chaos.earth.li> References: <20080806203700.GA18316@matrix.chaos.earth.li> Message-ID: <3c6288ab0808061438t6b8bcfa0qd8898477df3aaf10@mail.gmail.com> Hi Ian, The base package is still a large, unwieldy beast, making it hard to > develop and debug. If possible, I'd like to cut it down a bit more > before the 6.10 release. 
> > I won't inline all the details here, as it's a huge amount of text and > an image, but basically I'm proposing to: > > * Create packages: > timeout, unique, concurrent, > st, > system, numeric, generics, > version, getopt, debug, printf > ghc-exts I have a very strong request to name the "generics" package something else. The reason is that this not the only flavor of generic programming. Many others are available and/or will be made available. (We [1] are working on a few.) For example, if you go to the Hackage package list [2] and grep for "generic," you'll find a number of packages, not to mention Uniplate and Strafunski (a.k.a. StrategyLib). If you go to the wiki on research papers in the area of generics [3], you find a lot. Only 2 of those cover the Data.Generics library. Naming the package "generics" is deceiving, because it conveys that it is the only (or main) generics library. I would recommend calling it "syb," because that is its more popular name based on the research that has been done, esp. considering the continuing use of "Scrap Your ..." in titles. ;) As a side note, it appears that I am not the first person to say this. [4] Thanks, Sean [1] http://www.cs.uu.nl/wiki/bin/view/GenericProgramming/WebHome [2] http://hackage.haskell.org/packages/archive/pkg-list.html [3] http://www.haskell.org/haskellwiki/Research_papers/Generics [4] http://hackage.haskell.org/trac/ghc/ticket/1338#comment:3 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.haskell.org/pipermail/libraries/attachments/20080806/8760fb3f/attachment.htm

From igloo at earth.li  Wed Aug  6 17:45:09 2008
From: igloo at earth.li (Ian Lynagh)
Date: Wed Aug 6 17:44:47 2008
Subject: PROPOSAL: More base package breakup
In-Reply-To: <3c6288ab0808061438t6b8bcfa0qd8898477df3aaf10@mail.gmail.com>
References: <20080806203700.GA18316@matrix.chaos.earth.li> <3c6288ab0808061438t6b8bcfa0qd8898477df3aaf10@mail.gmail.com>
Message-ID: <20080806214509.GA546@matrix.chaos.earth.li>

Hi Sean,

On Wed, Aug 06, 2008 at 11:38:05PM +0200, Sean Leather wrote:
>
> Naming the package "generics" is deceiving, because it conveys that it is
> the only (or main) generics library. I would recommend calling it "syb,"

"syb" is fine with me.

Thanks
Ian

From lemming at henning-thielemann.de  Thu Aug  7 03:22:19 2008
From: lemming at henning-thielemann.de (Henning Thielemann)
Date: Thu Aug 7 03:22:40 2008
Subject: Functor instance of ReaderT
In-Reply-To:
References:
Message-ID:

On Tue, 5 Aug 2008, Benja Fallenstein wrote:

> On Tue, Aug 5, 2008 at 2:19 PM, Henning Thielemann
> wrote:
> >
> > The Functor instance of ReaderT has the header:
> >   Monad m => Functor (ReaderT r m)
> >
> > I thought (Functor m) must be the appropriate constraint.
>
> Note that there *is* an advantage to having Monad as the context: not
> every monad implements Functor, even though it could. So having
> Functor as the context isn't *trivially* superior.

I know, but if a Monad instance exists, then the Functor instance can be
simply added, whereas the opposite direction is not always possible.
So, lifting the constraint from Monad to Functor can break existing
packages, but the problems are "easy to fix" (TM).
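
Both halves of that argument can be sketched in a few lines. This is only an illustration under stated assumptions: the ReaderT below is a local stand-in with the usual shape, not the library's module, and Counter, tick and example are hypothetical names invented here. The first instance shows that fmap for ReaderT needs nothing beyond Functor m; the Counter instances show the "easy fix" of deriving Functor from an existing Monad via liftM. (Current compilers also require an Applicative instance, which was not the case when this thread was written.)

```haskell
import Control.Monad (ap, liftM)

-- Local stand-in for ReaderT, so the sketch needs no extra packages.
newtype ReaderT r m a = ReaderT { runReaderT :: r -> m a }

-- The weaker context: mapping over the result only needs fmap from m.
instance Functor m => Functor (ReaderT r m) where
  fmap f (ReaderT g) = ReaderT (fmap f . g)

-- Hypothetical monad standing in for one written without Functor in mind.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

-- The "easy fix": a Functor instance obtained from the Monad operations.
instance Functor Counter where
  fmap = liftM

instance Applicative Counter where
  pure x = Counter (\n -> (x, n))
  (<*>) = ap

instance Monad Counter where
  Counter h >>= k = Counter $ \n ->
    let (a, n') = h n in runCounter (k a) n'

-- Return the current counter value and increment it.
tick :: Counter Int
tick = Counter (\n -> (n, n + 1))

-- fmap through ReaderT over Counter, using only the Functor instances.
example :: (Int, Int)
example = runCounter (runReaderT (fmap (* 2) (ReaderT (\r -> fmap (+ r) tick))) 10) 5
```

Evaluating `example` in GHCi exercises both instances at once: the environment 10 is added to the tick result, and the outer fmap doubles it.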
:-) From igloo at earth.li Thu Aug 7 11:21:11 2008 From: igloo at earth.li (Ian Lynagh) Date: Thu Aug 7 11:20:48 2008 Subject: Library submission process for tweaks / bugfixes In-Reply-To: References: Message-ID: <20080807152111.GA27973@matrix.chaos.earth.li> Hi Scott, On Wed, Aug 06, 2008 at 09:51:51AM -0600, Scott Dillard wrote: > > The library submission process wiki page says to create a ticket and post it > to the list when proposing interface changes, but what is the proper method > for submitting bug fixes and tweaks? If it's a simple bug fix, which doesn't affect the API and just fixes obviously-broken behaviour in the obviously-correct way, then just sending a patch is fine (ideally putting it in trac, so it doesn't get missed). If you're changing the interface or intended behaviour then I think it should go through the library submission process. > 1) There is an egregious and program-breaking typo, already patched and > languishing on trac : http://hackage.haskell.org/trac/ghc/ticket/2359 . IMO > that needs to be pushed right now. (milestone 6.10? Whats the point of a > separate containers repo?) 6.10.1 doesn't mean "Cannot be fixed before GHC 6.10.1 is released", but "We want to fix this very soon, and ideally before we release GHC 6.10.1". It takes a few minutes to: * look at the patch and check it looks OK * add a test to the testsuite, if applicable * check it validates, and make any necessary changes if not and, while it is only a few minutes, there are a thousand other things that we need to do that also only take a few minutes, and there are only so many minutes in the day. But if it is in trac then we should get to it. In the particular case of this ticket, I said: It would be very useful if someone could put the datatype invariants as comments in the code. and that would still be very useful for the "look at the patch and check it looks OK" step; it's hard to see if the patch is correct if you don't understand the code! 
> there is no "libraries/containers" component listed, Use "libraries (other)". > and many library submissions are > flagged "Not GHC" which sounds to me like a dismissal. "Not GHC" is for tickets that aren't tied to GHC releases, e.g. bugs in extralibs or Visual Haskell. I'm hoping that when the Haskell Platform gets going there will be another bug tracker that we can move the extralibs tickets to, where they might get more attention. If there are any GHC or bootlibs bugs in there then they should probably actually be in _|_ instead. Thanks Ian From gwern0 at gmail.com Sat Aug 9 11:51:01 2008 From: gwern0 at gmail.com (Gwern Branwen) Date: Sat Aug 9 11:52:00 2008 Subject: Unhelpful library documentation Message-ID: <20080809155101.GA13213@craft> I don't know if everyone has seen this, so I thought I'd mention this: > "Haskell API docs suck. A lot." "Haskell API documentation is very lacking for newbies. For instance, I want to understand how to create and use regexes. If you start at Text.Regex.Posix documentation, it tells you that =~ and =~~ are the high level API, and the hyperlinks for those functions go to Text.Regex.Posix.Wrap, where the main functions are not actually documented at all!" "Since Haskell libraries are almost always implemented by Haskell gurus, and they implement them with themselves in mind (I have no objection to this, they are enthusiasts working for free), they use lots of clever code and advanced Haskell techniques. But this means that if you want people to actually use these libraries (and by consequence Haskell itself), the documentation for Haskell libraries has to be about an order of magnitude better than anything you'd find anywhere else. I suspect it is at least an order of magnitude worse than for something like .NET APIs, which means that relatively speaking the documentation of Haskell is currently in an absolutely dire state." 
These are all interesting points, but I found most interesting the conclusion: "Moving forward, I guess one problem is contributing to a library's documentation. There is nothing on the API doc pages that shows you how to do this. I suspect you need to check out the source with darcs (not something I do normally, I just use cabal) and then start email patches or something. Even then, I don't know if I would contribute any documentation -- 'howto' style documentation seems out of place on the API pages, but it is desperately needed." So, what can be done here? Offhand, I want to suggest adding a link to some sort of tutorial module or wiki page. Each page *does* mention that 'Maintainer libraries@haskell.org', but this really isn't helpful; it doesn't tell you where to get the original library source code, how to edit, what libraries@ *is* (anyone with a brain in their head is going to know that libraries@ is some sort of group interface, and is going to refrain from emailing it until they know that they aren't cluelessly spamming/offending potentially hundreds of Haskellers), and so on. Even better would be a link in the Description to the source: 'Control.Concurrent.QSemN is a module in the concurrent package; you can 'darcs get' it from http://foo.... For contributing, please module Contributing/wiki page haskell.org/wiki/Contributing_to_libraries', etc. Thoughts? -- gwern plutonium industrial UHF Reaction and data POCSAG Finksburg Merlin fake -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://www.haskell.org/pipermail/libraries/attachments/20080809/8b267fcd/attachment.bin From magnus at therning.org Sat Aug 9 16:33:01 2008 From: magnus at therning.org (Magnus Therning) Date: Sat Aug 9 16:32:40 2008 Subject: Unhelpful library documentation In-Reply-To: <20080809155101.GA13213@craft> References: <20080809155101.GA13213@craft> Message-ID: <489DFEFD.3010107@therning.org> Gwern Branwen wrote: > I don't know if everyone has seen this, so I thought I'd mention this: > > >> "Haskell API docs suck. A lot." > > "Haskell API documentation is very lacking for newbies. For instance, > I want to understand how to create and use regexes. If you start at > Text.Regex.Posix documentation, it tells you that =~ and =~~ are the > high level API, and the hyperlinks for those functions go to > Text.Regex.Posix.Wrap, where the main functions are not actually > documented at all!" > > "Since Haskell libraries are almost always implemented by Haskell > gurus, and they implement them with themselves in mind (I have no > objection to this, they are enthusiasts working for free), they use > lots of clever code and advanced Haskell techniques. But this means > that if you want people to actually use these libraries (and by > consequence Haskell itself), the documentation for Haskell libraries > has to be about an order of magnitude better than anything you'd find > anywhere else. I suspect it is at least an order of magnitude worse > than for something like .NET APIs, which means that relatively > speaking the documentation of Haskell is currently in an absolutely > dire state." > > > These are all interesting points, but I found most interesting the > conclusion: > > "Moving forward, I guess one problem is contributing to a library's > documentation. There is nothing on the API doc pages that shows you > how to do this. 
I suspect you need to check out the source with darcs > (not something I do normally, I just use cabal) and then start email > patches or something. Even then, I don't know if I would contribute > any documentation -- 'howto' style documentation seems out of place on > the API pages, but it is desperately needed." > > So, what can be done here? Offhand, I want to suggest adding a link to > some sort of tutorial module or wiki page. Each page *does* mention > that 'Maintainer libraries@haskell.org', but this really isn't > helpful; it doesn't tell you where to get the original library source > code, how to edit, what libraries@ *is* (anyone with a brain in their > head is going to know that libraries@ is some sort of group interface, > and is going to refrain from emailing it until they know that they > aren't cluelessly spamming/offending potentially hundreds of > Haskellers), and so on. > > Even better would be a link in the Description to the source: > 'Control.Concurrent.QSemN is a module in the concurrent package; you > can 'darcs get' it from http://foo.... For contributing, please module > Contributing/wiki page haskell.org/wiki/Contributing_to_libraries', etc. > > Thoughts? I think the haskell.org wiki would be a good place to document how to use APIs. In my opinion the Haddock-generated documents are better kept purely as reference documentation with a pointer to the entry-point on the wiki for the library in question. I suppose something similar should be encouraged for libraries on Hackage as well. Here's what I'll do with the only library I have on there (dataenc): 1. Set the 'homepage' cabal property to point to the wiki page I created on the wiki to describe the library (http://www.haskell.org/haskellwiki/Library/Data_encoding). 2. Add a link in the Haddock comments of every module to point to the same page. 
/M -- Magnus Therning (OpenPGP: 0xAB4DFBA4) magnus?therning?org Jabber: magnus?therning?org http://therning.org/magnus Haskell is an even 'redder' pill than Lisp or Scheme. -- PaulPotts -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: OpenPGP digital signature Url : http://www.haskell.org/pipermail/libraries/attachments/20080809/ec0915a9/signature.bin From gwern0 at gmail.com Sun Aug 10 15:46:49 2008 From: gwern0 at gmail.com (Gwern Branwen) Date: Sun Aug 10 15:47:31 2008 Subject: Unhelpful library documentation In-Reply-To: <489DFEFD.3010107@therning.org> References: <20080809155101.GA13213@craft> <489DFEFD.3010107@therning.org> Message-ID: <20080810194649.GB2585@craft> On 2008.08.09 21:33:01 +0100, Magnus Therning scribbled 4.2K characters: ... > I think the haskell.org wiki would be a good place to document how to > use APIs. In my opinion the Haddock-generated documents are better kept > purely as reference documentation with a pointer to the entry-point on > the wiki for the library in question. That seems like one of the best choices to me as well. Thoughts from everyone else? Suggestions about how to actually do it? (Do we add a link to the introduction, hack haddock output in some way to add links at the bottom perhaps, or just use the usual headers - is 'homepage' one of the possible fields?) > I suppose something similar should be encouraged for libraries on > Hackage as well. Here's what I'll do with the only library I have on > there (dataenc): > > 1. Set the 'homepage' cabal property to point to the wiki page I > created on the wiki to describe the library > (http://www.haskell.org/haskellwiki/Library/Data_encoding). > 2. Add a link in the Haddock comments of every module to point to the > same page. > > /M A useful example; post a link to the Haddock when you do it, would you? 
-- gwern Goodwin Bunny Blowpipe Montenegro Bechtel SLIP SAAM NSA Ufologico Comverse -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://www.haskell.org/pipermail/libraries/attachments/20080810/8777cc02/attachment-0001.bin From simonpj at microsoft.com Mon Aug 11 03:25:11 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Mon Aug 11 03:24:40 2008 Subject: [nightly] 10-Aug-2008 build of HEAD on i386-unknown-linux (cam-02-unx.europe.corp.microsoft.com) In-Reply-To: <20080810200107.B5A303241BE@www.haskell.org> References: <20080810200107.B5A303241BE@www.haskell.org> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> | -fgenerics -Wall -fno-warn-deprecated-flags -c Data/Time/Calendar/Gregorian.hs -o | dist/build/Data/Time/Calendar/Gregorian.o -ohi dist/build/Data/Time/Calendar/Gregorian.hi | | Data/Time/Calendar/Gregorian.hs:73:9: | Warning: orphan instance: instance Show Day | | : | Failing due to -Werror. | | gmake[2]: *** [dist/build/Data/Time/Calendar/Gregorian.o] Error 1 | gmake[2]: Leaving directory `/playpen/simonmar/nightly/HEAD/i386-unknown-linux/libraries/time' | gmake[1]: *** [make.library.time] Error 2 | gmake[1]: Leaving directory `/playpen/simonmar/nightly/HEAD/i386-unknown-linux/libraries' | gmake: *** [stage1] Error 2 Now that orphan warnings are "proper warnings" as Duncan requested, and hence do the right thing with -Werror, someone should either remove this orphan (best), by moving the instance to the module that defines Day, or add -fno-warn-orphans to this module. Who is responsible for the time/ library? There may be other libraries similarly affected. 
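
For readers unfamiliar with the warning: an instance is an orphan when it is declared in a module that defines neither the class nor the type. The sketch below illustrates the first suggested remedy, keeping the instance next to the type. The Day type here is a hypothetical stand-in, and the output format of its Show instance and the helper `describe` are invented for illustration; this is not the time library's code. The second remedy is shown only as a commented-out pragma, using the flag spelling from this message.

```haskell
-- Second option, disabled here: silence the orphan warning for this one
-- module only (flag spelling as used in this thread).
-- {-# OPTIONS_GHC -fno-warn-orphans #-}

-- Hypothetical stand-in for the real Day type.
newtype Day = ModifiedJulianDay Integer

-- Declared in the same module as Day, so this instance is not an orphan
-- and -Werror has no orphan warning to turn into a failure.
instance Show Day where
  show (ModifiedJulianDay n) = "MJD " ++ show n

-- Small helper (invented) to exercise the instance.
describe :: Integer -> String
describe = show . ModifiedJulianDay
```

Moving an instance like this has a cost too: the module defining the type then has to import whatever the instance needs, which may be why the instance was split out in the first place.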
Simon From simonpj at microsoft.com Mon Aug 11 04:25:06 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Mon Aug 11 04:24:33 2008 Subject: Unhelpful library documentation In-Reply-To: <489DFEFD.3010107@therning.org> References: <20080809155101.GA13213@craft> <489DFEFD.3010107@therning.org> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE8065F57@EA-EXMSG-C334.europe.corp.microsoft.com> Lowering the barrier to entry for people to contribute to library documentation would be a Very Good Thing. There are lots of intelligent and motivated people out there! Fortunately, we have the Haskell wiki. Magnus's comments look spot-on to me. | I think the haskell.org wiki would be a good place to document how to | use APIs. In my opinion the Haddock-generated documents are better kept | purely as reference documentation with a pointer to the entry-point on | the wiki for the library in question. We could encourage every library author to: * Establish a page on the Haskell Wiki for the library (http://www.haskell.org/haskellwiki/Library/Data_encoding). * Set the 'homepage' cabal property to point to that wiki page (or, if the home page is elsewhere, that home page can point to the wiki) * Add a link in the Haddock comments of every module to point to the same page. * Make it clear that users are encouraged to write and improve the wiki documentation Perhaps such a practice should be explicitly encouraged in the guidelines for submitting a package to Hackage? Duncan, Don: perhaps these are ideas you could develop for the Haskell Platform? Simon | > These are all interesting points, but I found most interesting the | > conclusion: | > | > "Moving forward, I guess one problem is contributing to a library's | > documentation. There is nothing on the API doc pages that shows you | > how to do this. I suspect you need to check out the source with darcs | > (not something I do normally, I just use cabal) and then start email | > patches or something. 
Even then, I don't know if I would contribute | > any documentation -- 'howto' style documentation seems out of place on | > the API pages, but it is desperately needed." | > | > So, what can be done here? Offhand, I want to suggest adding a link to | > some sort of tutorial module or wiki page. Each page *does* mention | > that 'Maintainer libraries@haskell.org', but this really isn't | > helpful; it doesn't tell you where to get the original library source | > code, how to edit, what libraries@ *is* (anyone with a brain in their | > head is going to know that libraries@ is some sort of group interface, | > and is going to refrain from emailing it until they know that they | > aren't cluelessly spamming/offending potentially hundreds of | > Haskellers), and so on. | > | > Even better would be a link in the Description to the source: | > 'Control.Concurrent.QSemN is a module in the concurrent package; you | > can 'darcs get' it from http://foo.... For contributing, please module | > Contributing/wiki page haskell.org/wiki/Contributing_to_libraries', etc. | > | > Thoughts? | | I think the haskell.org wiki would be a good place to document how to | use APIs. In my opinion the Haddock-generated documents are better kept | purely as reference documentation with a pointer to the entry-point on | the wiki for the library in question. | | I suppose something similar should be encouraged for libraries on | Hackage as well. Here's what I'll do with the only library I have on | there (dataenc): | | 1. Set the 'homepage' cabal property to point to the wiki page I | created on the wiki to describe the library | (http://www.haskell.org/haskellwiki/Library/Data_encoding). | 2. Add a link in the Haddock comments of every module to point to the | same page. 
| | /M | | -- | Magnus Therning (OpenPGP: 0xAB4DFBA4) | magnus?therning?org Jabber: magnus?therning?org | http://therning.org/magnus From duncan.coutts at worc.ox.ac.uk Mon Aug 11 07:16:08 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Mon Aug 11 07:14:51 2008 Subject: [nightly] 10-Aug-2008 build of HEAD on i386-unknown-linux (cam-02-unx.europe.corp.microsoft.com) In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <1218453368.7661.376.camel@localhost> On Mon, 2008-08-11 at 08:25 +0100, Simon Peyton-Jones wrote: > | Data/Time/Calendar/Gregorian.hs:73:9: > | Warning: orphan instance: instance Show Day > | > | : > | Failing due to -Werror. > | > | gmake[2]: *** [dist/build/Data/Time/Calendar/Gregorian.o] Error 1 > | gmake[2]: Leaving directory `/playpen/simonmar/nightly/HEAD/i386-unknown-linux/libraries/time' > | gmake[1]: *** [make.library.time] Error 2 > | gmake[1]: Leaving directory `/playpen/simonmar/nightly/HEAD/i386-unknown-linux/libraries' > | gmake: *** [stage1] Error 2 > > Now that orphan warnings are "proper warnings" as Duncan requested, > and hence do the right thing with -Werror, Thank you :-) > someone should either remove this orphan (best), by moving the > instance to the module that defines Day, or add -fno-warn-orphans to > this module. > Who is responsible for the time/ library? Author: Ashley Yakeley Maintainer: however... > There may be other libraries similarly affected. I think we should not build the non-core libs with -Werror. It makes perfect sense for the core libs where the ghc team effectively maintains them, but not for non-core ones. It is for exactly this reason that hackage rejects packages that specify "ghc-options: -Werror"; new compiler warnings make old packages fail to compile. 
So there should not be many libraries affected (there are only one or two on hackage that use -Werror before we added the check to reject it). So the packages themselves don't specify -Werror. I assume it's just ghc's build system adds it for all libs, core and other. Duncan From simonpj at microsoft.com Mon Aug 11 07:27:04 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Mon Aug 11 07:26:32 2008 Subject: [nightly] 10-Aug-2008 build of HEAD on i386-unknown-linux (cam-02-unx.europe.corp.microsoft.com) In-Reply-To: <1218453368.7661.376.camel@localhost> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE80660C8@EA-EXMSG-C334.europe.corp.microsoft.com> | I think we should not build the non-core libs with -Werror. It makes | perfect sense for the core libs where the ghc team effectively maintains | them, but not for non-core ones. But time *is* a core lib. Similarly containers, pretty, filepath, directory... The full list is below. GHC is simply a client for these libraries. Should they have -Werror or not? I'm not sure. Simon | | It is for exactly this reason that hackage rejects packages that specify | "ghc-options: -Werror"; new compiler warnings make old packages fail to | compile. So there should not be many libraries affected (there are only | one or two on hackage that use -Werror before we added the check to | reject it). So the packages themselves don't specify -Werror. I assume | it's just ghc's build system adds it for all libs, core and other. 
| | Duncan utils/hsc2hs hsc2hs libraries/array packages/array libraries/base packages/base libraries/bytestring packages/bytestring libraries/Cabal packages/Cabal libraries/containers packages/containers libraries/directory packages/directory libraries/editline packages/editline libraries/filepath packages/filepath libraries/ghc-prim packages/ghc-prim libraries/haskell98 packages/haskell98 libraries/hpc packages/hpc libraries/integer-gmp packages/integer-gmp libraries/old-locale packages/old-locale libraries/old-time packages/old-time libraries/packedstring packages/packedstring libraries/pretty packages/pretty libraries/process packages/process libraries/random packages/random libraries/template-haskell packages/template-haskell libraries/unix packages/unix libraries/Win32 packages/Win32 From waldmann at imn.htwk-leipzig.de Mon Aug 11 07:54:00 2008 From: waldmann at imn.htwk-leipzig.de (Johannes Waldmann) Date: Mon Aug 11 07:53:20 2008 Subject: cabal-install Message-ID: <48A02858.3000503@imn.htwk-leipzig.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 "cabal install foo" seems a nice (in fact, indispensable) idea, but I don't see how to do the following with *one* command, for package "foo" and all its dependencies: * download and build as user, but install as root * also build and install haddockumentation for each of the packages Best regards, J.W.
-----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4-svn0 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFIoChEDqiTJ5Q4dm8RAlM3AJwPSNROTgsheAZksP3I9WrRVD/xBwCgiI0S Tpmo/YRT4J3kEq9HjkTOzqY= =YWYl -----END PGP SIGNATURE----- From duncan.coutts at worc.ox.ac.uk Mon Aug 11 08:07:46 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Mon Aug 11 08:06:21 2008 Subject: [nightly] 10-Aug-2008 build of HEAD on i386-unknown-linux (cam-02-unx.europe.corp.microsoft.com) In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE80660C8@EA-EXMSG-C334.europe.corp.microsoft.com> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <638ABD0A29C8884A91BC5FB5C349B1C32AE80660C8@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <1218456466.7661.386.camel@localhost> On Mon, 2008-08-11 at 12:27 +0100, Simon Peyton-Jones wrote: > | I think we should not build the non-core libs with -Werror. It makes > | perfect sense for the core libs where the ghc team effectively maintains > | them, but not for non-core ones. > > But time *is* a core lib. Similarly containers, pretty, filepath, > directory... The full list is below. That list indicates that "old-time" is a core lib but "time" is not. > GHC is simply a client for these libraries. Should they have -Werror > or not? I'm not sure. I'm not sure either for those core libs that have external maintainers like filepath etc, but for non-core like "time" it'd be much easier for you without -Werror. 
> utils/hsc2hs hsc2hs > libraries/array packages/array > libraries/base packages/base > libraries/bytestring packages/bytestring > libraries/Cabal packages/Cabal > libraries/containers packages/containers > libraries/directory packages/directory > libraries/editline packages/editline > libraries/filepath packages/filepath > libraries/ghc-prim packages/ghc-prim > libraries/haskell98 packages/haskell98 > libraries/hpc packages/hpc > libraries/integer-gmp packages/integer-gmp > libraries/old-locale packages/old-locale > libraries/old-time packages/old-time but no "time" > libraries/packedstring packages/packedstring > libraries/pretty packages/pretty > libraries/process packages/process > libraries/random packages/random > libraries/template-haskell packages/template-haskell > libraries/unix packages/unix > libraries/Win32 packages/Win32 Duncan From duncan.coutts at worc.ox.ac.uk Mon Aug 11 08:17:17 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Mon Aug 11 08:15:53 2008 Subject: cabal-install In-Reply-To: <48A02858.3000503@imn.htwk-leipzig.de> References: <48A02858.3000503@imn.htwk-leipzig.de> Message-ID: <1218457037.7661.393.camel@localhost> On Mon, 2008-08-11 at 13:54 +0200, Johannes Waldmann wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > "cabal install foo" seems a nice (in fact, indispensable) idea, > but I don't see how to do the following with *one* command, > for package "foo" and all its dependencies: > * download and build as user, but install as root cabal install foo --global --root-cmd=sudo That feature has not been heavily tested so let us know if you find any problems. > * also build and install haddockumentation for each of the packages The current development version supports: cabal install foo --enable-documentation If you want any of these options on by default then you can (or will be able to) set them in the ~/.cabal/config file.
Duncan From claus.reinke at talk21.com Mon Aug 11 11:46:31 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Mon Aug 11 11:45:59 2008 Subject: [nightly] 10-Aug-2008 build of HEAD on i386-unknown-linux(cam-02-unx.europe.corp.microsoft.com) References: <20080810200107.B5A303241BE@www.haskell.org><638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com><1218453368.7661.376.camel@localhost> <638ABD0A29C8884A91BC5FB5C349B1C32AE80660C8@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <011101c8fbc9$6ec4f920$33357ad5@cr3lt> | I think we should not build the non-core libs with -Werror. It makes | perfect sense for the core libs where the ghc team effectively maintains | them, but not for non-core ones. >But time *is* a core lib. Similarly containers, pretty, filepath, directory... The full list is >below. >GHC is simply a client for these libraries. Should they have -Werror or not? I'm not sure. The first time I joined a project with rcs (and in those days, that meant RCS), I was given a few firm rules, the most important of which was (emphasis added): 1. whatever I check in, the _whole_ thing has to build ok An immediate corollary was: 2. if _my_ changes break someone else's code, _I_ have to fix that In these days of distributed rcs with one build using multiple repos or even multiple rcss, those rules aren't as clear, but I'd suggest to interpret GHC+corelibs as a unit, and to apply rules 1 and 2. In fact, until the Haskell Platform is ready to take over from extralibs, it would be helpful to apply the same rules to extralibs as well. Otherwise, we'll end up here ? It highlights an important advantage of having a nontrivial set ? of extralibs in the ghc buildbot: early warnings about when and ? how ghc changes are going to break user-/library-code. ? ? Once the extralibs go, that feedback will be less immediate, the ? spread of breakage will be wider, and the current "why should ? 
Ghc Hq have to worry about network package maintenance?" ? could easily turn into a major source of friction, with both Ghc ? Hq and library maintainers insisting that the breakage isn't in their ? boat ("but HEAD fast builds just fine","but I didn't change a bit ? in my library code") and users likely to pay the price. much faster than even I feared just a week ago.. Claus ps. so that you don't have to iterate over failures: http://www.haskell.org/pipermail/cvs-ghc/2008-August/044021.html From claus.reinke at talk21.com Mon Aug 11 18:25:11 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Mon Aug 11 18:24:37 2008 Subject: [nightly] 10-Aug-2008 build of HEAD oni386-unknown-linux(cam-02-unx.europe.corp.microsoft.com) References: <20080810200107.B5A303241BE@www.haskell.org><638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com><1218453368.7661.376.camel@localhost><638ABD0A29C8884A91BC5FB5C349B1C32AE80660C8@EA-EXMSG-C334.europe.corp.microsoft.com><011101c8fbc9$6ec4f920$33357ad5@cr3lt> Message-ID: <017801c8fc01$1f93b5c0$33357ad5@cr3lt> >> 1. whatever I check in, the _whole_ thing has to build ok >> 2. if _my_ changes break someone else's code, _I_ have to fix that >> >> I'd suggest to interpret GHC+corelibs as a unit, and to apply rules >> 1 and 2. I was thinking of a unit of concern, not of any ownership or packaging (even the name "core"libs is ghc-specific: libraries ghc's build depends on). > But I do wish to point out that you cannot achieve both goals: "make > GHC and its dependencies into a single unit" and "share the libraries > with other compilers". At least, not without accepting some extra > work on the GHC side to maintain both illusions.
Usually, the first time I hear about breakage affecting nhc98 or hugs is when you or Ross submit patches fixing said breakage!-) I think you are right: if the libraries are considered to be shared, then there are several such "units" which might affect each other via those shared libraries. Perhaps it helps to spell out the dependencies. In terms of changes in one place potentially having unwanted effects elsewhere, we have ghc <-> corelibs, ghc -> extralibs hugs -> corelibs+extralibs, ??hugslibs -> hugs nhc98 -> corelibs+extralibs, ??nhclibs -> nhc98 Actually, there are several forms of dependency: - changing the libraries cannot break hugs itself, but can make it unable to load or use any libraries, including the shared ones - changing (ghc-)corelibs can break ghc - changing nhc98-corelibs (which are?) can break nhc98 - changing hugs/nhc98/ghc can render each unable to load or use some of the shared libraries (and others depending on these) Does that cover all cases? Changing ghc to make a library uncompilable by ghc only hurts ghc itself, but if the fix involves changing the library, the breakage might spread to hugs and nhc98. In principle, any changes to the shared libraries ought to be tested against all implementations sharing them. But as long as hugs and nhc98 are not part of a shared buildbot or validate system (neither of which is quite perfect even limited to ghc), breakage won't even be flagged for, let alone fixed by, submitters. While it is great that the two of you are always on the ball, is that something to be expected from all library maintainers all the time? Do all of them have to be subscribed to all of cvs-ghc, cvs-hugs, cvs-nhc98, and cvs-libraries, and fix any and all breakage arising in any and all situations, without their doing?
Making it the submitter's responsibility to look out for and fix any breakage caused by their patches simplifies the system and helps to ensure that any breakage resulting from changes is actually fixable (also in libraries outside the core+extralibs responsibility). In the time example, if ghc changes a warning flag to break the build, the immediate "fix" is to disable that new functionality, leaving it to the library maintainers to think about what the "proper" fix might be, possibly changing their code and re-enabling or leaving things as they are. Since those policies worked well in all projects where I have seen them used, I never really questioned them after I got my introduction. Is there a reason why they shouldn't or couldn't apply to the shared libraries and sharing haskell implementations? As I have mentioned, I'm worried that the move from extralibs to haskell platform will sever that responsibility, so library maintainers will have to become much more watchful for breakage affecting their packages, and more active in fixing such breakage.
Claus From claus.reinke at talk21.com Mon Aug 11 18:36:24 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Mon Aug 11 18:35:50 2008 Subject: [nightly] 10-Aug-2008 build of HEADoni386-unknown-linux(cam-02-unx.europe.corp.microsoft.com) References: <20080810200107.B5A303241BE@www.haskell.org><638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com><1218453368.7661.376.camel@localhost><638ABD0A29C8884A91BC5FB5C349B1C32AE80660C8@EA-EXMSG-C334.europe.corp.microsoft.com><011101c8fbc9$6ec4f920$33357ad5@cr3lt> <017801c8fc01$1f93b5c0$33357ad5@cr3lt> Message-ID: <019e01c8fc02$b0983b80$33357ad5@cr3lt> > ghc <-> corelibs, ghc -> extralibs > hugs -> corelibs+extralibs, ??hugslibs -> hugs > nhc98 -> corelibs+extralibs, ??nhclibs -> nhc98 > > Actually, there are several forms of dependency: > > - changing the libraries cannot break hugs itself, but can make it > unable to load or use any libraries, including the shared ones > - changing (ghc-)corelibs can break ghc > - changing nhc98-corelibs (which are?) can break nhc98 > - changing hugs/nhc98/ghc can render each unable to load use > some of the shared libraries (and others depending on these) - changing any library can render any of hugs/nhc98/ghc unable to load/use it (and others depending on it) Claus From igloo at earth.li Mon Aug 11 19:33:55 2008 From: igloo at earth.li (Ian Lynagh) Date: Mon Aug 11 19:33:18 2008 Subject: [nightly] 10-Aug-2008 build of HEAD on i386-unknown-linux (cam-02-unx.europe.corp.microsoft.com) In-Reply-To: <1218453368.7661.376.camel@localhost> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> Message-ID: <20080811233355.GA8744@matrix.chaos.earth.li> On Mon, Aug 11, 2008 at 12:16:08PM +0100, Duncan Coutts wrote: > > > Who is responsible for the time/ library? > > I think we should not build the non-core libs with -Werror. 
We don't, but: $ head -1 time/Data/Time/Calendar/Gregorian.hs {-# OPTIONS -Wall -Werror #-} (in actual fact, we don't even build GHC/bootlibs with -Werror except when validating). Thanks Ian From duncan.coutts at worc.ox.ac.uk Mon Aug 11 22:38:26 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Mon Aug 11 22:36:57 2008 Subject: [nightly] 10-Aug-2008 build of HEAD on i386-unknown-linux (cam-02-unx.europe.corp.microsoft.com) In-Reply-To: <20080811233355.GA8744@matrix.chaos.earth.li> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> Message-ID: <1218508706.7661.417.camel@localhost> On Tue, 2008-08-12 at 00:33 +0100, Ian Lynagh wrote: > On Mon, Aug 11, 2008 at 12:16:08PM +0100, Duncan Coutts wrote: > > > > > Who is responsible for the time/ library? > > > > I think we should not build the non-core libs with -Werror. > > We don't, but: > > $ head -1 time/Data/Time/Calendar/Gregorian.hs > {-# OPTIONS -Wall -Werror #-} > > (in actual fact, we don't even build GHC/bootlibs with -Werror except > when validating). Ah, so that's the culprit. Duncan From kahl at cas.mcmaster.ca Mon Aug 11 23:48:24 2008 From: kahl at cas.mcmaster.ca (kahl@cas.mcmaster.ca) Date: Tue Aug 12 00:44:30 2008 Subject: ``Orphan instances'' can be good. In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> (message from Simon Peyton-Jones on Mon, 11 Aug 2008 08:25:11 +0100) References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> Simon Peyton Jones wrote: > | Data/Time/Calendar/Gregorian.hs:73:9: > | Warning: orphan instance: instance Show Day > | [...] 
> > Now that orphan warnings are "proper warnings" as Duncan requested, and > hence do the right thing with -Werror, someone should either remove this > orphan (best), by moving the instance to the module that defines Day, I just would like to point out that there is nothing inherently bad about what GHC calls ``orphan instances''. From a code structuring point of view, I frequently ``consider orphan'' instances useful for separation of concerns. Just consider a simple, prelude-based example: Read instances tend to pull in dependencies (e.g. Parsec) that a new datatype as such does not need, and the new datatype's Read instance is also not needed everywhere where the type is needed itself. So I frequently create MyDatatypeRead modules with explicit, empty export lists, to export only the (``orphan'') instance. Orphan warnings are only an implementation-specific hint about an implementation-specific problem --- checking the GHC user manual again, I find that ``GHC tries to be clever'', and ``orphan instances'' are documented as only a situation that prolongs compile time. > or add -fno-warn-orphans to this module. I have no problem with this (of course I would consider using an OPTIONS_GHC pragma the preferable way); I just would like to emphasise that there is no implementation-independent reason to avoid ``orphan instances''. (On the implementation side, a completely different solution would be to add (automatically) re-exported instances (and rewrite rules) to the export lists stored inside .hi files --- then ``orphan instances'' would be no worse than other instances.)
Wolfram From ashley at semantic.org Tue Aug 12 00:55:57 2008 From: ashley at semantic.org (Ashley Yakeley) Date: Tue Aug 12 00:55:27 2008 Subject: Orphan Instances In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <48A117DD.8090108@semantic.org> Simon Peyton-Jones wrote: > Who is responsible for the time/ library? I am. > Now that orphan warnings are "proper warnings" as Duncan requested, What is an orphan instance, and why do we care about them? Since they weren't proper warnings before, I always assumed they were some weird GHC thing and not any kind of a concern with the code. But apparently not? -- Ashley From ashley at semantic.org Tue Aug 12 01:12:32 2008 From: ashley at semantic.org (Ashley Yakeley) Date: Tue Aug 12 01:11:57 2008 Subject: -Wall -Werror In-Reply-To: <1218508706.7661.417.camel@localhost> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost> Message-ID: <48A11BC0.4040909@semantic.org> Duncan Coutts wrote: >> $ head -1 time/Data/Time/Calendar/Gregorian.hs >> {-# OPTIONS -Wall -Werror #-} >> >> (in actual fact, we don't even build GHC/bootlibs with -Werror except >> when validating). > > Ah, so that's the culprit. I prefer this, actually, as we get to discover issues sooner rather than later. Though putting it in the .cabal file or wherever might be better. It makes sense for Hackage to reject packages that use -Wall -Werror, but for that I use a Makefile that calls cabal passing them in. 
-- Ashley Yakeley From lemming at henning-thielemann.de Tue Aug 12 02:55:51 2008 From: lemming at henning-thielemann.de (Henning Thielemann) Date: Tue Aug 12 02:55:14 2008 Subject: ``Orphan instances'' should be avoided anyway. In-Reply-To: <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> Message-ID: On Tue, 12 Aug 2008 kahl@cas.mcmaster.ca wrote: > I just would like to point out that there is nothing inherently bad about > what GHC calls ``orphan instances''. > > From a code structuring point of view, > I frequently ``consider orphan'' instances > useful for separation of concerns. The problem is that if you have a main instance of a class for a type and this one is not bundled with either the type or the class, then you are able to import the type and the class without the main instance (that is, you can accidentally miss that instance), and thus you are able to define another instance. This will likely cause a clash with the main instance sooner or later, if other modules import your custom instance and the main one.
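[Editorial illustration, not part of the original thread.] For readers unfamiliar with the term, here is a minimal single-file sketch of what GHC calls an orphan instance. It is an assumption-laden example, not code from the time package: neither the class (Show) nor the type constructor (the function arrow) is declared in this module, so GHC classifies the instance as an orphan. The OPTIONS_GHC pragma is the per-module form of the -fno-warn-orphans flag discussed above. The helper name describe is hypothetical.

```haskell
{-# OPTIONS_GHC -Wall -fno-warn-orphans #-}
-- Sketch only: an orphan instance in the sense GHC uses the term.
-- Neither Show nor (->) is defined in this module, so the instance
-- below is an orphan; the pragma above silences the orphan warning
-- for this module only.
module Main (main) where

-- Orphan instance: a Show instance for functions, declared away from
-- both the class and the type.
instance Show (a -> b) where
  showsPrec _ _ = showString "<function>"

-- 'describe' is a hypothetical helper that relies on the orphan instance.
describe :: (Int -> Int) -> String
describe f = "value: " ++ show f

main :: IO ()
main = putStrLn (describe (+ 1))
```

The clash described above would arise if a second module declared its own Show instance for functions and some third module imported both: GHC would then reject the program because of the duplicate instance declarations.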
From simonpj at microsoft.com Tue Aug 12 03:11:41 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Tue Aug 12 03:11:06 2008 Subject: -Wall -Werror In-Reply-To: <48A11BC0.4040909@semantic.org> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost> <48A11BC0.4040909@semantic.org> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> Ashley | Duncan Coutts wrote: | >> $ head -1 time/Data/Time/Calendar/Gregorian.hs | >> {-# OPTIONS -Wall -Werror #-} | >> | >> (in actual fact, we don't even build GHC/bootlibs with -Werror except | >> when validating). | > | > Ah, so that's the culprit. | | I prefer this, actually, as we get to discover issues sooner rather than | later. Though putting it in the .cabal file or wherever might be better. That's fine. But then you can choose a) add -fno-warn-orphans or b) move the (Show Day) instance to the module declaring Day. I'm not sure which is best for you, but you're the package author so you get to decide! Regardless, it'd help if you felt able to do one or the other, because currently the package won't compile at all. Simon From simonpj at microsoft.com Tue Aug 12 03:20:27 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Tue Aug 12 03:19:49 2008 Subject: Orphan Instances In-Reply-To: <48A117DD.8090108@semantic.org> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <48A117DD.8090108@semantic.org> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE8066342@EA-EXMSG-C334.europe.corp.microsoft.com> | Ashley: | What is an orphan instance, and why do we care about them? 
They are documented in the GHC manual http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation.html#orphan-modules | Wolfram: | I just would like to point out that there is nothing inherently bad about | what GHC calls ``orphan instances''. ... that there is no *implementation-independent* | reason to avoid ``orphan instances''. | | From a code structuring point of view, | I frequently ``consider orphan'' instances | useful for separation of concerns. I agree. The warning just warns you that compilation of any module that depends on this module, or on the package of which this module becomes a part, will become a little slower, for reasons explained above. | (On the implementation side, a completely different solution would be | to add (automatically) re-exported instances (and rewrite rules) | to the export lists stored inside .hi files --- | then ``orphan instances'' would be no worse than other instances.) Indeed, you could certainly accumulate in every M.hi file a list of all orphan instances anywhere below M. What GHC does instead is to accumulate a list of all the *modules that contain* orphan instances, which amounts to much the same thing. Either way it's tiresome because all these instances must be brought into scope for every compilation, even though most of them are useless. As you say, though, it's just an implementation matter. That's why it's only a warning. 
Simon From ashley at semantic.org Tue Aug 12 03:40:44 2008 From: ashley at semantic.org (Ashley Yakeley) Date: Tue Aug 12 03:40:06 2008 Subject: -Wall -Werror In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost> <48A11BC0.4040909@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <48A13E7C.8050704@semantic.org> Simon Peyton-Jones wrote: > That's fine. But then you can choose > > a) add -fno-warn-orphans > or > b) move the (Show Day) instance to the module declaring Day. > > I'm not sure which is best for you, but you're the package author so you get to decide! Patch pushed. I've plumped for option a. It's better structuring, as b would involve moving more code from Gregorian.hs to Days.hs. The concern Henning raised shouldn't apply, as both modules are hidden and re-exported by Data.Time.Calendar. I've also fixed two other modules with the same issue (again, also hidden). 
-- Ashley Yakeley From simonpj at microsoft.com Tue Aug 12 03:49:51 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Tue Aug 12 03:49:13 2008 Subject: -Wall -Werror In-Reply-To: <48A13E7C.8050704@semantic.org> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost> <48A11BC0.4040909@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> <48A13E7C.8050704@semantic.org> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE8066363@EA-EXMSG-C334.europe.corp.microsoft.com> | Patch pushed. I've plumped for option a. It's better structuring, as b | would involve moving more code from Gregorian.hs to Days.hs. | | The concern Henning raised shouldn't apply, as both modules are hidden | and re-exported by Data.Time.Calendar. I've also fixed two other modules | with the same issue (again, also hidden). Thanks. It doesn't matter whether they are hidden or not --- their instances are visible regardless in Haskell. So those interface files will be read any time you compile a module that depends on a module in the time package. But that's not a terribly big deal. 
Simon From ashley at semantic.org Tue Aug 12 03:54:04 2008 From: ashley at semantic.org (Ashley Yakeley) Date: Tue Aug 12 03:53:27 2008 Subject: -Wall -Werror In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE8066363@EA-EXMSG-C334.europe.corp.microsoft.com> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost> <48A11BC0.4040909@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> <48A13E7C.8050704@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066363@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <48A1419C.60108@semantic.org> Simon Peyton-Jones wrote: > | Patch pushed. I've plumped for option a. It's better structuring, as b > | would involve moving more code from Gregorian.hs to Days.hs. > | > | The concern Henning raised shouldn't apply, as both modules are hidden > | and re-exported by Data.Time.Calendar. I've also fixed two other modules > | with the same issue (again, also hidden). > > Thanks. It doesn't matter whether they are hidden or not --- their instances are visible regardless in Haskell. So those interface files will be read any time you compile a module that depends on a module in the time package. But that's not a terribly big deal. What I meant was, the module in which the instance is defined, and the module in which the type is defined are both hidden, and only re-exported by another module. Thus it is not possible to import the type without importing the instance. I believe this is the concern that Henning Thielemann raised. 
-- Ashley Yakeley From claus.reinke at talk21.com Tue Aug 12 05:30:10 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Tue Aug 12 05:29:35 2008 Subject: -Wall -Werror References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost><48A11BC0.4040909@semantic.org><638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> <48A13E7C.8050704@semantic.org> Message-ID: <00a401c8fc5e$05560cf0$4c298351@cr3lt> >> I'm not sure which is best for you, but you're the package author so you get to decide! While you're looking at the code for time, Ashley, I've got a question about its relation to old-time: shouldn't System.Time have a DEPRECATED pragma, pointing to time? The comments and package name say old-time is deprecated in favour of time. Claus From ashley at semantic.org Tue Aug 12 05:39:26 2008 From: ashley at semantic.org (Ashley Yakeley) Date: Tue Aug 12 05:38:48 2008 Subject: -Wall -Werror In-Reply-To: <00a401c8fc5e$05560cf0$4c298351@cr3lt> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost><48A11BC0.4040909@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> <48A13E7C.8050704@semantic.org> <00a401c8fc5e$05560cf0$4c298351@cr3lt> Message-ID: <1218533966.9627.1.camel@glastonbury> On Tue, 2008-08-12 at 10:30 +0100, Claus Reinke wrote: > >> I'm not sure which is best for you, but you're the package author so you get to decide! 
> > While you're looking at the code for time, Ashley, I've got a > question about its relation to old-time: shouldn't System.Time > have a DEPRECATED pragma, pointing to time? The comments > and package name say old-time is deprecated in favour of time. Probably. I've never touched old-time. -- Ashley Yakeley From ross at soi.city.ac.uk Tue Aug 12 08:19:13 2008 From: ross at soi.city.ac.uk (Ross Paterson) Date: Tue Aug 12 08:18:37 2008 Subject: -Wall -Werror In-Reply-To: <48A1419C.60108@semantic.org> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost> <48A11BC0.4040909@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> <48A13E7C.8050704@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066363@EA-EXMSG-C334.europe.corp.microsoft.com> <48A1419C.60108@semantic.org> Message-ID: <20080812121913.GA6462@soi.city.ac.uk> On Tue, Aug 12, 2008 at 12:54:04AM -0700, Ashley Yakeley wrote: > What I meant was, the module in which the instance is defined, and > the module in which the type is defined are both hidden, and only > re-exported by another module. Thus it is not possible to import the > type without importing the instance. Not quite: * Data.Time.Calendar exports Day without its Read instance. * Data.Time.Clock exports UTCTime without Read or Show instances. * Data.Time.LocalTime exports TimeOfDay, LocalTime, TimeZone, UTCTime and ZonedTime without Read instances. 
From marlowsd at gmail.com Tue Aug 12 10:34:01 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Tue Aug 12 10:33:25 2008 Subject: -Wall -Werror In-Reply-To: <00a401c8fc5e$05560cf0$4c298351@cr3lt> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost><48A11BC0.4040909@semantic.org><638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> <48A13E7C.8050704@semantic.org> <00a401c8fc5e$05560cf0$4c298351@cr3lt> Message-ID: <48A19F59.2070706@gmail.com> Claus Reinke wrote: >>> I'm not sure which is best for you, but you're the package author so >>> you get to decide! > > While you're looking at the code for time, Ashley, I've got a question > about its relation to old-time: shouldn't System.Time have a DEPRECATED > pragma, pointing to time? The comments and package name say old-time is > deprecated in favour of time. I looked into this; it's not quite that simple. System.Time exports ClockTime, which is still used in System.Directory.getModificationTime. So in order to properly deprecate System.Time, we have to supply an alternative to System.Directory.getModificationTime, which would introduce a dependency on the time package, and directory is currently a core package. 
Cheers, Simon From claus.reinke at talk21.com Tue Aug 12 11:06:10 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Tue Aug 12 11:05:35 2008 Subject: -Wall -Werror References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost><48A11BC0.4040909@semantic.org><638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> <48A13E7C.8050704@semantic.org> <00a401c8fc5e$05560cf0$4c298351@cr3lt> <48A19F59.2070706@gmail.com> Message-ID: <020801c8fc8c$f5e00760$4c298351@cr3lt> >> While you're looking at the code for time, Ashley, I've got a question >> about its relation to old-time: shouldn't System.Time have a DEPRECATED >> pragma, pointing to time? The comments and package name say old-time is >> deprecated in favour of time. > > I looked into this; it's not quite that simple. System.Time exports > ClockTime, which is still used in System.Directory.getModificationTime. So > in order to properly deprecate System.Time, we have to supply an > alternative to System.Directory.getModificationTime, which would introduce > a dependency on the time package, and directory is currently a core package. Thanks for checking, Simon. But wouldn't that simply mean replacing old-time with time in the corelibs, keeping old-time around for one or two releases only to get the deprecation message out? Perhaps time could even provide a compat module for the transition period, so that old-time could be dropped immediately, while current old-time clients transition from the compat module to proper time modules. 
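A minimal sketch of the transition step discussed above: the old entry point stays available for a release or two but carries a DEPRECATED pragma naming its replacement. Everything here is hypothetical and defined locally so that only base is needed: the `ClockTime` and `Timestamp` types are stand-ins, and `clockSeconds` / `toTimestamp` are illustrative names, not the old-time, time or directory APIs. Only the pragma syntax is GHC's. A whole module can be marked the same way by placing the pragma between the module name and its export list.

```haskell
-- Hypothetical stand-in for an old-style clock value:
-- whole seconds plus picoseconds since the epoch.
data ClockTime = TOD Integer Integer
  deriving (Eq, Show)

-- Hypothetical stand-in for a new-style timestamp:
-- picoseconds since the epoch.
newtype Timestamp = Timestamp Integer
  deriving (Eq, Show)

-- The function clients are expected to migrate to.
toTimestamp :: ClockTime -> Timestamp
toTimestamp (TOD secs picos) = Timestamp (secs * 10 ^ (12 :: Int) + picos)

-- The old accessor is kept for the transition period. Importing
-- modules that use it get a compile-time warning with this message.
{-# DEPRECATED clockSeconds "Use toTimestamp instead" #-}
clockSeconds :: ClockTime -> Integer
clockSeconds (TOD secs _) = secs
```

The warning is only emitted for uses in other modules, so existing clients keep compiling while being told where to move.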
Claus From marlowsd at gmail.com Tue Aug 12 11:43:55 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Tue Aug 12 11:43:22 2008 Subject: -Wall -Werror In-Reply-To: <020801c8fc8c$f5e00760$4c298351@cr3lt> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <1218453368.7661.376.camel@localhost> <20080811233355.GA8744@matrix.chaos.earth.li> <1218508706.7661.417.camel@localhost><48A11BC0.4040909@semantic.org><638ABD0A29C8884A91BC5FB5C349B1C32AE806633F@EA-EXMSG-C334.europe.corp.microsoft.com> <48A13E7C.8050704@semantic.org> <00a401c8fc5e$05560cf0$4c298351@cr3lt> <48A19F59.2070706@gmail.com> <020801c8fc8c$f5e00760$4c298351@cr3lt> Message-ID: <48A1AFBB.1090406@gmail.com> Claus Reinke wrote: >>> While you're looking at the code for time, Ashley, I've got a >>> question about its relation to old-time: shouldn't System.Time have a >>> DEPRECATED pragma, pointing to time? The comments and package name >>> say old-time is deprecated in favour of time. >> >> I looked into this; it's not quite that simple. System.Time exports >> ClockTime, which is still used in >> System.Directory.getModificationTime. So in order to properly >> deprecate System.Time, we have to supply an alternative to >> System.Directory.getModificationTime, which would introduce a >> dependency on the time package, and directory is currently a core >> package. > > Thanks for checking, Simon. But wouldn't that simply mean > replacing old-time with time in the corelibs, keeping old-time around > for one or two releases only to get the deprecation message out? Perhaps > time could even provide a compat > module for the transition period, so that old-time could be > dropped immediately, while current old-time clients transition > from the compat module to proper time modules. I don't think it's straightforward to implement System.Time in terms of Data.Time, so we really have to bring in time. 
Also, we have to replace System.Directory.getModificationTime (I suppose it should return UTCTime?), and hence we'll need a compat version of directory... or call the new function something different. Cheers, Simon From kahl at cas.mcmaster.ca Tue Aug 12 13:38:28 2008 From: kahl at cas.mcmaster.ca (kahl@cas.mcmaster.ca) Date: Tue Aug 12 13:38:23 2008 Subject: ``Orphan instances'' should be avoided anyway. In-Reply-To: (message from Henning Thielemann on Tue, 12 Aug 2008 08:55:51 +0200 (MEST)) References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> Message-ID: <20080812173828.3155.qmail@schroeder.cas.mcmaster.ca> Henning Thielemann wrote: > > On Tue, 12 Aug 2008 kahl@cas.mcmaster.ca wrote: > > > I just would like to point out that there is nothing inherently bad about > > what GHC calls ``orphan instances''. > > > > From a code structuring point of view, > > I frequently ``consider orphan'' instances > > useful for separation of concerns. > > The problem is, that if you have a main instance of a class for a type and > this one is not bundled with either the type or the class, then you are > able to import the type and the class without the main instance (that is, > you can accidentally miss that instance) Or on purpose --- this is in fact another use of ``orphan instances'' I forgot to mention. > , and thus you are able to define another instance. Indeed --- this is the only way to have different instances for the same class, as long as we do not have something like the ``named instances'' of our Haskell-2001 paper (shameless plug ;-). > This will likely cause clash with the main instance > sooner or later, if other modules import your custom instance and the main > one. If there are several instances, there is very likely no ``main instance''. 
Wolfram From jonathanccast at fastmail.fm Tue Aug 12 13:42:56 2008 From: jonathanccast at fastmail.fm (Jonathan Cast) Date: Tue Aug 12 13:44:34 2008 Subject: ``Orphan instances'' should be avoided anyway. In-Reply-To: <20080812173828.3155.qmail@schroeder.cas.mcmaster.ca> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> <20080812173828.3155.qmail@schroeder.cas.mcmaster.ca> Message-ID: <1218562976.1523.20.camel@jcchost> On Tue, 2008-08-12 at 17:38 +0000, kahl@cas.mcmaster.ca wrote: > Henning Thielemann wrote: > > > > On Tue, 12 Aug 2008 kahl@cas.mcmaster.ca wrote: > > > > > I just would like to point out that there is nothing inherently bad about > > > what GHC calls ``orphan instances''. > > > > > > From a code structuring point of view, > > > I frequently ``consider orphan'' instances > > > useful for separation of concerns. > > > > The problem is, that if you have a main instance of a class for a type and > > this one is not bundled with either the type or the class, then you are > > able to import the type and the class without the main instance (that is, > > you can accidentally miss that instance) > > Or on purpose --- this is in fact another use of ``orphan instances'' > I forgot to mention. > > > , and thus you are able to define another instance. > > Indeed --- this is the only way to have different instances > for the same class, as long as we do not have something like > the ``named instances'' of our Haskell-2001 paper (shameless plug ;-). > > > This will likely cause clash with the main instance > > sooner or later, if other modules import your custom instance and the main > > one. > > If there are several instances, > there is very likely no ``main instance''. If there is no main instance, there should very likely be no instance at all. 
We already have named instances:

    data ShowDict alpha = ShowDict alpha {
        namedShows :: alpha -> String -> String
      }

    show :: ?namedShow :: ShowDict alpha => alpha -> String
    show x = namedShows ?namedShow x ""

Confusing this with type classes seems mostly redundant to me.

jcc

From gale at sefer.org Tue Aug 12 18:06:26 2008
From: gale at sefer.org (Yitzchak Gale)
Date: Tue Aug 12 18:05:46 2008
Subject: ``Orphan instances'' should be avoided anyway.
In-Reply-To: <1218562976.1523.20.camel@jcchost>
References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> <20080812173828.3155.qmail@schroeder.cas.mcmaster.ca> <1218562976.1523.20.camel@jcchost>
Message-ID: <2608b8a80808121506j4e2bd234l5fa4a99bf551466c@mail.gmail.com>

Wolfram wrote:
>> Or on purpose --- this is in fact another use of ``orphan instances''
>> I forgot to mention...
>> Indeed --- this is the only way to have different instances
>> for the same class, as long as we do not have something like
>> the ``named instances'' of our Haskell-2001 paper (shameless plug ;-).

Henning Thielemann wrote:
>>> This will likely cause clash with the main instance
>>> sooner or later, if other modules import your custom instance and the main
>>> one.

>> If there are several instances,
>> there is very likely no ``main instance''.

Jonathan Cast wrote:
> If there is no main instance, there should very likely be no instance at
> all. We already have named instances...
> Confusing this with type classes seems mostly redundant to me.

This argument, or something like it, is raised whenever someone mentions the need to define multiple instances of a class for the same type. And it is correct, theoretically.

But in real life, you often need to write code against existing modules that you can't change. When an existing module exports an instance that is inconvenient, you can be in deep trouble.
We are desperately in need of a solution to this problem. If not Wolfram's "named instances", then at least there must be some way to control the import and export of instances, just as we can now control the import and export of every other kind of symbol binding.

Thanks,
Yitz

From duncan.coutts at worc.ox.ac.uk Tue Aug 12 20:04:53 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Tue Aug 12 20:16:18 2008
Subject: PROPOSAL: More base package breakup
In-Reply-To: <20080806203700.GA18316@matrix.chaos.earth.li>
References: <20080806203700.GA18316@matrix.chaos.earth.li>
Message-ID: <1218585893.7661.472.camel@localhost>

On Wed, 2008-08-06 at 21:37 +0100, Ian Lynagh wrote:
> Hi all,
>
> This is trac #1338:
> http://hackage.haskell.org/trac/ghc/ticket/1338#comment:14
> http://hackage.haskell.org/trac/ghc/attachment/ticket/1338/packagegraph.png
>
> Initial deadline: 21 Aug (2 weeks).

Generally this looks good if a little on the ultra-fine-grained side. If we can fold back a few of those tiny single-module packages that'd probably be good.

I think we should not move the Applicative class however. It belongs in the same package as Functor and Monad (and we should be encouraging everything to be an instance of Applicative).

Similarly, I think the Monoid class should stay in base. I don't care so much about the extra Monoid types currently defined in Data.Monoid but the class should stay.

I don't think it makes any sense to put Monoid and Applicative in the containers package. They're almost completely unrelated.

One reason is that classes (but not their instances) are common interfaces and so belong further down in the package dep graph. People tend to try to reduce package dependencies so making people depend on containers to give an applicative instance will mean that in many cases people simply will not.

I'll let other people comment on Data.Foldable and .Traversable since I don't use them.
Duncan

From lemming at henning-thielemann.de Wed Aug 13 03:07:48 2008
From: lemming at henning-thielemann.de (Henning Thielemann)
Date: Wed Aug 13 03:07:07 2008
Subject: ``Orphan instances'' should be avoided anyway.
In-Reply-To: <2608b8a80808121506j4e2bd234l5fa4a99bf551466c@mail.gmail.com>
References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> <20080812173828.3155.qmail@schroeder.cas.mcmaster.ca> <1218562976.1523.20.camel@jcchost> <2608b8a80808121506j4e2bd234l5fa4a99bf551466c@mail.gmail.com>
Message-ID:

On Wed, 13 Aug 2008, Yitzchak Gale wrote:
> Jonathan Cast wrote:
>> If there is no main instance, there should very likely be no instance at
>> all. We already have named instances...
>> Confusing this with type classes seems mostly redundant to me.
>
> This argument, or something like it, is raised whenever someone
> mentions the need to define multiple instances of a class for the
> same type. And it is correct, theoretically.
>
> But in real life, you often need to write code against existing modules
> that you can't change. When an existing module exports an instance
> that is inconvenient, you can be in deep trouble.
>
> We are desperately in need of a solution to this problem. If not
> Wolfram's "named instances", then at least there must be some
> way to control the import and export of instances, just as we can
> now control the import and export of every other kind of symbol
> binding.

For me it's most often the case that an instance is missing. If there is no way to change existing instance definitions, then you must use 'newtype'. Generalized newtype deriving makes it easier to carry over the instances you do want.
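As a small sketch of the newtype approach, assuming a made-up wrapper `Desc` around Int: the wrapper inherits the instances that are fine as they are (Eq and Num, via GHC's GeneralizedNewtypeDeriving extension) and supplies its own replacement for the one that is inconvenient (Ord). The names `Desc` and `sortDesc` are illustrative only.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.List (sort)

-- | Int with its ordering reversed. Eq and Num are reused from Int
-- through newtype deriving; Show is derived in the ordinary way.
newtype Desc = Desc Int
  deriving (Eq, Show, Num)

-- | The one instance we want to differ from Int's: compare backwards.
instance Ord Desc where
  compare (Desc a) (Desc b) = compare b a

-- | Sort in descending order by routing through the wrapper.
sortDesc :: [Int] -> [Int]
sortDesc = map unwrap . sort . map Desc
  where
    unwrap (Desc n) = n
```

The cost is the wrapping and unwrapping at the boundary, which is the inconvenience the other posters are pointing at; the benefit is that no second instance for Int itself is ever defined, so nothing can clash.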
From lemming at henning-thielemann.de Wed Aug 13 03:15:01 2008
From: lemming at henning-thielemann.de (Henning Thielemann)
Date: Wed Aug 13 03:14:20 2008
Subject: PROPOSAL: More base package breakup
In-Reply-To: <1218585893.7661.472.camel@localhost>
References: <20080806203700.GA18316@matrix.chaos.earth.li> <1218585893.7661.472.camel@localhost>
Message-ID:

On Wed, 13 Aug 2008, Duncan Coutts wrote:
> On Wed, 2008-08-06 at 21:37 +0100, Ian Lynagh wrote:
>> Hi all,
>>
>> This is trac #1338:
>> http://hackage.haskell.org/trac/ghc/ticket/1338#comment:14
>> http://hackage.haskell.org/trac/ghc/attachment/ticket/1338/packagegraph.png
>>
>> Initial deadline: 21 Aug (2 weeks).
>
> Generally this looks good if a little on the ultra-fine-grained side. If
> we can fold back a few of those tiny single-module packages that'd
> probably be good.
>
> I think we should not move the Applicative class however. It belongs in
> the same package as Functor and Monad (and we should be encouraging
> everything to be an instance of Applicative).

+1

> Similarly, I think the Monoid class should stay in base. I don't care so
> much about the extra Monoid types currently defined in Data.Monoid but
> the class should stay.
>
> I don't think it makes any sense to put Monoid and Applicative in the
> containers package. They're almost completely unrelated.

+1

> One reason is that classes (but not their instances) are common
> interfaces and so belong further down in the package dep graph. People
> tend to try to reduce package dependencies so making people depend on
> containers to give an applicative instance will mean that in many cases
> people simply will not.

+1

> I'll let other people comment on Data.Foldable and .Traversable since I
> don't use them.

For me Foldable and Traversable belong to where Applicative is. Boring comment, isn't it?
From wnoise at ofb.net Wed Aug 13 04:07:33 2008 From: wnoise at ofb.net (Aaron Denney) Date: Wed Aug 13 04:07:01 2008 Subject: ``Orphan instances'' should be avoided anyway. References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> <20080812173828.3155.qmail@schroeder.cas.mcmaster.ca> <1218562976.1523.20.camel@jcchost> <2608b8a80808121506j4e2bd234l5fa4a99bf551466c@mail.gmail.com> Message-ID: On 2008-08-12, Yitzchak Gale wrote: > Wolfram wrote: >>> Or on purpose --- this is in fact another use of ``orphan instances'' >>> I forgot to mention... >>> Indeed --- this is the only way to have different instances >>> for the same class, as long as we do not have something like >>> the ``named instances'' of our Haskell-2001 paper (shameless plug ;-). > > Henning Thielemann wrote: >>>> This will likely cause clash with the main instance >>>> sooner or later, if other modules import your custom instance and the main >>>> one. > >>> If there are several instances, >>> there is very likely no ``main instance''. > > Jonathan Cast wrote: >> If there is no main instance, there should very likely be no instance at >> all. We already have named instances... >> Confusing this with type classes seems mostly redundant to me. > > This argument, or something like it, is raised whenever someone > mentions the need to define multiple instances of a class for the > same type. And it is correct, theoretically. > > But in real life, you often need to write code against existing modules > that you can't change. When an existing module exports an instance > that is inconvenient, you can be in deep trouble. Or, you can just use newtype. 
-- Aaron Denney -><- From malcolm.wallace at cs.york.ac.uk Wed Aug 13 04:29:03 2008 From: malcolm.wallace at cs.york.ac.uk (Malcolm Wallace) Date: Wed Aug 13 04:28:22 2008 Subject: ``Orphan instances'' should be avoided anyway. In-Reply-To: <1218562976.1523.20.camel@jcchost> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> <20080812173828.3155.qmail@schroeder.cas.mcmaster.ca> <1218562976.1523.20.camel@jcchost> Message-ID: > We already have named instances: > > data ShowDict alpha = ShowDict alpha { > namedShows :: alpha -> String -> String > } This is not valid Haskell. Either all components of the data structure are named fields, or none. It is not possible to have a mixture of the two, as here. In addition, there are many classes it is not currently possible to simulate using this technique, because the types go significantly beyond Haskell'98. I believe you need both Rank-2 Types and Polymorphic Components to achieve comparable expressivity. Fuller details on the Haskell-Prime site: http://hackage.haskell.org/trac/haskell-prime/wiki/PolymorphicComponents Regards, Malcolm From Christian.Maeder at dfki.de Wed Aug 13 07:09:36 2008 From: Christian.Maeder at dfki.de (Christian Maeder) Date: Wed Aug 13 07:08:53 2008 Subject: Orphan Instances In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE8066342@EA-EXMSG-C334.europe.corp.microsoft.com> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <48A117DD.8090108@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066342@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <48A2C0F0.7000408@dfki.de> Simon Peyton-Jones wrote: > | Ashley: > | What is an orphan instance, and why do we care about them? 
> > They are documented in the GHC manual > http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation.html#orphan-modules GHC identifies orphan modules, and visits the interface file of every orphan module below the module being compiled. This is usually wasted work, but there is no avoiding it. You should therefore do your best to have as few orphan modules as possible. [...] > I agree. The warning just warns you that compilation of any module that depends on this module, or on the package of which this module becomes a part, will become a little slower, for reasons explained above. [...] > Indeed, you could certainly accumulate in every M.hi file a list of all orphan instances anywhere below M. What GHC does instead is to accumulate a list of all the *modules that contain* orphan instances, which amounts to much the same thing. Either way it's tiresome because all these instances must be brought into scope for every compilation, even though most of them are useless. I still don't understand why ghc will become "slower" with orphaned modules. Where is "wasted work" or which "instances are useless"? Doesn't ghc just read all interface files of modules in the import chain (i.e. all modules "below")? Or is that the "disaster in practice, so GHC tries to be clever"? In what way is GHC clever? Are only interface files of directly imported modules (plus orphaned modules mentioned in there) read in? Is there a difference if I compile each file individually or if I use "ghc --make"? If I compile a single module M that does not need instances from an orphaned module, this orphaned module wouldn't be in the import chain and therefore I would expect the compilation of M to be faster. (Here I assume that I've only orphaned instances in orphaned modules.) Conversely, If all my instances are not orphaned I'll always have instances in scope that I may not need in some importing module. Could someone enlighten me? 
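For readers who want a concrete picture of the term: an instance is an orphan when the module that defines it defines neither the class nor the type. The following single-file sketch, which assumes nothing beyond base, contains one orphan and one ordinary instance. The `Show` instance for functions and the `Celsius` type are purely illustrative; GHC's orphan warning (`-fwarn-orphans`, spelled `-Worphans` in newer releases) would flag only the first.

```haskell
-- Orphan: neither the class (Show) nor the type constructor ((->))
-- is defined in this module, so importers cannot find this instance
-- by looking where the class or the type lives.
instance Show (a -> b) where
  show _ = "<function>"

-- Not an orphan: the type is defined in the same module as the
-- instance, so anyone who can see Celsius can see its Show instance.
newtype Celsius = Celsius Double

instance Show Celsius where
  show (Celsius d) = show d ++ " C"

-- Uses both instances.
describe :: String
describe = show (+ (1 :: Int)) ++ ", " ++ show (Celsius 21.5)
```

Because the first instance is attached to nothing defined locally, a compiler that otherwise loads interface files only on demand has no trigger for loading this one, which is why modules like it have to be tracked separately.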
Cheers Christian

From benjovi at gmx.net Wed Aug 13 09:45:50 2008
From: benjovi at gmx.net (Benedikt Huber)
Date: Wed Aug 13 09:45:09 2008
Subject: ANN: Initial release of Language.C (language-c-0.3)
Message-ID: <4FD16766-82FF-4FF8-9E71-DEEE6D7F62AE@gmx.net>

Hi all,

I'm pleased to announce the first release of Language.C, a library for analysing and generating C code.

This release features

* A reasonably well tested parser handling and recording all of C99 and most GNU extensions, most notably gcc's attribute syntax.

* A pretty printer generating source code from the AST, covering the same language subset as the parser.

* A preview of the analysis framework, including functionality for dissecting C's cruel type and variable declarations.

Places:

* The project's homepage is located at http://www.sivity.net/projects/language.c (Getting Started, Bug Tracker)

* The package is available via hackage (language-c-0.3)

* darcs repo: http://code.haskell.org/language-c

* API docs: http://code.haskell.org/~bhuber/docs/language-c-latest/

The library originated from the C-related code in c2hs (http://www.cse.unsw.edu.au/~chak/haskell/c2hs/), and is the topic of my SoC project, mentored by Iavor Diatchki. (Iavor, Don Stewart and Duncan Coutts provided great support, thank you)

Feedback and suggestions in any form are most welcome, especially because there is a large range of features which could be implemented. The current status and a few ideas for the next releases are summarized at http://www.sivity.net/projects/language.c/wiki/ProjectPlan.
best regards,
benedikt

From simonpj at microsoft.com Wed Aug 13 09:51:33 2008
From: simonpj at microsoft.com (Simon Peyton-Jones)
Date: Wed Aug 13 09:50:52 2008
Subject: Orphan Instances
In-Reply-To: <48A2C0F0.7000408@dfki.de>
References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <48A117DD.8090108@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066342@EA-EXMSG-C334.europe.corp.microsoft.com> <48A2C0F0.7000408@dfki.de>
Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE8066AE8@EA-EXMSG-C334.europe.corp.microsoft.com>

| I still don't understand why ghc will become "slower" with orphaned
| modules. Where is "wasted work" or which "instances are useless"?
| Doesn't ghc just read all interface files of modules in the import chain
| (i.e. all modules "below")? Or is that the "disaster in practice, so GHC
| tries to be clever"? In what way is GHC clever? Are only interface files
| of directly imported modules (plus orphaned modules mentioned in there)
| read in?

In the *absence* of orphan modules, GHC reads as few interface files as possible. It must read the interface of every *directly-imported* module. After that, it's by-need only. For example

        module Foo where
        import Prelude
        x = ()

GHC must read Prelude.hi, but needs read nothing else to compile the module.

Now suppose it's like this instead

        module Foo where
        import Prelude
        x = map

        module Prelude( map, filter ) where
        import GHC.Map( map )
        import GHC.Filter( filter )

Now when compiling Foo, GHC reads Prelude.hi, and sees that GHC.Map.map is brought into scope. Since that function is *used* in Foo, GHC also reads GHC.Map.hi to find GHC.Map.map's type, unfolding, arity, strictness etc etc. But it doesn't read GHC.Filter.

In the *presence* of orphan modules, perhaps somewhere in the transitive closure of modules imported by Prelude, GHC must read those interface files too.
We store a list of all orphan modules transitively below Prelude inside Prelude.hi, precisely so GHC knows which ones to read. Does that help? (If so, and you find it helpful, would you like to add some advice or information to the GHC wiki? So that those not reading this thread right now might be illuminated later.) Simon From Christian.Maeder at dfki.de Wed Aug 13 11:36:33 2008 From: Christian.Maeder at dfki.de (Christian Maeder) Date: Wed Aug 13 11:35:49 2008 Subject: Orphan Instances In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE8066AE8@EA-EXMSG-C334.europe.corp.microsoft.com> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <48A117DD.8090108@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066342@EA-EXMSG-C334.europe.corp.microsoft.com> <48A2C0F0.7000408@dfki.de> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066AE8@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <48A2FF81.8090704@dfki.de> Simon Peyton-Jones wrote: > In the *absence* of orphan modules, GHC reads as few interface files as possible. It must read the interface of every *directly-imported* module. After that, it's by-need only. For example > module Foo where > import Prelude > x = () > GHC must read Prelude.hi, but needs read nothing else to compile the module. > > Now suppose it's like this instead > module Foo where > import Prelude > x = map > > module Prelude( map, filter ) where > import GHC.Map( map ) > import GHC.Filter( filter ) > > Now when compiling Foo, GHC reads Prelude.hi, and sees that GHC.Map.map is brought into scope. Since that function is *used* in Foo, GHC also reads GHC.Map.hi to find GHC.Map.map's type, unfolding, arity, strictness etc etc. But it doesn't read GHC.Filter. > > In the *presence* of orphan modules, perhaps somewhere in the transitive closure of modules imported by Prelude, GHC must read those interface files too. 
We store a list of all orphan modules transitively below Prelude inside Prelude.hi, precisely so GHC knows which ones to read. > > Does that help? (If so, and you find it helpful, would you like to add some advice or information to the GHC wiki? So that those not reading this thread right now might be illuminated later.) Thanks for this explanation. I'ld rather like if someone else took my ignorance as feedback for improving the documentation, though. From your description I would conclude that the overhead wrt. orphaned instances comes from "wrapper" modules that basically reexport other modules (including orphaned modules). Or am I mistaken here, because instances are always reexported? Is there a difference for the situation: 1. module A data T instance C T and 2. module A (module TA) import TA import IA module IA import TA instance C T module TA data T except reading 3 instead of 1 interface files, as happens if I would split up an other module? Maybe it is not worth discussing about an overhead (or a disadvantage of orphaned modules) at all, since it is obviously faster if I only import a data type from a library without certain instances when I don't need these instances. Cheers Christian From jonathanccast at fastmail.fm Wed Aug 13 12:08:36 2008 From: jonathanccast at fastmail.fm (Jonathan Cast) Date: Wed Aug 13 12:10:15 2008 Subject: ``Orphan instances'' should be avoided anyway. In-Reply-To: References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <20080812034824.1592.qmail@schroeder.cas.mcmaster.ca> <20080812173828.3155.qmail@schroeder.cas.mcmaster.ca> <1218562976.1523.20.camel@jcchost> Message-ID: <1218643716.1523.31.camel@jcchost> On Wed, 2008-08-13 at 09:29 +0100, Malcolm Wallace wrote: > > We already have named instances: > > > > data ShowDict alpha = ShowDict alpha { > > namedShows :: alpha -> String -> String > > } > > This is not valid Haskell. 
I must have been distracted. I meant: data ShowDict alpha = ShowDict { namedShows :: alpha -> String -> String } obviously. > Either all components of the data > structure are named fields, or none. It is not possible to have a > mixture of the two, as here. > > In addition, there are many classes it is not currently possible to > simulate using this technique, because the types go significantly > beyond Haskell'98. My proposal already required implicit parameters... jcc From dons at galois.com Wed Aug 13 16:02:30 2008 From: dons at galois.com (Don Stewart) Date: Wed Aug 13 16:02:00 2008 Subject: ANN: Initial release of Language.C (language-c-0.3) In-Reply-To: <4FD16766-82FF-4FF8-9E71-DEEE6D7F62AE@gmx.net> References: <4FD16766-82FF-4FF8-9E71-DEEE6D7F62AE@gmx.net> Message-ID: <20080813200230.GA6886@scytale.galois.com> I think the bug tracker/home page is down. Any ideas? benjovi: > Hi all, > > I'm pleased to announce the first release of Language.C, a library for > analysing and generating C code. > > This release features > > * A reasonably well tested parser handling and recording all of > C99 and most GNU extensions, most notably gcc's attribute syntax. > > * A pretty printer generating source code from the AST, covering > the same language subset as the parser. > > * A preview of the analysis framework, including functionality for > dissecting C's cruel type and variable declarations. > > Places: > > * The project's homepage is located at > http://www.sivity.net/projects/language.c (Getting Started, Bug Tracker) > > * The package is available via hackage (language-c-0.3) > > * darcs repo: http://code.haskell.org/language-c > > * API docs: http://code.haskell.org/~bhuber/docs/language-c-latest/ > > The library originated from the C-related code in c2hs > (http://www.cse.unsw.edu.au/~chak/haskell/c2hs/ ), and is the topic of my > SoC project, mentored by Iavor Diatchki. 
(Iavor, Don Stewart and Duncan
> Coutts provided great support, thank you)
>
> Feedback and suggestions in any form are most welcome, especially
> because there is a large range of features which could be implemented.
> The current status and a few ideas for the next releases are
> summarized at http://www.sivity.net/projects/language.c/wiki/ProjectPlan.
>
> best regards,
> benedikt

From duncan.coutts at worc.ox.ac.uk Wed Aug 13 08:38:02 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Wed Aug 13 18:25:17 2008
Subject: Unhelpful library documentation
In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE8065F57@EA-EXMSG-C334.europe.corp.microsoft.com>
References: <20080809155101.GA13213@craft> <489DFEFD.3010107@therning.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065F57@EA-EXMSG-C334.europe.corp.microsoft.com>
Message-ID: <1218631082.7661.541.camel@localhost>

On Mon, 2008-08-11 at 09:25 +0100, Simon Peyton-Jones wrote:
> Lowering the barrier to entry for people to contribute to library documentation would be a Very Good Thing. There are lots of intelligent and motivated people out there!
>
> Fortunately, we have the Haskell wiki. Magnus's comments look spot-on to me.
>
> | I think the haskell.org wiki would be a good place to document how to
> | use APIs. In my opinion the Haddock-generated documents are better kept
> | purely as reference documentation with a pointer to the entry-point on
> | the wiki for the library in question.
>
> We could encourage every library author to:
> * Establish a page on the Haskell Wiki for the library
> (http://www.haskell.org/haskellwiki/Library/Data_encoding).
> * Set the 'homepage' cabal property to point to that wiki page
> (or, if the home page is elsewhere, that home page can point to
> the wiki)
> * Add a link in the Haddock comments of every module to point to the
> same page.
> * Make it clear that users are encouraged to write and improve
> the wiki documentation
>
> Perhaps such a practice should be explicitly encouraged in the
> guidelines for submitting a package to Hackage?
>
> Duncan, Don: perhaps these are ideas you could develop for the Haskell
> Platform?

Hmm. I think the 'homepage' property should point to the package's homepage. For many projects that may well be a wiki site, but bigger projects often have both a homepage and some user-editable wiki. So perhaps it wants another property for wiki links.

Further, haddock can link each module or entity to a different uri:

  --comments-base=URL    URL for a comments link on the contents
                         and index pages
  --comments-module=URL  URL for a comments link for each module
                         (using the %{MODULE} var)
  --comments-entity=URL  URL for a comments link for each entity
                         (using the %{FILE}, %{MODULE}, %{NAME},
                         %{KIND} or %{LINE} vars)

We should think about what convention we want there. Eg single page per package with #refs for modules, though bigger packages may want a wiki page per-module with #refs for each entity. This can mostly be encoded into the url itself with var references. The slightly annoying thing is that the urls for the base, module and entity are not necessarily derivable from each other, though in many common cases they are.
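A rough sketch of what "encoded into the url itself with var references" could look like in practice. This is a hypothetical helper, not Haddock's own implementation: `expand` simply substitutes `%{VAR}` references in a template, and the two templates (on the made-up host wiki.example.org) stand for the per-package and per-module conventions mentioned above.

```haskell
import Data.List (isPrefixOf)

-- | Replace every @%{VAR}@ reference in a template with its value.
-- References to variables not in the list are left untouched.
expand :: [(String, String)] -> String -> String
expand vars = go
  where
    go [] = []
    go s@(c:cs) =
      case [ (val, drop (length key) s)
           | (name, val) <- vars
           , let key = "%{" ++ name ++ "}"
           , key `isPrefixOf` s ] of
        ((val, rest):_) -> val ++ go rest   -- substitute and continue
        []              -> c : go cs        -- ordinary character

-- Hypothetical templates for the two conventions:
-- one wiki page per package, with an anchor per module ...
perPackage :: String
perPackage = "http://wiki.example.org/Data_encoding#%{MODULE}"

-- ... or one wiki page per module, with an anchor per entity.
perModule :: String
perModule = "http://wiki.example.org/Data_encoding/%{MODULE}#%{NAME}"
```

For example, `expand [("MODULE", "Codec.Binary.Base64"), ("NAME", "encode")] perModule` yields the per-module page for Codec.Binary.Base64 with `#encode` as the anchor.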
Perhaps that just needs a convention (like dropping the tail of the url after %{MODULE} to get the base url) eg:

wiki: http://www.haskell.org/haskellwiki/Library/Data_encoding/%{MODULE}#%{NAME}

gives us:

--comments-base=http://www.haskell.org/haskellwiki/Library/Data_encoding/
--comments-module=http://www.haskell.org/haskellwiki/Library/Data_encoding/%{MODULE}
--comments-entity=http://www.haskell.org/haskellwiki/Library/Data_encoding/%{MODULE}#%{NAME}

That is, not only did we drop "%{NAME}" to get the module one, we also dropped the "#".

Another example:

wiki: http://wiki.project-foo.org/?module=%{MODULE}#%{NAME}

This works less well since we'd get:

--comments-base=http://wiki.project-foo.org/?module=
--comments-module=http://wiki.project-foo.org/?module=%{MODULE}
--comments-entity=http://wiki.project-foo.org/?module=%{MODULE}#%{NAME}

but perhaps it's good enough.

Duncan

From haskell at list.mightyreason.com Thu Aug 14 06:10:41 2008
From: haskell at list.mightyreason.com (Chris Kuklewicz)
Date: Thu Aug 14 09:39:03 2008
Subject: ANN: protocol-buffers (bootstrap works) version 0.1.0
In-Reply-To: <4877F538.9020300@list.mightyreason.com>
References: <4877F538.9020300@list.mightyreason.com>
Message-ID: <48A404A1.7060407@list.mightyreason.com>

Announcing the Haskell version of protocol-buffers, version 0.1.0. This is still pre-beta.

The darcs repository is at http://darcs.haskell.org/packages/protocol-buffers/
There is an (untested) tarball at http://hackage.haskell.org/cgi-bin/hackage-scripts/package/protocol-buffers
The original google version of protocol buffers is at http://code.google.com/apis/protocolbuffers/docs/overview.html

After tinkering for a while, the Haskell protocol-buffers package is now able to bootstrap the descriptor.proto file from the google source. The lexer & parser can handle the full unittest.proto from the google source. The Lexer.hs module is created from "Lexer.x" by the Alex program (I am using Alex version 2.2).
The basics for the wire protocol are there, but have not been tested and need a small high level API. There is no actual program yet. The closest thing to an entry point is the "Text.ProtocolBuffers.Bootstrap" module which shows how I generated the modules from descriptor.proto.

Notes on the current implementation:
  The messages become Haskell record data types with 1 constructor and an individual module for namespace management.
    The fields become Haskell record names,
      optional fields are wrapped in Maybe,
      repeated fields are wrapped in Seq (from Data.Sequence).
  Enumerations become Haskell data types with an individual module for namespace management.
    The enum values become different Haskell constructors for the type (no arguments).
  Wire protocol implemented on top of binary Get and Put monads (to and from Lazy ByteString).
  Some reflection is possible (via type classes).

Todo:
  Make the Lexer check that TYPE_STRING default values are valid UTF-8 (easy).
  The handling of default string/bytes in Gen.hs is not quite right and will need to be fixed (easy).
  The next things to add are full support for TYPE_GROUP and then support for extensions (easy and medium).
  Check that the Lexer properly handles /* block style */ comments.
  Test the wire protocol versus google's implementation.
  I have not even started looking at service/method/rpc support.

Cheers,
Chris Kuklewicz

From benjovi at gmx.net Thu Aug 14 11:49:20 2008
From: benjovi at gmx.net (Benedikt Huber)
Date: Thu Aug 14 11:48:42 2008
Subject: ANN: Initial release of Language.C (language-c-0.3)
In-Reply-To: <20080813200230.GA6886@scytale.galois.com>
References: <4FD16766-82FF-4FF8-9E71-DEEE6D7F62AE@gmx.net> <20080813200230.GA6886@scytale.galois.com>
Message-ID: <48A45400.5020807@gmx.net>

Don Stewart wrote:
> I think the bug tracker/home page is down. Any ideas?

Language.C's project page went down for a few hours yesterday, sorry about that. After switching trac to use mod_python it should work faster and more reliably now.
Btw, keigoi posted a very nice, small example (Language.C + Data.Generics) here: http://d.hatena.ne.jp/syd_syd/20080813#p1 (in japanese). -- benedikt > > benjovi: >> Hi all, >> >> I'm pleased to announce the first release of Language.C, a library for >> analysing and generating C code. >> >> This release features >> >> * A reasonably well tested parser handling and recording all of >> C99 and most GNU extensions, most notably gcc's attribute syntax. >> >> * A pretty printer generating source code from the AST, covering >> the same language subset as the parser. >> >> * A preview of the analysis framework, including functionality for >> dissecting C's cruel type and variable declarations. >> >> Places: >> >> * The project's homepage is located at >> http://www.sivity.net/projects/language.c (Getting Started, Bug Tracker) >> >> * The package is available via hackage (language-c-0.3) >> >> * darcs repo: http://code.haskell.org/language-c >> >> * API docs: http://code.haskell.org/~bhuber/docs/language-c-latest/ >> >> The library originated from the C-related code in c2hs >> (http://www.cse.unsw.edu.au/~chak/haskell/c2hs/ ), and is the topic of my >> SoC project, mentored by Iavor Diatchki. (Iavor, Don Steward and Duncan >> Coutts provided great support, thank you) >> >> Feedback and suggestions in any form are most welcome, especially >> because there is a large range of features which could be implemented. >> The current status and a few ideas for the next releases are >> summarized at http://www.sivity.net/projects/language.c/wiki/ >> ProjectPlan. 
>> >> best regards, >> benedikt >> _______________________________________________ >> Libraries mailing list >> Libraries@haskell.org >> http://www.haskell.org/mailman/listinfo/libraries From isaacdupree at charter.net Thu Aug 14 13:22:49 2008 From: isaacdupree at charter.net (Isaac Dupree) Date: Thu Aug 14 13:22:00 2008 Subject: Orphan Instances In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE8066AE8@EA-EXMSG-C334.europe.corp.microsoft.com> References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <48A117DD.8090108@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066342@EA-EXMSG-C334.europe.corp.microsoft.com> <48A2C0F0.7000408@dfki.de> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066AE8@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <48A469E9.4080108@charter.net> Simon Peyton-Jones wrote: > Now when compiling Foo, GHC reads Prelude.hi, and sees that GHC.Map.map is brought into scope. Since that function is *used* in Foo, GHC also reads GHC.Map.hi to find GHC.Map.map's type, unfolding, arity, strictness etc etc. But it doesn't read GHC.Filter. > > In the *presence* of orphan modules, perhaps somewhere in the transitive closure of modules imported by Prelude, GHC must read those interface files too. We store a list of all orphan modules transitively below Prelude inside Prelude.hi, precisely so GHC knows which ones to read. Perhaps Prelude.hi could, instead of storing a *list* of orphan modules, store all the orphan instances themselves? Obviously that would waste a bit more space (less now that instances don't contain big function bodies in the HEAD; more because RULES can be orphan too, and numerous). Would it be any faster loading or is it just a bad idea? 
:-)

-Isaac

From Christian.Maeder at dfki.de Fri Aug 15 05:04:21 2008
From: Christian.Maeder at dfki.de (Christian Maeder)
Date: Fri Aug 15 05:03:28 2008
Subject: Orphan Instances
In-Reply-To: <48A469E9.4080108@charter.net>
References: <20080810200107.B5A303241BE@www.haskell.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8065EE6@EA-EXMSG-C334.europe.corp.microsoft.com> <48A117DD.8090108@semantic.org> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066342@EA-EXMSG-C334.europe.corp.microsoft.com> <48A2C0F0.7000408@dfki.de> <638ABD0A29C8884A91BC5FB5C349B1C32AE8066AE8@EA-EXMSG-C334.europe.corp.microsoft.com> <48A469E9.4080108@charter.net>
Message-ID: <48A54695.5090301@dfki.de>

Isaac Dupree wrote:
> Perhaps Prelude.hi could, instead of storing a *list* of orphan modules,
> store all the orphan instances themselves? Obviously that would waste a
> bit more space (less now that instances don't contain big function
> bodies in the HEAD; more because RULES can be orphan too, and
> numerous). Would it be any faster loading or is it just a bad idea? :-)

I don't see a performance issue, and if there is one, a ticket should be created for it (since there are sometimes good reasons for orphan modules -- and good reasons to avoid them, too!)

Cheers
Christian

From ross at soi.city.ac.uk Fri Aug 15 08:44:05 2008
From: ross at soi.city.ac.uk (Ross Paterson)
Date: Fri Aug 15 08:43:18 2008
Subject: proposal #2517: remove 'pure' method from Arrow class
Message-ID: <20080815124405.GA7518@soi.city.ac.uk>

The Arrow class as originally defined by John Hughes had methods arr, >>> and first. (>>> has since been moved by #1773 to the Category class.) When writing the "Fun of Programming" paper, I added pure as a synonym for arr, because Richard Bird preferred it. However this name hasn't caught on, and now it clashes with a method in the Applicative class, so I propose to remove it.
The usual practice would be to deprecate the name in one release and remove it in the following one, but I propose to remove it in one step because * no-one seems to be using this name, and * backward compatibility has been broken anyway by the Category split (#1773). The only people who will be bitten by the change are those who import Control.Arrow hiding pure, and they wouldn't be warned by deprecation. From isaacdupree at charter.net Fri Aug 15 09:58:10 2008 From: isaacdupree at charter.net (Isaac Dupree) Date: Fri Aug 15 09:57:14 2008 Subject: proposal #2517: remove 'pure' method from Arrow class In-Reply-To: <20080815124405.GA7518@soi.city.ac.uk> References: <20080815124405.GA7518@soi.city.ac.uk> Message-ID: <48A58B72.3010509@charter.net> Ross Paterson wrote: > The Arrow class as originally defined by John Hughes had methods arr, >>> > and first. (>>> has since been moved by #1773 to the Category class.) > When writing the "Fun of Programming" paper, I added pure as a synonym > for arr, because Richard Bird preferred it. However this name hasn't > caught on, and now it clashes with a method in the Applicative class, > so I propose to remove it. I agree with deleting 'pure' from there. Now assuming that, I'm wondering... It looks like all Arrows are Applicative; is that a useful observation? (and then even further off topic, but it's still important); As follows: <*> :: (Applicative f) => f (a -> b) -> f a -> f b so <*> :: (Arrow arr) => arr x (a -> b) -> arr x a -> arr x b fmap :: (Arrow arr) => (a -> b) -> arr x a -> arr x b pure :: (Applicative f) => a -> f a pure :: (Arrow arr) => a -> arr x a Interesting, it looks a bit similar to 'Reader x', is that okay? instance (Arrow arr) => Applicative (arr x) where pure a = arr (const a) fmap f a = a >>> arr f f <*> a = fmap (uncurry ($)) (f &&& a) But I seem to recall arrConst indeed being of significance, for example. 
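A minimal sketch of how the instance above could be made to compile, under some assumptions: the newtype FromArrow and its field runFromArrow are hypothetical names introduced only for this illustration (an instance head of the bare form "arr x" would overlap with every other two-parameter type), and fmap is moved into a separate Functor instance, since it is not a method of Applicative. The wrapper is similar in spirit to the WrappedArrow type in Control.Applicative.

```haskell
import Control.Arrow (Arrow, arr, (&&&), (>>>))

-- Hypothetical wrapper, named here only for illustration.
newtype FromArrow arrow x a = FromArrow { runFromArrow :: arrow x a }

-- fmap post-composes a pure function onto the arrow's output.
instance Arrow arrow => Functor (FromArrow arrow x) where
  fmap f (FromArrow a) = FromArrow (a >>> arr f)

instance Arrow arrow => Applicative (FromArrow arrow x) where
  -- Ignore the input and always produce the given value.
  pure a = FromArrow (arr (const a))
  -- Run the function-producing arrow first, then the argument arrow,
  -- both on the same input, and apply the results.
  FromArrow f <*> FromArrow a =
    FromArrow ((f &&& a) >>> arr (uncurry ($)))
```

With plain functions as the arrow, runFromArrow ((+) <$> FromArrow (* 2) <*> FromArrow (+ 10)) 3 feeds 3 to both sides and adds the results, which is the 'Reader x' behaviour noted above.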
And the above definition of <*> preserves the ordering of f and a; for example, the effect-order-reversing Applicative-transformer could accurately be applied here. So I think it's nontrivially useful. Not all Applicatives are Arrows though!

ArrowPlus, of course, corresponds with Alternative; the full types of the appending operation's arguments must be identical in all cases.

instance (ArrowPlus arr) => Alternative (arr x) where
  empty = zeroArrow; (<|>) = (<+>)

and thenceforth with Monoid

instance (ArrowPlus arr) => Monoid (arr a b) where
  mempty = zeroArrow; mappend = (<+>)

It makes me wonder what is the point of ArrowPlus, MonadPlus, Alternative... when we have Monoid.

But are they usable in all the same situations? It seems some places that use
  f :: (MonadPlus f) => ...
would indeed need to require the polymorphic (invented syntax)
  f :: (forall a. Monoid (f a)) => ...
not just for some particular 'a'
  f :: (Monoid (f a)) => ...

Why does that make sense? Should it? Did I get confused somehow?

-Isaac

From dave at zednenem.com Fri Aug 15 12:55:07 2008
From: dave at zednenem.com (David Menendez)
Date: Fri Aug 15 12:54:18 2008
Subject: proposal #2517: remove 'pure' method from Arrow class
In-Reply-To: <20080815124405.GA7518@soi.city.ac.uk>
References: <20080815124405.GA7518@soi.city.ac.uk>
Message-ID: <49a77b7a0808150955k3b1634b4q5817eeafb0311227@mail.gmail.com>

On Fri, Aug 15, 2008 at 8:44 AM, Ross Paterson wrote:
> The Arrow class as originally defined by John Hughes had methods arr, >>>
> and first. (>>> has since been moved by #1773 to the Category class.)
> When writing the "Fun of Programming" paper, I added pure as a synonym
> for arr, because Richard Bird preferred it. However this name hasn't
> caught on, and now it clashes with a method in the Applicative class,
> so I propose to remove it.

I agree.

Personally, I prefer 'pure' to 'arr' for arrows, and I wish we could have called the operation in Applicative something different, but it's too late for that.
-- Dave Menendez From dave at zednenem.com Fri Aug 15 13:37:25 2008 From: dave at zednenem.com (David Menendez) Date: Fri Aug 15 13:36:36 2008 Subject: proposal #2517: remove 'pure' method from Arrow class In-Reply-To: <48A58B72.3010509@charter.net> References: <20080815124405.GA7518@soi.city.ac.uk> <48A58B72.3010509@charter.net> Message-ID: <49a77b7a0808151037y2d2a8046i26cee43830f60090@mail.gmail.com> On Fri, Aug 15, 2008 at 9:58 AM, Isaac Dupree wrote: > It looks like all Arrows are Applicative; is that a useful observation? The WrappedArrow type in Control.Applicative creates an applicative functor from any arrow. The paper "Idioms are oblivious, arrows are meticulous, monads are promiscuous" by Lindley, Wadler, and Yallop has a good explanation of the relationship between arrows and applicative functors. > ArrowPlus, of course, corresponds with Alternative; the full types of the > appending operation's arguments must be identical in all cases. > instance (Arrow arr) => Alternative (arr x) where > empty = zeroArrow; (<|>) = (<+>) > and thenceforth with Monoid > instance (Arrow arr) => Alternative (arr a b) where > mempty = zeroArrow; mappend = (<+>) > . It makes me wonder what is the point of ArrowPlus, MonadPlus, > Alternative... when we have Monoid. > > But are they usable in all the same situations? It seems some places that > use > f :: (MonadPlus f) => ... > would indeed need to require the polymorphic (invented syntax) > f :: (forall a. Monoid (f a)) => ... > not just for some particular 'a' > f :: (Monoid (f a)) => ... > > Why does that make sense? Should it? Did I get confused somehow? MonadPlus is more restrictive than Monoid in (at least) two ways. First, the instances of MonadPlus have kind * -> *, whereas the instances of Monoid have kind *. With Monoid, you can easily constrain a type constructor's parameter, e.g.: instance (Foo a) => Monoid (Bar a) where ... With MonadPlus, this is not possible. 
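To make the first point concrete, here is a small sketch of a Monoid instance that constrains the type constructor's parameter. MinOf is a hypothetical type invented for this illustration (it plays the role of "Bar" above, with Ord in place of "Foo"). The Semigroup instance is an assumption about the compiler in use: current versions of base require it as a superclass of Monoid, whereas in 2008 a Monoid instance defined mappend directly.

```haskell
-- Hypothetical wrapper: the monoid of "smallest value seen so far".
newtype MinOf a = MinOf (Maybe a)
  deriving (Eq, Show)

-- The instance is only available when the parameter has an ordering.
instance Ord a => Semigroup (MinOf a) where
  MinOf Nothing  <> y              = y
  x              <> MinOf Nothing  = x
  MinOf (Just a) <> MinOf (Just b) = MinOf (Just (min a b))

instance Ord a => Monoid (MinOf a) where
  mempty = MinOf Nothing
```

No comparable MonadPlus instance can be written for MinOf: mplus has type m a -> m a -> m a for every a, so the instance has nowhere to demand Ord a.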
Second, instances of MonadPlus must obey additional laws governing their relationship to the Monad operations. In addition to mplus and mzero forming a monoid, mzero must[1] be a left zero for (>>=), mzero >>= f = mzero and (>>=) must[2] left-distribute over mplus, mplus a b >>= f = mplus (a >>= f) (b >>= f) The differences between Monoid, ArrowPlus, and Alternative are similar, although I don't recall seeing laws stated for Alternative. [1] Has anyone examined whether it's possible to violate this law while still satisfying the monad and monoid laws? [2] Not everyone agrees with this law. In fact, the instance for Maybe in Control.Monad doesn't satisfy it. Generally, any monad which uses mplus for exception handling instead of non-determinism will not satisfy this law. -- Dave Menendez From ashley at semantic.org Sat Aug 16 04:06:29 2008 From: ashley at semantic.org (Ashley Yakeley) Date: Sat Aug 16 04:05:37 2008 Subject: proposal #2517: remove 'pure' method from Arrow class In-Reply-To: <49a77b7a0808150955k3b1634b4q5817eeafb0311227@mail.gmail.com> References: <20080815124405.GA7518@soi.city.ac.uk> <49a77b7a0808150955k3b1634b4q5817eeafb0311227@mail.gmail.com> Message-ID: <48A68A85.7090207@semantic.org> David Menendez wrote: > Personally, I prefer 'pure' to 'arr' for arrows, and I wish we could > have called the operation in Applicative something different, but it's > too late for that. Hopefully one day the Applicative function will be called "return". 
-- Ashley From ross at soi.city.ac.uk Sat Aug 16 08:20:41 2008 From: ross at soi.city.ac.uk (Ross Paterson) Date: Sat Aug 16 08:19:51 2008 Subject: proposal #2517: remove 'pure' method from Arrow class In-Reply-To: <48A68A85.7090207@semantic.org> References: <20080815124405.GA7518@soi.city.ac.uk> <49a77b7a0808150955k3b1634b4q5817eeafb0311227@mail.gmail.com> <48A68A85.7090207@semantic.org> Message-ID: <20080816122041.GA7148@soi.city.ac.uk> On Sat, Aug 16, 2008 at 01:06:29AM -0700, Ashley Yakeley wrote: > David Menendez wrote: > >> Personally, I prefer 'pure' to 'arr' for arrows, and I wish we could >> have called the operation in Applicative something different, but it's >> too late for that. > > Hopefully one day the Applicative function will be called "return". Certainly Functor (fmap) PreMonad (return) Applicative ((<*>)) Monad ((>>=)/join) would be the rational hierarchy, but don't hold your breath. From bertram.felgenhauer at googlemail.com Sat Aug 16 09:31:36 2008 From: bertram.felgenhauer at googlemail.com (Bertram Felgenhauer) Date: Sat Aug 16 09:30:49 2008 Subject: darcs patch: Fix oversight in Control.OldException Message-ID: <1218893496.12804@zombie> Sat Aug 16 15:26:31 CEST 2008 Bertram Felgenhauer * Fix oversight in Control.OldException The NonTermination constructor slipped through in the Exception instance. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/x-darcs-patch Size: 16614 bytes Desc: A darcs patch for your repository! Url : http://www.haskell.org/pipermail/libraries/attachments/20080816/d0153aec/attachment-0001.bin From rwbarton at math.harvard.edu Sat Aug 16 10:51:14 2008 From: rwbarton at math.harvard.edu (Reid Barton) Date: Sat Aug 16 10:51:00 2008 Subject: laziness of intersperse Message-ID: (This is the same issue as http://www.haskell.org/pipermail/haskell/ 2004-March/013739.html but there was no follow-up at that time.) 
The intersperse library function is not as lazy as it could be. The current definition of intersperse is intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse _ [x] = [x] intersperse sep (x:xs) = x : sep : intersperse sep xs For any list (x:xs) not containing _|_, intersperse sep (x:xs) is a list of the form (x:...); yet intersperse sep (x:_|_) = _|_ because the pattern match on the second equation diverges. A better definition would be intersperse _ [] = [] intersperse sep (x:xs) = x : intersperseWithPrefix xs where intersperseWithPrefix [] = [] intersperseWithPrefix (x:xs) = sep : x : intersperseWithPrefix xs (slightly modified from http://www.haskell.org/pipermail/haskell/2004- March/013741.html) An application: There was a question on #haskell about how to compute the "ruler" sequence [1,2,1,3,1,2,1,4,1,2,1,3,1,2,1,5,...]. The definition ruler = fix ((1:) . intersperse 1 . map (1+)) works with the properly lazy intersperse, but not with the intersperse in Data.List. Comments on this new definition? Can it get added to Data.List? Regards, Reid Barton From gwern0 at gmail.com Sat Aug 16 12:13:21 2008 From: gwern0 at gmail.com (Gwern Branwen) Date: Sat Aug 16 12:13:14 2008 Subject: laziness of intersperse In-Reply-To: References: Message-ID: <20080816161321.GA21525@craft> On 2008.08.16 10:51:14 -0400, Reid Barton scribbled 1.3K characters: > (This is the same issue as http://www.haskell.org/pipermail/haskell/ > 2004-March/013739.html but there was no follow-up at that time.) > > The intersperse library function is not as lazy as it could be. The > current definition of intersperse is > > intersperse :: a -> [a] -> [a] > intersperse _ [] = [] > intersperse _ [x] = [x] > intersperse sep (x:xs) = x : sep : intersperse sep xs > > For any list (x:xs) not containing _|_, intersperse sep (x:xs) is a list > of the form (x:...); yet intersperse sep (x:_|_) = _|_ because the > pattern match on the second equation diverges. 
A better definition would
> be
>
>   intersperse _ [] = []
>   intersperse sep (x:xs) = x : intersperseWithPrefix xs
>     where intersperseWithPrefix [] = []
>           intersperseWithPrefix (x:xs) = sep : x : intersperseWithPrefix xs
>
> (slightly modified from http://www.haskell.org/pipermail/haskell/2004-March/013741.html)
>
> An application: There was a question on #haskell about how to compute
> the "ruler" sequence [1,2,1,3,1,2,1,4,1,2,1,3,1,2,1,5,...]. The
> definition
>
>   ruler = fix ((1:) . intersperse 1 . map (1+))
>
> works with the properly lazy intersperse, but not with the intersperse in
> Data.List.
>
> Comments on this new definition? Can it get added to Data.List?
>
> Regards,
> Reid Barton

I assume you mean something like this (setting up for QC and removing some aliasing in intersperse')

{-# LANGUAGE NoMonomorphismRestriction #-}
import Test.QuickCheck (quickCheck)

intersperse :: a -> [a] -> [a]
intersperse _ [] = []
intersperse _ [x] = [x]
intersperse sep (x:xs) = x : sep : intersperse sep xs

intersperse' :: a -> [a] -> [a]
intersperse' _ [] = []
intersperse' sep (x:xs) = x : intersperseWithPrefix xs
  where intersperseWithPrefix [] = []
        intersperseWithPrefix (y:ys) = sep : y : intersperseWithPrefix ys

prop = \x y -> intersperse x y == intersperse' x y

----

Well, I ran a few thousand QuickChecks. Looks good to me, although I'm not thrilled with the loss of clarity - intersperse' to me is less immediately understandable than intersperse. (Although I don't suppose it's really all that important.)

--
gwern

From igloo at earth.li Sat Aug 16 13:04:44 2008
From: igloo at earth.li (Ian Lynagh)
Date: Sat Aug 16 13:03:57 2008
Subject: darcs patch: Fix oversight in Control.OldException
In-Reply-To: <1218893496.12804@zombie>
References: <1218893496.12804@zombie>
Message-ID: <20080816170443.GA20954@matrix.chaos.earth.li>

On Sat, Aug 16, 2008 at 03:31:36PM +0200, Bertram Felgenhauer wrote:
> Sat Aug 16 15:26:31 CEST 2008 Bertram Felgenhauer
> * Fix oversight in Control.OldException
> The NonTermination constructor slipped through in the Exception instance.

Applied, thanks!

Ian

From igloo at earth.li Sat Aug 16 13:10:51 2008
From: igloo at earth.li (Ian Lynagh)
Date: Sat Aug 16 13:09:59 2008
Subject: laziness of intersperse
In-Reply-To:
References:
Message-ID: <20080816171051.GB20954@matrix.chaos.earth.li>

Hi Reid,

On Sat, Aug 16, 2008 at 10:51:14AM -0400, Reid Barton wrote:
> (This is the same issue as http://www.haskell.org/pipermail/haskell/2004-March/013739.html
> but there was no follow-up at that time.)

Since then we have created the library submissions procedure:
http://www.haskell.org/haskellwiki/Library_submissions

If you follow that then everyone interested can have a say. We can then make the change if there is a consensus for it, without the issue getting forgotten about.

Thanks
Ian

From duncan.coutts at worc.ox.ac.uk Sat Aug 16 12:29:49 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Sat Aug 16 13:11:08 2008
Subject: laziness of intersperse
In-Reply-To:
References:
Message-ID: <1218904189.13639.97.camel@localhost>

On Sat, 2008-08-16 at 10:51 -0400, Reid Barton wrote:
> (This is the same issue as http://www.haskell.org/pipermail/haskell/2004-March/013739.html
> but there was no follow-up at that time.)
> > The intersperse library function is not as lazy as it could be. The > current definition of intersperse is > > intersperse :: a -> [a] -> [a] > intersperse _ [] = [] > intersperse _ [x] = [x] > intersperse sep (x:xs) = x : sep : intersperse sep xs > > For any list (x:xs) not containing _|_, intersperse sep (x:xs) is a > list of the form (x:...); yet intersperse sep (x:_|_) = _|_ because > the pattern match on the second equation diverges. A better > definition would be > > intersperse _ [] = [] > intersperse sep (x:xs) = x : intersperseWithPrefix xs > where intersperseWithPrefix [] = [] > intersperseWithPrefix (x:xs) = sep : x : > intersperseWithPrefix xs > > (slightly modified from http://www.haskell.org/pipermail/haskell/2004- > March/013741.html) > > An application: There was a question on #haskell about how to compute > the "ruler" sequence [1,2,1,3,1,2,1,4,1,2,1,3,1,2,1,5,...]. The > definition > > ruler = fix ((1:) . intersperse 1 . map (1+)) > > works with the properly lazy intersperse, but not with the > intersperse in Data.List. > > Comments on this new definition? Can it get added to Data.List? I think I've brought this up before too. While doing laziness testing (as part of the list fusion work) Don and I discovered that our new intersperse implementation was less strict than the H98 report version. We used: intersperse :: a -> [a] -> [a] intersperse _ [] = [] intersperse sep (x0:xs0) = x0 : go xs0 where go [] = [] go (x:xs) = sep : x : go xs which is more lazy and also faster than the standard implementation. It's pretty clear that the Haskell-prime List spec should use a version with this strictness property since the principle (as I understand it) was for the List module functions to be as lazy as possible. 
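The strictness difference discussed in this thread can be sketched as follows. The names intersperseLazy and intersperseH98 are made up for this illustration so that neither clashes with Data.List.intersperse; the first follows the "go" version quoted above, the second the three-equation definition quoted at the start of the thread.

```haskell
import Data.Function (fix)

-- Lazier version: yields the first element before inspecting the tail.
intersperseLazy :: a -> [a] -> [a]
intersperseLazy _   []       = []
intersperseLazy sep (x0:xs0) = x0 : go xs0
  where
    go []     = []
    go (x:xs) = sep : x : go xs

-- Three-equation version: the [x] pattern forces the tail of the list
-- before any output is produced.
intersperseH98 :: a -> [a] -> [a]
intersperseH98 _   []     = []
intersperseH98 _   [x]    = [x]
intersperseH98 sep (x:xs) = x : sep : intersperseH98 sep xs

-- The "ruler" sequence from the original post. Each element depends only
-- on earlier ones, so the definition is productive with the lazy version.
ruler :: [Integer]
ruler = fix ((1:) . intersperseLazy 1 . map (1+))

-- take 8 ruler evaluates to [1,2,1,3,1,2,1,4]
```

Here head (intersperseLazy 0 (1 : undefined)) returns 1, while the same expression with intersperseH98 hits the undefined tail, which is why ruler cannot be built on top of the stricter definition.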
Duncan From bart at cs.pdx.edu Sat Aug 16 13:22:43 2008 From: bart at cs.pdx.edu (Bart Massey) Date: Sat Aug 16 13:22:04 2008 Subject: laziness of intersperse References: <20080816161321.GA21525@craft> Message-ID: A cleaner but still fully lazy version of intersperse might be intersperse sep (x1 : x2 : xs) = x1 : sep : intersperse sep (x2 : xs) intersperse _ l = l (I'd quote to give some context, but gmane says I can't unless I artificially pad the length of my new text. Is there any way to post to this list other than gmane?) From nominolo at googlemail.com Sat Aug 16 19:42:53 2008 From: nominolo at googlemail.com (Thomas Schilling) Date: Sat Aug 16 19:41:59 2008 Subject: laziness of intersperse In-Reply-To: References: <20080816161321.GA21525@craft> Message-ID: <916b84820808161642i538ffe1dp651c9a902b6e54d7@mail.gmail.com> On Sat, Aug 16, 2008 at 7:22 PM, Bart Massey wrote: > A cleaner but still fully lazy version of intersperse might be > > intersperse sep (x1 : x2 : xs) = > x1 : sep : intersperse sep (x2 : xs) > intersperse _ l = l Doesn't that fail on (x:_|_) ? Also it relies on constructor specialisation to be efficient. > (I'd quote to give some context, but gmane says I can't unless I artificially > pad the length of my new text. Is there any way to post to this list other than > gmane?) Subscribe? http://www.haskell.org/mailman/listinfo/libraries From nominolo at googlemail.com Sun Aug 17 05:00:52 2008 From: nominolo at googlemail.com (Thomas Schilling) Date: Sun Aug 17 04:59:57 2008 Subject: laziness of intersperse In-Reply-To: <200808170743.m7H7hkm5025378@wezen.cs.pdx.edu> References: <20080816161321.GA21525@craft> <916b84820808161642i538ffe1dp651c9a902b6e54d7@mail.gmail.com> <200808170743.m7H7hkm5025378@wezen.cs.pdx.edu> Message-ID: <916b84820808170200k7186bb53w864866cbfa82f212@mail.gmail.com> On Sun, Aug 17, 2008 at 9:43 AM, Barton C Massey wrote: > Maybe something like > > intersperse _ [] = [] > intersperse sep (x : xs) = > x : concatMap ((sep :) . 
(: [])) xs > > is clearer (maybe), but I doubt it's as efficient (although > I haven't checked). concatMap is a bit tricky when it comes to performance. I also think the low-level loop implementation that duncan showed is rather readable. > Thanks much for the corrections! (I hope it's OK to do this on-list.) Of course. Nobody gets it right the first time. (Well, maybe Oleg. But then again we have some doubts whether he is from this planet or, if he is, perhaps from the future.) From dons at galois.com Mon Aug 18 13:12:01 2008 From: dons at galois.com (Don Stewart) Date: Mon Aug 18 13:11:00 2008 Subject: [Haskell] ANN: witness 0.1, open-witness 0.1, "Witnesses and Open Witnesses" In-Reply-To: References: Message-ID: <20080818171201.GB30959@scytale.galois.com> ashley: > witness 0.1 > A witness is a value that witnesses some sort of constraint on some list > of type variables. This library provides support for simple witnesses, > that constrain a type variable to a single type, and equality witnesses, > that constrain two type variables to be the same type. The library also > provides classes for representatives, which are values that represent types. snip > open-witness 0.1 > Open witnesses (type IOWitness) are simple witnesses that can witness to > any type. However, they cannot be constructed, they can only be > generated in certain monads: snip > Hackage: > > Source: You can find these packages for your local Arch Linux distribution, http://aur.archlinux.org/packages.php?ID=19194 http://aur.archlinux.org/packages.php?ID=19195 Come on Debian! :) -- Don From dave at zednenem.com Mon Aug 18 19:32:57 2008 From: dave at zednenem.com (David Menendez) Date: Mon Aug 18 19:31:57 2008 Subject: Proposal: Reserved module namespace for packages on Hackage Message-ID: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> In the interests of reducing module name collisions, I suggest reserving part of the module name space for individual packages on Hackage. 
Specifically, I'm suggesting that a new top-level module name, "Lib", be added to the module naming conventions, and that the children of "Lib" be reserved for the Hackage package with the same name. That is, "Lib.Foo" and "Lib.Foo.*" would be reserved for the package "Foo" on Hackage. This would not require packages to *use* this namespace. However, packages that do use it would have a greatly reduced chance of conflicting with other packages. Implementation costs are minor. At most, we might want some code in Hackage to prevent packages from using module names reserved for other packages. At the least, all we need to do is add "Lib" to the list of allowable top-level module names. Developers who object to giving the provenance of a module in its name are free to take their chances with the rest of the module hierarchy. Mapping package names to module names is mostly straightforward. According to the Cabal documentation, a package name consists of one or more alphanumeric words separated by hyphens, where each word contains at least one letter. Since hyphens aren't allowed in module names, they would get mapped to underscores, which are not allowed in package names. Thus, "Lib.Foo_Bar" would be reserved for package "Foo-Bar". It's less obvious what to do with packages whose names start with lower-case letters or digits. I see three possible solutions: (a) Do not reserve module names for these packages. (b) Map these package names to module names in a way that avoids conflicts, e.g., prefixing the package name with "P'", which cannot occur in a package name. That is, package "foo" would get "Lib.P'foo". (c) Change the rules for package names on Hackage by disallowing package names which start with a digit or which differ from an existing package only in the case of the first letter, and reserve module names based on capitalized package names. 
That is, package "foo" would get "Lib.Foo", and Hackage would not accept a new package "Foo" if there was a preexisting "foo", and vice versa. My preference is for (c). In fact, I might go further and forbid any package whose name differs only in case from an existing package in Hackage. Further thoughts: (1) I chose "Lib" because it's short and, so far as I know, unused. "Hackage" might be a better choice, since the scheme depends on Hackage to prevent name collisions. (2) It was surprisingly difficult to find out the rules for valid package naming. None of the tutorials I found discussed choosing a valid name. The GHC documentation mentions that package names must have a specific form, but I couldn't find any description of it. (3) I did not find a definition of "alphanumeric" in the Cabal documentation. Does this include non-ASCII characters? (4) We could also reserve additional module names corresponding to specific versions of packages, e.g., "Foo-1.0" might get "Lib.Foo_1_0". This does not create ambiguity, because "Foo-1-0" is not a valid package name. From isaacdupree at charter.net Mon Aug 18 20:03:41 2008 From: isaacdupree at charter.net (Isaac Dupree) Date: Mon Aug 18 20:02:37 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> Message-ID: <48AA0DDD.7030203@charter.net> David Menendez wrote: > Implementation costs are minor. There is a serious cost: Sometimes another package is *supposed to* provide the same interface, including the same module names (e.g. forks or reimplementations. e.g. SOE). If Hackage rejected them, we would have a serious problem once people started depending on any package using a Lib. name. But it's not hard to pretty much avoid conflicts; you don't even need the Lib. prefix, you can just use the package name as your top-level module name. (right? 
or does hackage arbitrarily reject some module names?) -Isaac From s.clover at gmail.com Mon Aug 18 20:07:07 2008 From: s.clover at gmail.com (Sterling Clover) Date: Mon Aug 18 20:06:11 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> Message-ID: I tend to think this is a really bad idea. Although things get messy and there are plenty of corner cases, it seems to me the current system, haphazard as it is, is closer to the "right way." If, e.g., I want a Maybe transformer, I want to import it from Control.Monad.MaybeT, not from Lib.MaybeT. That way I can sort my imports sanely and see all my Control things in one place, no matter their provenance, all my data structures in another, be they from collections or bloom filters from hackage, etc. The other problem is that either everything eventually goes under lib, which creates the same problem again, or there is an implicit set of exceptions for things which, although not part of the official libraries (which we're trying to reduce, remember) are obviously too "standard" for lib (e.g., HTTP, and such). The problem here is that maybe this doesn't scale, since it requires hackage contributors to think about the package namespace as a whole, and some vigilance in that regard, the need to mark packages as deprecated properly, etc. But on the other hand, arbitrary namespacing leads to fragmentation, with everyone reimplementing things under their own hierarchy, and encouraging uses of standard(ish) namespaces also contributes to a mindset where people will pare down packages into lots of little reusable conceptual units that only do one thing well.
The problem -- duplication of functionality and fragmentation -- is a real one, but dealing with it through throwing namespacing to the wind won't solve the underlying issues, which I think need to be addressed through the Haskell community guiding the direction of various efforts, and not through an artificial measure that makes fragmentation less immediately painful while doing nothing to mitigate the long term consequences. --S
> _______________________________________________ > Libraries mailing list > Libraries@haskell.org > http://www.haskell.org/mailman/listinfo/libraries From dave at zednenem.com Mon Aug 18 20:48:12 2008 From: dave at zednenem.com (David Menendez) Date: Mon Aug 18 20:47:12 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: <48AA0DDD.7030203@charter.net> References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> <48AA0DDD.7030203@charter.net> Message-ID: <49a77b7a0808181748o50d971a7w6e1eaf8c49e5d84e@mail.gmail.com> On Mon, Aug 18, 2008 at 8:03 PM, Isaac Dupree wrote: > David Menendez wrote: >> >> Implementation costs are minor. > > There is a serious cost: Sometimes another package is *supposed to* provide > the same interface, including the same module names (e.g. forks or > reimplementations. e.g. SOE). If Hackage rejected them, we would have a > serious problem once people started depending on any package using a Lib. > name. Would we? How many packages out there are drop-in replacements? Even things like Data.List.Stream, which is a drop-in replacement for Data.List, uses a different module name. The packages I've seen that abstract over other packages tend to use preprocessor commands to get the right modules. I can see your point about forks. That's one case where it might be better to use the same module names as a different package. But I'm leery of relying on two modules with the same name having the same interface. The ideal solution would be something like the package mounting proposal, but that has a major implementation cost. This is more of a stop-gap measure that could be implemented today. > But it's not hard to pretty much avoid conflicts; you don't even need the > Lib. prefix, you can just use the package name as your top-level module > name. (right? or does hackage arbitrarily reject some module names?) 
As I understand it, Hackage complains if you use a top-level name that isn't on the approved list. Putting the package name at the top-level is also a possibility, but putting it one level down is more future-proof. Really, all my proposal needs is to add "Lib" to the list of acceptable top-level names, and to have some document on the web explain what it's for. -- Dave Menendez From dave at zednenem.com Mon Aug 18 20:58:14 2008 From: dave at zednenem.com (David Menendez) Date: Mon Aug 18 20:57:14 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> Message-ID: <49a77b7a0808181758x4cdd873dnb8787aa8d4fc51b1@mail.gmail.com> On Mon, Aug 18, 2008 at 8:07 PM, Sterling Clover wrote: > I tend to think this is a really bad idea. Although things get messy and > there are plenty of corner cases, it seems to me the current system, > haphazard as it is, is closer to the "right way." If, e.g., I want a Maybe > transformer, I want to import it from Control.Monad.MaybeT, not from > Lib.MaybeT. That way I can sort my imports sanely and see all my Control > things in one place, no matter their provenance, all my data structures in > another, be they from collections or bloom filters from hackage, etc. Unless you're mechanically sorting your module imports, I don't see how the Lib names would prevent that. As far as Haskell is concerned, module names are entirely arbitrary. > The other problem is that either everything eventually goes under lib, which > creates the same problem again, or there is an implicit set of exceptions > for things which, although not part of the official libraries (which we're > trying to reduce, remember) are obviously too "standard" for lib (e.g., > HTTP, and such). How does putting everything under Lib create the same problem again? 
Hackage already forbids multiple packages from having the same name, so the reserved names for each package would be disjoint. -- Dave Menendez From duncan.coutts at worc.ox.ac.uk Mon Aug 18 21:24:26 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Mon Aug 18 21:48:33 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> Message-ID: <1219109066.13639.239.camel@localhost> On Mon, 2008-08-18 at 19:32 -0400, David Menendez wrote: > In the interests of reducing module name collisions, I suggest > reserving part of the module name space for individual packages on > Hackage. Specifically, I'm suggesting that a new top-level module > name, "Lib", be added to the module naming conventions, and that the > children of "Lib" be reserved for the Hackage package with the same > name. That is, "Lib.Foo" and "Lib.Foo.*" would be reserved for the > package "Foo" on Hackage. Note that this is entirely contrary to the existing (and well established) convention of naming according to the purpose / content of the module rather than the name of the implementation. What I mean is, it's a significant change. I'll throw in my opinion too. :-) I don't think it's necessary. The existing recommendations on naming mean we already don't get too many clashes, eg we get Database.HDBC and Database.HSQL. Even when names do clash they're typically implementations of similar things and how many packages need both at once? It's more common to pick one implementation of some functionality. It would certainly be interesting to make a service on hackage to work out what packages do have clashing names so that maintainers can work out with each other how to resolve things. For example suppose we have two packages implementing Text.PrettyPrint then we'd ask both to use Text.PrettyPrint.ImplName.
If we allowed overlap in the modules exported by the packages in use then both can still export Text.PrettyPrint that just re-exports Text.PrettyPrint.ImplName. That way one can pick and no existing code breaks. So far in practice it seems that overlap is a pretty minor problem and could easily be resolved in most instances with just a little communication. It's not obvious that we need something much more heavyweight. If we really do need more then package-qualified imports is probably a better approach than a big change in module naming conventions. Duncan From dave at zednenem.com Mon Aug 18 23:48:28 2008 From: dave at zednenem.com (David Menendez) Date: Mon Aug 18 23:47:27 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: <1219109066.13639.239.camel@localhost> References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> <1219109066.13639.239.camel@localhost> Message-ID: <49a77b7a0808182048v2996df24veb9d965a03a5a361@mail.gmail.com> On Mon, Aug 18, 2008 at 9:24 PM, Duncan Coutts wrote: > On Mon, 2008-08-18 at 19:32 -0400, David Menendez wrote: >> In the interests of reducing module name collisions, I suggest >> reserving part of the module name space for individual packages on >> Hackage. Specifically, I'm suggesting that a new top-level module >> name, "Lib", be added to the module naming conventions, and that the >> children of "Lib" be reserved for the Hackage package with the same >> name. That is, "Lib.Foo" and "Lib.Foo.*" would be reserved for the >> package "Foo" on Hackage. > > Note that this is entirely contrary to the existing (and well > established) convention of naming according to the purpose / content of > the module rather than the name of the implementation. > > What I mean is, it's a significant change. Is it? Look at the XML category at Hackage.

formlets - no common prefix
generic-xml - all modules prefixed with Xml
HaXml - every module is prefixed with Text.XML.HaXml
hexpat - both modules are prefixed with Text.XML.Expat
HXQ - one module, prefixed with Text.XML.HXQ
hxt - 95 of 113 modules are prefixed with Text.XML.HXT
libxml - all modules prefixed with Text.XML.LibXML
tagsoup - 7 of 8 modules prefixed with Text.HTML.TagSoup
xml - all modules prefixed with Text.XML.Light

Selecting things semi-randomly from the parser category, I see:

attoparsec - all modules prefixed with Data.ParserCombinators.Attoparsec
binary - all modules prefixed with Data.Binary
binary-strict - all modules prefixed with Data.Binary.Strict
bytestringparser - all modules prefixed with Data.ParserCombinators.Attoparsec
PArrows - all modules prefixed with Text.ParserCombinators.PArrow
Parsec - all modules prefixed with Text.ParserCombinators.Parsec
parsely - all modules prefixed with Text.ParserCombinators.Parsely
polyparse - no common prefix
uulib - all modules prefixed with UU

To me, it looks like a common pattern is to give most or all of the modules in a package a common prefix consisting of a general classification and the package name (or a close variant). All I'm suggesting is to give library authors the option to drop the classification part. Trying to create a collaborative, hierarchical classification system is a sucker's game. That's why Hackage itself uses tags. -- Dave Menendez From jpm at cs.uu.nl Tue Aug 19 05:37:07 2008 From: jpm at cs.uu.nl (=?ISO-8859-1?Q?Jos=E9_Pedro_Magalh=E3es?=) Date: Tue Aug 19 05:36:06 2008 Subject: cabal-install calls setup with unrecognised distpref flag In-Reply-To: <52f14b210808060238s40d14c91pb124ca820955c199@mail.gmail.com> References: <52f14b210808060238s40d14c91pb124ca820955c199@mail.gmail.com> Message-ID: <52f14b210808190237v7874b51bxa53e7976c6bce0ac@mail.gmail.com> Hello all, I decided to bring this to libraries@haskell.org too in case anyone had the same problem before.
Cheers, Pedro On Wed, Aug 6, 2008 at 11:38, José Pedro Magalhães wrote: > Hello all, > > Recently, while developing a package here in the group, we changed the > build-type from "Simple" to "Custom" to allow for "cabal test". The Setup.hs > simply calls defaultMainWithHooks with simpleUserHooks where runTests has > been redefined. > > The problem is that this change gives me the following error when invoking > "cabal build" (after invoking "cabal configure"): > setup.exe: Unrecognised flags: > --distpref=dist > > runhaskell Setup.hs build works fine, though. This happens on a win64 > machine, cabal-install version 0.5.1 using version 1.4.0.1 of the library. > On a mac there seem to be no problems, though. > > Am I doing something wrong? > > > Thanks, > Pedro > From duncan.coutts at worc.ox.ac.uk Tue Aug 19 06:40:28 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Tue Aug 19 06:39:27 2008 Subject: cabal-install calls setup with unrecognised distpref flag In-Reply-To: <52f14b210808190237v7874b51bxa53e7976c6bce0ac@mail.gmail.com> References: <52f14b210808060238s40d14c91pb124ca820955c199@mail.gmail.com> <52f14b210808190237v7874b51bxa53e7976c6bce0ac@mail.gmail.com> Message-ID: <1219142428.13639.257.camel@localhost> On Tue, 2008-08-19 at 11:37 +0200, José Pedro Magalhães wrote: > Hello all, > > I decided to bring this to libraries@haskell.org too in case anyone > had the same problem before. It'll be a bug in cabal-install. It's passing a flag to a Setup.hs that was built with an older version of the Cabal lib that does not understand a new flag. What's not obvious to me is why it's passing the flag at all. I'd expect it only to do that if you passed --distpref when you invoke cabal-install. I've filed a ticket so we do not forget. If you have any more details, add them to the ticket.
Actually, could you run the failing command with -v3 and show us the output (including the full command used). I'm interested in what value of distpref is being passed as that might give us a clue as to where it is coming from. Thanks. Duncan From duncan.coutts at worc.ox.ac.uk Tue Aug 19 06:41:56 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Tue Aug 19 06:40:57 2008 Subject: cabal-install calls setup with unrecognised distpref flag In-Reply-To: <1219142428.13639.257.camel@localhost> References: <52f14b210808060238s40d14c91pb124ca820955c199@mail.gmail.com> <52f14b210808190237v7874b51bxa53e7976c6bce0ac@mail.gmail.com> <1219142428.13639.257.camel@localhost> Message-ID: <1219142516.13639.259.camel@localhost> On Tue, 2008-08-19 at 11:40 +0100, Duncan Coutts wrote: > On Tue, 2008-08-19 at 11:37 +0200, José Pedro Magalhães wrote: > > Hello all, > > > > I decided to bring this to libraries@haskell.org too in case anyone > > had the same problem before. > > It'll be a bug in cabal-install. It's passing a flag to a Setup.hs that > was built with an older version of the Cabal lib that does not > understand a new flag. What's not obvious to me is why it's passing the > flag at all. I'd expect it only to do that if you passed --distpref when > you invoke cabal-install. > > I've filed a ticket so we do not forget. If you have any more details, > add them to the ticket. Sorry, forgot to mention: http://hackage.haskell.org/trac/hackage/ticket/328 If you don't have an account see the front page for how to register or login with the guest account: http://hackage.haskell.org/trac/hackage/ Duncan From iavor.diatchki at gmail.com Tue Aug 19 12:20:49 2008 From: iavor.diatchki at gmail.com (Iavor Diatchki) Date: Tue Aug 19 12:19:47 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: <49a77b7a0808182048v2996df24veb9d965a03a5a361@mail.gmail.com> References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> <1219109066.13639.239.camel@localhost> <49a77b7a0808182048v2996df24veb9d965a03a5a361@mail.gmail.com> Message-ID: <5ab17e790808190920s783c544bv7720051e8c948a8c@mail.gmail.com> Hello, I also don't think that we need to prefix everything with Lib. However, I also do not like the current style of naming library packages, where a single package can sprinkle modules all over the hierarchy because: - It makes it hard to figure out where modules come from (e.g., when I see an import in the source code, it is hard to tell what library it came from), - The reverse problem also holds---when you look at the docs, it is hard to tell which modules are provided by a given package, - It discourages diversity (which some people may say is a good thing :-). What I mean is that there is a kind of "land rush" to stake out the good names in the hierarchy (I know that multiple packages can provide the same module, but it is still a pain, especially if you want _some_ modules from two conflicting packages).
- I don't think the system scales that well. For example, if I were to create a package that draws graphs, should I put it under Data.Graph, and hope that no one uses both it and the graph modules? And does that mean that to pick names for my modules I have to know all the modules in all libraries out there? - There are much better ways to classify modules by their purposes than the single hierarchy imposed by the module name space (think labels, tags, categories, keywords, all the usual ways people use on the internet to classify things). I think that it is a much better idea to use the package name as the top-level module name space, as we have already put some effort in ensuring that these are more or less unique. -Iavor From allbery at ece.cmu.edu Tue Aug 19 13:53:53 2008 From: allbery at ece.cmu.edu (Brandon S.
Allbery KF8NH) Date: Tue Aug 19 13:52:55 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: <5ab17e790808190920s783c544bv7720051e8c948a8c@mail.gmail.com> References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> <1219109066.13639.239.camel@localhost> <49a77b7a0808182048v2996df24veb9d965a03a5a361@mail.gmail.com> <5ab17e790808190920s783c544bv7720051e8c948a8c@mail.gmail.com> Message-ID: <48754C50-0F50-466D-B400-431EA2CABE54@ece.cmu.edu> On 2008 Aug 19, at 12:20, Iavor Diatchki wrote: > I think that it is a much better idea to use the package name as the > top-level module name space, as we have already put some effort in > ensuring that these are more or less unique. May I suggest the Alexandrian solution? Module aliases. "alias Foo-package.Data.HashSet as Data.HashSet". -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH From ndmitchell at gmail.com Wed Aug 20 03:49:01 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Wed Aug 20 03:47:56 2008 Subject: Cabal question: Generated data files Message-ID: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> Hi, First off, this list seems a bit off-topic for a question about Cabal, but it's what the web page told me to do: http://haskell.org/cabal/ - if this isn't the place for this email, could someone update the web page? The question is about generated data files, specifically for Hoogle. Hoogle generates search databases, which I want to be installed as read-only data, in datadir. For a release I want to bundle up these databases in the tarball, pregenerated. I don't want to include these databases in the darcs version (they are fairly specific to a particular version and regularly changing), but I do want people to be able to build the darcs version using cabal.
To generate these databases requires a second Haskell program (included in the darcs repo), and an installed copy of the same version of Hoogle. So, I want to generate database files, which can only be done after install time, but which I then want to be installed. Is there any way to achieve this? Or some better idea? Currently I just generate the databases and have the .cabal file reference them, which means the darcs version can't be compiled with Cabal. Thanks Neil From lemming at henning-thielemann.de Wed Aug 20 04:21:45 2008 From: lemming at henning-thielemann.de (Henning Thielemann) Date: Wed Aug 20 04:20:47 2008 Subject: Proposal: Reserved module namespace for packages on Hackage In-Reply-To: <48754C50-0F50-466D-B400-431EA2CABE54@ece.cmu.edu> References: <49a77b7a0808181632t146996caj39555ca2984b6c00@mail.gmail.com> <1219109066.13639.239.camel@localhost> <49a77b7a0808182048v2996df24veb9d965a03a5a361@mail.gmail.com> <5ab17e790808190920s783c544bv7720051e8c948a8c@mail.gmail.com> <48754C50-0F50-466D-B400-431EA2CABE54@ece.cmu.edu> Message-ID: On Tue, 19 Aug 2008, Brandon S. Allbery KF8NH wrote: > On 2008 Aug 19, at 12:20, Iavor Diatchki wrote: >> I think that it is a much better idea to use the package name as the >> top-level module name space, as we have already put some effort in >> ensuring that these are more or less unique. > > > May I suggest the Alexandrian solution? Module aliases. "alias > Foo-package.Data.HashSet as Data.HashSet". Or what about using Lib top-level for new libraries written by only a few authors and used by only a few users? When it becomes clear that many people need it or there are multiple packages for the same purpose, they can start a joint effort to create "the real thing" in the existing module hierarchy. This way 'Lib' would be the sand-box and 'Data' and friends are for the "standards".
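
As a rough illustration of the "Lib" naming scheme discussed in this thread, the sketch below maps a Hackage package name to the module prefix that would be reserved for it. It assumes option (c) from David Menendez's proposal (no reserved prefix for names starting with a digit, first letter capitalized), maps hyphens to underscores, and treats "alphanumeric" as ASCII-only, a point the thread leaves open. The names splitOnHyphen, validWord, validPackageName and reservedModuleName are hypothetical and are not part of Cabal or Hackage.

```haskell
import Data.Char (isAlpha, isAlphaNum, isAscii, isDigit, toUpper)

-- | Split a string on hyphens, keeping empty pieces so that names such as
-- "foo--bar" or "-foo" fail the validity check below.
splitOnHyphen :: String -> [String]
splitOnHyphen s = case break (== '-') s of
  (w, [])       -> [w]
  (w, _ : rest) -> w : splitOnHyphen rest

-- | A word is valid if it is alphanumeric and contains at least one letter.
-- Restricting to ASCII is an assumption made for this sketch.
validWord :: String -> Bool
validWord w = all isAsciiAlphaNum w && any isAlpha w
  where
    isAsciiAlphaNum c = isAscii c && isAlphaNum c

-- | One or more valid words separated by single hyphens.
validPackageName :: String -> Bool
validPackageName = all validWord . splitOnHyphen

-- | Reserved module prefix under option (c): reject names that start with a
-- digit, capitalize the first letter, and map hyphens to underscores.
reservedModuleName :: String -> Maybe String
reservedModuleName pkg
  | not (validPackageName pkg) = Nothing
  | otherwise = case pkg of
      (c : cs)
        | isDigit c -> Nothing
        | otherwise -> Just ("Lib." ++ toUpper c : map dash cs)
      []            -> Nothing
  where
    dash '-' = '_'
    dash x   = x
```

Under these assumptions, reservedModuleName "Foo-Bar" gives Just "Lib.Foo_Bar" and reservedModuleName "foo" gives Just "Lib.Foo", while "Foo-1-0" is rejected because the words "1" and "0" contain no letter.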
From alfonso.acosta at gmail.com Wed Aug 20 08:15:01 2008 From: alfonso.acosta at gmail.com (Alfonso Acosta) Date: Wed Aug 20 08:13:56 2008 Subject: Cabal question: Generated data files In-Reply-To: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> References: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> Message-ID: <6a7c66fc0808200515s52a285d1i2075d1ab4ef6e379@mail.gmail.com> Hi Neil, Probably you've already considered this option but, How about a cabal build-posthook a) detects if the database-generating program is present (I asume it's only present in the darcs version) b) if present, generates the databases and puts them in the location supplied in the cabal file. I'm not aware though, if configure would complain about the databases not being present at first instance. On Wed, Aug 20, 2008 at 9:49 AM, Neil Mitchell wrote: > Hi, > > First off, this list seems a bit off-topic for a question about Cabal, > but its what the web page told me to do: http://haskell.org/cabal/ - > if this isn't the place for this email, could someone update the web > page? > > The question is about generated data files, specifically for Hoogle. > > Hoogle generates search databases, which I want to be installed as > read-only data, in datadir. For a release I want to bundle up these > databases in the tarball, pregenerated. I don't want to include these > databases in the darcs version (they are fairly specific to a > particular version and regularly changing), but I do want people to be > able to build the darcs version using cabal. To generate these > databases requires a second Haskell program (included in the darcs > repo), and an installed copy of the same version of Hoogle. > > So, I want to generate database files, which can only be done after > install time, but which I then want to be installed. Is there any way > to acheive this? Or some better idea? 
Currently I just generate the > databases and have the .cabal file reference them, which means the > darcs version can't be compiled with Cabal. > > Thanks > > Neil > _______________________________________________ > Libraries mailing list > Libraries@haskell.org > http://www.haskell.org/mailman/listinfo/libraries > From jpm at cs.uu.nl Wed Aug 20 08:23:41 2008 From: jpm at cs.uu.nl (José Pedro Magalhães) Date: Wed Aug 20 08:22:35 2008 Subject: [Hs-Generics] Developing SYB, packaging issues In-Reply-To: <20080805174009.GA1677@matrix.chaos.earth.li> References: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> <3c6288ab0808050746p4b5bd85aqe2639318ef3a5585@mail.gmail.com> <20080805174009.GA1677@matrix.chaos.earth.li> Message-ID: <52f14b210808200523h37d5ae7em2cc5bf8a79df2b35@mail.gmail.com> Hello, I understand that with the proposed base package breakup [1], SYB will be moved to a separate package. But I still don't know how this will reflect on the development of the library. In particular: 1) Where is the source code going to be hosted? Here in Utrecht we currently have a repository with several (cabalized) generic programming libraries, SYB included. But maybe SYB will stay in the same repository as GHC? 2) Can development proceed independently of GHC, i.e. can a new version of SYB be released without a new version of GHC? 3) How does the separation affect the automatic instance deriving mechanism? Thanks, Pedro [1] http://hackage.haskell.org/trac/ghc/ticket/1338 On Tue, Aug 5, 2008 at 19:40, Ian Lynagh wrote: > On Tue, Aug 05, 2008 at 04:46:50PM +0200, Sean Leather wrote: > > > > I think SYB should be extracted from 'base' into a package. > > I'll be sending a message about this soon. > > > Thanks > Ian > >
From ndmitchell at gmail.com Wed Aug 20 08:35:51 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Wed Aug 20 08:34:45 2008 Subject: Cabal question: Generated data files In-Reply-To: <6a7c66fc0808200515s52a285d1i2075d1ab4ef6e379@mail.gmail.com> References: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> <6a7c66fc0808200515s52a285d1i2075d1ab4ef6e379@mail.gmail.com> Message-ID: <404396ef0808200535l107bc53ci9707aeedd3a91cd2@mail.gmail.com> Hi Alfonso, > Probably you've already considered this option, but how about a cabal > build-posthook I considered lots of options with hooks at loads of different points, but couldn't figure out what the right point is :-) > a) detects if the database-generating program is present (I assume it's > only present in the darcs version) That's easy enough to do. The other alternative is to detect if the databases I want to install aren't present, and generate them only then. > b) if present, generates the databases and puts them in the location > supplied in the cabal file. > I'm not sure, though, whether configure would complain about the databases > not being present in the first place. That is also what I wondered. Also, as the hoogle binary is required to generate databases (it is called by the database generation program), I will have to find the hoogle binary at this point. Unfortunately it won't have been installed in $PATH, and I'm not sure if Cabal guarantees where it has put the binary. I'll probably try implementing something today.
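The post-build hook idea discussed in this thread could look roughly like the following. This is only a minimal sketch, assuming a package with "build-type: Custom": the generator path, its command-line arguments and the output directory are hypothetical names invented for illustration, and nothing here is taken from Hoogle's actual build. Only defaultMainWithHooks, simpleUserHooks and the postBuild field are meant to be Cabal's own names from Distribution.Simple.

```haskell
-- Setup.hs: a sketch of a post-build hook, not Hoogle's real setup script.
import Control.Monad (when)
import Distribution.Simple (UserHooks (..), defaultMainWithHooks, simpleUserHooks)
import System.Directory (doesFileExist)
import System.Process (callProcess)

-- Hypothetical location of the database generator. The thread only says it
-- lives in the darcs repo; this path is an assumption.
generatorPath :: FilePath
generatorPath = "dist/build/gendb/gendb"

main :: IO ()
main = defaultMainWithHooks simpleUserHooks
  { -- Runs after the normal build step has finished.
    postBuild = \_args _flags _pkgDescr _localBuildInfo -> do
      present <- doesFileExist generatorPath
      -- Only generate databases when the generator exists, i.e. when
      -- building from the repository rather than from a release tarball.
      -- The "--output" flag is likewise hypothetical.
      when present $ callProcess generatorPath ["--output", "database"]
  }
```

Whether configure accepts data files that do not exist yet is exactly the open question raised above, so the sketch does not settle it.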
Thanks Neil From alfonso.acosta at gmail.com Wed Aug 20 08:41:53 2008 From: alfonso.acosta at gmail.com (Alfonso Acosta) Date: Wed Aug 20 08:40:47 2008 Subject: Cabal question: Generated data files In-Reply-To: <404396ef0808200535l107bc53ci9707aeedd3a91cd2@mail.gmail.com> References: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> <6a7c66fc0808200515s52a285d1i2075d1ab4ef6e379@mail.gmail.com> <404396ef0808200535l107bc53ci9707aeedd3a91cd2@mail.gmail.com> Message-ID: <6a7c66fc0808200541r3b532613u4347fa8b3761012d@mail.gmail.com> On Wed, Aug 20, 2008 at 2:35 PM, Neil Mitchell wrote: > That is also what I wondered. Also, as the hoogle binary is required > to generate databases (it is called by the database generation > program), I will have to find the hoogle binary at this point. > Unfortunately it won't have been installed in $PATH, and I'm not sure > if Cabal guarantees where it has put the binary. Well, I guess that, If you create a Cabal posthook as I suggested, you could just use the inplace hoogle binary under dist/ (assuming, of course, you don't need any configuration value from Paths_Hoogle to generate the databases). From alfonso.acosta at gmail.com Wed Aug 20 09:04:09 2008 From: alfonso.acosta at gmail.com (Alfonso Acosta) Date: Wed Aug 20 09:03:03 2008 Subject: Cabal question: Generated data files In-Reply-To: <404396ef0808200535l107bc53ci9707aeedd3a91cd2@mail.gmail.com> References: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> <6a7c66fc0808200515s52a285d1i2075d1ab4ef6e379@mail.gmail.com> <404396ef0808200535l107bc53ci9707aeedd3a91cd2@mail.gmail.com> Message-ID: <6a7c66fc0808200604k1a32b78eu5ec3c72a86656c1e@mail.gmail.com> On Wed, Aug 20, 2008 at 2:35 PM, Neil Mitchell wrote: > The other alternative is to detect if the > databases I want to install aren't present, and generate them only > then. That alternative has a big disadvantage. 
People pulling patches which change the database format won't rebuild the databases (they were already there from a previous installation) and will end up having a broken installation. From ndmitchell at gmail.com Wed Aug 20 09:23:06 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Wed Aug 20 09:22:00 2008 Subject: Cabal question: Generated data files In-Reply-To: <6a7c66fc0808200604k1a32b78eu5ec3c72a86656c1e@mail.gmail.com> References: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> <6a7c66fc0808200515s52a285d1i2075d1ab4ef6e379@mail.gmail.com> <404396ef0808200535l107bc53ci9707aeedd3a91cd2@mail.gmail.com> <6a7c66fc0808200604k1a32b78eu5ec3c72a86656c1e@mail.gmail.com> Message-ID: <404396ef0808200623t2605c376m135166fecd79e25d@mail.gmail.com> Hi Alfonso, > That alternative has a big disadvantage. People pulling patches which > change the database format won't rebuild the databases (they were > already there from a previous installation) and will end up having a > broken installation. I realise that, I'm hoping to do it like a "make" based thing, to some degree. I'm just trying to work around a Cabal bug (http://hackage.haskell.org/trac/hackage/ticket/329), then I can make the databases regenerate if the .exe is newer.
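The "make"-style check mentioned here (regenerate only if the binary is newer than the databases) can be sketched with modification times. This is an illustration under stated assumptions, not what Hoogle does: needsRebuild and regenerateIfStale are hypothetical helper names, and the regeneration step is passed in by the caller. The only library calls used are doesFileExist and getModificationTime from System.Directory.

```haskell
import Control.Monad (when)
import System.Directory (doesFileExist, getModificationTime)

-- | True when the target is missing or older than the source,
-- which is the rule "make" applies to a dependency.
needsRebuild :: FilePath -> FilePath -> IO Bool
needsRebuild source target = do
  targetExists <- doesFileExist target
  if not targetExists
    then return True
    else do
      sourceTime <- getModificationTime source
      targetTime <- getModificationTime target
      return (sourceTime > targetTime)

-- | Run the caller-supplied regeneration action only when the database
-- is stale with respect to the binary that produces it.
regenerateIfStale :: FilePath -> FilePath -> IO () -> IO ()
regenerateIfStale binary database regenerate = do
  stale <- needsRebuild binary database
  when stale regenerate
```

A timestamp comparison does not notice a format change that leaves the binary's timestamp older than the databases, so it addresses the concern above only when the binary is actually rebuilt.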
Thanks Neil From ndmitchell at gmail.com Wed Aug 20 10:16:19 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Wed Aug 20 10:15:13 2008 Subject: Cabal question: Generated data files In-Reply-To: <404396ef0808200623t2605c376m135166fecd79e25d@mail.gmail.com> References: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> <6a7c66fc0808200515s52a285d1i2075d1ab4ef6e379@mail.gmail.com> <404396ef0808200535l107bc53ci9707aeedd3a91cd2@mail.gmail.com> <6a7c66fc0808200604k1a32b78eu5ec3c72a86656c1e@mail.gmail.com> <404396ef0808200623t2605c376m135166fecd79e25d@mail.gmail.com> Message-ID: <404396ef0808200716v2f9f4e9bl38dbdca7d56af9df@mail.gmail.com> Hi > I realise that, I'm hoping to do it like a "make" based thing, to some > degree. I'm just trying to work around a Cabal bug > (http://hackage.haskell.org/trac/hackage/ticket/329), then I can make > the databases regenerate if the .exe is newer. It appears it's not very easy to work around that Cabal bug, and Cabal HEAD won't work with cabal-install HEAD, so I'm stuck. If I do generate databases, it will have to be unconditionally, which seems like an awful waste of time even if nothing has changed. Generating databases takes around a minute. Thanks Neil From bart at cs.pdx.edu Wed Aug 20 14:17:20 2008 From: bart at cs.pdx.edu (Bart Massey) Date: Wed Aug 20 14:16:27 2008 Subject: Cabal question: Generated data files References: <404396ef0808200049i46ec8e83sdafdd8f24a523431@mail.gmail.com> <6a7c66fc0808200515s52a285d1i2075d1ab4ef6e379@mail.gmail.com> <404396ef0808200535l107bc53ci9707aeedd3a91cd2@mail.gmail.com> <6a7c66fc0808200604k1a32b78eu5ec3c72a86656c1e@mail.gmail.com> <404396ef0808200623t2605c376m135166fecd79e25d@mail.gmail.com> <404396ef0808200716v2f9f4e9bl38dbdca7d56af9df@mail.gmail.com> Message-ID: Neil Mitchell <ndmitchell at gmail.com> writes: > If I do > generate databases, it will have to be unconditionally, which seems > like an awful waste of time even if nothing has changed.
Generating > databases takes around a minute. Have your database generator do the check itself by inserting some version marker for the databases, or whatever? If the databases exist and are compatible, just touch them instead of regenerating them. Ugly, but should work. Bart Massey bart@cs.pdx.edu From claus.reinke at talk21.com Thu Aug 21 04:10:27 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Thu Aug 21 04:09:27 2008 Subject: [Hs-Generics] Re: Syb Renovations? Issues with Data.Generics References: <018401c8f0dd$98126480$3d058351@cr3lt><20080728222302.GA24020@matrix.chaos.earth.li><027a01c8f1b1$1d1cef00$71338351@cr3lt><20080729195944.GA15169@matrix.chaos.earth.li><026f01c8f299$9f511b70$36168351@cr3lt><20080731010538.GA20317@matrix.chaos.earth.li> <006101c8f2f9$3c39ab50$0b1b7ad5@cr3lt> Message-ID: <009a01c90365$60eb1890$12168351@cr3lt> Some of you already know, but it seems I forgot to mention this here - my code has moved to a darcs repo, with a little bit of documentation and a README summarizing the issues. See my toolbox for more info: http://www.cs.kent.ac.uk/~cr3/toolbox/haskell/#syb-utils Neil: It turned out to be tricky to recognize nested types at the Data/Typeable level, let alone nested types that really have an infinite set of potential substructure types (which are the ones that break the PlateData optimization). Instead, I just count nesting levels (where nesting means that we encounter the top-level type constructor while exploring its substructure types), and set an arbitrary bound beyond which I assume the nesting to be recursive and fall back to the unoptimized case. You might want to apply something similar to PlateData. 
Cheers, Claus From simonpj at microsoft.com Thu Aug 21 05:16:44 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Thu Aug 21 05:15:39 2008 Subject: [Hs-Generics] Developing SYB, packaging issues In-Reply-To: <52f14b210808200523h37d5ae7em2cc5bf8a79df2b35@mail.gmail.com> References: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> <3c6288ab0808050746p4b5bd85aqe2639318ef3a5585@mail.gmail.com> <20080805174009.GA1677@matrix.chaos.earth.li> <52f14b210808200523h37d5ae7em2cc5bf8a79df2b35@mail.gmail.com> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE853E014@EA-EXMSG-C334.europe.corp.microsoft.com> I understand that with the proposed base package breakup [1], SYB will be moved to a separate package. But I still don't know how this will reflect on the development of the library. In particular: 1) Where is the source code going to be hosted? Here in Utrecht we currently have a repository with several (cabalized) generic programming libraries, SYB included. But maybe SYB will stay in the same repository as GHC? I don't think it matters too much where it's hosted. For us it might be convenient if it was on darcs.haskell.org because it reduces the number of ways in which you can get stuck. But servers are fairly reliable so this probably isn't very important. 2) Can development proceed independently of GHC, i.e. can a new version of SYB be released without a new version of GHC? Yes, I think that independent development is part of the goal. The easiest way to achieve this is for SYB *not* to be a GHC "core package". That is, not needed to build GHC (or GHCi, or the GHC library). Then it's "just a library" like GtK or LibCurl, and you can upgrade it whenever you like. It's more complicated if it's a core package. For example, if the GHC API uses SYB to implement something, then package "ghc-6.9" will depend on package "SYB-2.7", and while you can also have SYB-3.2 installed the ghc-6.9 package will still use the "SYB-2.7". 
3) How does the separation affect the automatic instance deriving mechanism? I think it'd make sense for the classes Data and Typeable themselves to remain in a "core package", precisely because the deriving mechanism generates code for them. If you change the method signatures, the code has to change, for example. But all the library code layered on top can be in the SYB package. I hope I have this right! Simon From jpm at cs.uu.nl Thu Aug 21 10:14:25 2008 From: jpm at cs.uu.nl (José Pedro Magalhães) Date: Thu Aug 21 10:13:19 2008 Subject: [Hs-Generics] Developing SYB, packaging issues In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE853E014@EA-EXMSG-C334.europe.corp.microsoft.com> References: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> <3c6288ab0808050746p4b5bd85aqe2639318ef3a5585@mail.gmail.com> <20080805174009.GA1677@matrix.chaos.earth.li> <52f14b210808200523h37d5ae7em2cc5bf8a79df2b35@mail.gmail.com> <638ABD0A29C8884A91BC5FB5C349B1C32AE853E014@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <52f14b210808210714u49b0b07erddf55abad6de6a9b@mail.gmail.com> Hello, Thanks for your answer. Replying below: On Thu, Aug 21, 2008 at 11:16, Simon Peyton-Jones wrote: > I understand that with the proposed base package breakup [1], SYB will > be moved to a separate package. But I still don't know how this will reflect > on the development of the library. In particular: > > 1) Where is the source code going to be hosted? Here in Utrecht we > currently have a repository with several (cabalized) generic programming > libraries, SYB included. But maybe SYB will stay in the same repository as > GHC? > > I don't think it matters too much where it's hosted.
For us it might be > convenient if it was on darcs.haskell.org because it reduces the number of > ways in which you can get stuck. But servers are fairly reliable so this > probably isn't very important. > We might prefer to keep it in an SVN repository where we have other generic libraries, if that is not a big problem. If it is, it can always go to darcs.haskell.org anyway. > > 2) Can development proceed independently of GHC, i.e. can a new version of > SYB be released without a new version of GHC? > > Yes, I think that independent development is part of the goal. The > easiest way to achieve this is for SYB **not** to be a GHC "core package". > That is, not needed to build GHC (or GHCi, or the GHC library). Then it's > "just a library" like GtK or LibCurl, and you can upgrade it whenever you > like. > > It's more complicated if it's a core package. For example, if the GHC API > uses SYB to implement something, then package "ghc-6.9" will depend on > package "SYB-2.7", and while you can also have SYB-3.2 installed the ghc-6.9 > package will still use the "SYB-2.7". > I think indeed having it outside of the core is the best thing. > > 3) How does the separation affect the automatic instance deriving > mechanism? > > I think it'd make sense for the classes Data and Typeable themselves to > remain in a "core package", precisely because the deriving mechanism > generates code for them. If you change the method signatures, the code has > to change, for example. But all the library code layered on top can be in > the SYB package. > Ok, that makes sense. Only changes to methods in Data would need to wait for a new version of GHC. But those should be kept to a minimum, if any at all. It's just a pity that so many methods are inside the Data class (like gmapQ and friends). But then again, there is a reason for them to be there, and it's probably not a good idea to change those anyway.
Most development should proceed by adding new things on top of the existing Data class core. What about instances of Data for the base types? Here I see a few possibilities: 1) No types have instances in core. Those could be in the SYB package, or the user could use stand-alone deriving to get them (if that is possible). 2) All types have instances in core, similar to the current Data.Generics.Instances situation. This implies that the situation discussed in [1] (inconvenient Data instances) will remain. 3) Something between the previous two, such as the 'standard' Data instances staying in core, and the others going to the SYB package (where they could be thought over, or separated into another module which is not imported by default). Pedro [1] http://www.haskell.org/pipermail/generics/2008-June/000347.html From simonpj at microsoft.com Thu Aug 21 12:41:24 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Thu Aug 21 12:40:20 2008 Subject: [Hs-Generics] Developing SYB, packaging issues In-Reply-To: <52f14b210808210714u49b0b07erddf55abad6de6a9b@mail.gmail.com> References: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> <3c6288ab0808050746p4b5bd85aqe2639318ef3a5585@mail.gmail.com> <20080805174009.GA1677@matrix.chaos.earth.li> <52f14b210808200523h37d5ae7em2cc5bf8a79df2b35@mail.gmail.com> <638ABD0A29C8884A91BC5FB5C349B1C32AE853E014@EA-EXMSG-C334.europe.corp.microsoft.com> <52f14b210808210714u49b0b07erddf55abad6de6a9b@mail.gmail.com> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE85DFD55@EA-EXMSG-C334.europe.corp.microsoft.com> 1) Where is the source code going to be hosted? Here in Utrecht we currently have a repository with several (cabalized) generic programming libraries, SYB included. But maybe SYB will stay in the same repository as GHC?
I don't think it matters too much where it's hosted. For us it might be convenient if it was on darcs.haskell.org because it reduces the number of ways in which you can get stuck. But servers are fairly reliable so this probably isn't very important. We might prefer to keep it in an SVN repository where we have other generic libraries, if that is not a big problem. If it is, it can always go to darcs.haskell.org anyway. I think we would very much prefer NOT to use SVN. We're already using darcs, and will shortly be using Git too. To have SVN too is a situation we'd like to avoid. Darcs or Git please! 3) How does the separation affect the automatic instance deriving mechanism? I think it'd make sense for the classes Data and Typeable themselves to remain in a "core package", precisely because the deriving mechanism generates code for them. If you change the method signatures, the code has to change, for example. But all the library code layered on top can be in the SYB package. Ok, that makes sense. Only changes to methods in Data would need to wait for a new version of GHC. But those should be kept to a minimum, if any at all. It's just a pity that so many methods are inside the Data class (like gmapQ and friends). But then again, there is a reason for them to be there, and it's probably not a good idea to change those anyway. Most development should proceed by adding new things on top of the existing Data class core. What about instances of Data for the base types? Here I see a few possibilities: 1) No types have instances in core. Those could be in the SYB package, or the user could use stand-alone deriving to get them (if that is possible). 2) All types have instances in core, similar to the current Data.Generics.Instances situation. This implies that the situation discussed in [1] (inconvenient Data instances) will remain.
3) Something between the previous two, such as the 'standard' Data instances staying in core, and the others going to the SYB package (where they could be thought over, or separated into another module which is not imported by default). I suspect it'd be better to have all the instances in the SYB package. If there are data types whose instances are sufficiently simple and unlikely to change that they can live in the same module as the Data class itself, then we can do that, but otherwise just put them in the package. S Pedro [1] http://www.haskell.org/pipermail/generics/2008-June/000347.html From jbapple+haskell-lib at gmail.com Thu Aug 21 21:26:28 2008 From: jbapple+haskell-lib at gmail.com (Jim Apple) Date: Thu Aug 21 21:25:19 2008 Subject: #2532: Add Typeable instance to Data.Unique Message-ID: http://hackage.haskell.org/trac/ghc/ticket/2532 Discussion deadline: Sept 7 Jim From jbapple+haskell-lib at gmail.com Thu Aug 21 22:47:03 2008 From: jbapple+haskell-lib at gmail.com (Jim Apple) Date: Thu Aug 21 22:45:52 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts Message-ID: http://hackage.haskell.org/trac/ghc/ticket/2533 Deadline: Sept 7 The Prelude functions drop, take, and splitAt are unfailing (never call error). This patch changes the Data.List generic versions to behave the same way. At present, they call error on negative arguments. Jim From jbapple+haskell-lib at gmail.com Thu Aug 21 23:31:10 2008 From: jbapple+haskell-lib at gmail.com (Jim Apple) Date: Thu Aug 21 23:30:00 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: References: Message-ID: BTW, where can I put tests for Data.List functions?
http://darcs.haskell.org/testsuite/tests/libraries/Data/test_List.hs has not been updated in almost two years, and includes only two good tests. Jim On Thu, Aug 21, 2008 at 7:47 PM, Jim Apple wrote: > http://hackage.haskell.org/trac/ghc/ticket/2533 > > Deadline: Sept 7 > > The Prelude functions drop, take, and splitAt are unfailing (never > call error). This patch changes the Data.List generic versions to > behave the same way. At present, they call error on negative > arguments. > > Jim > From patperry at stanford.edu Fri Aug 22 00:20:25 2008 From: patperry at stanford.edu (Patrick Perry) Date: Fri Aug 22 00:19:26 2008 Subject: Bugfix for QuickCheck 1.1.0.0 Message-ID: <4E804BD8-22DB-4EDF-AC29-900908603650@stanford.edu> Hi everyone, I've put in a proposal that fixes a bug in QuickCheck 1.1.0.0. http://hackage.haskell.org/trac/ghc/ticket/2535 Thanks, Patrick From ndmitchell at gmail.com Fri Aug 22 06:46:23 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Fri Aug 22 06:45:12 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: References: Message-ID: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> Hi > The Prelude functions drop, take, and splitAt are unfailing (never > call error). This patch changes the Data.List generic versions to > behave the same way. At present, they call error on negative > arguments. I had always just assumed that take and genericTake did the same thing, so had never even realised this problem existed. I'd call this a bug, that needs fixing. 
(+1) Neil From lemming at henning-thielemann.de Fri Aug 22 11:49:22 2008 From: lemming at henning-thielemann.de (Henning Thielemann) Date: Fri Aug 22 11:48:12 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> Message-ID: On Fri, 22 Aug 2008, Neil Mitchell wrote: > Hi > >> The Prelude functions drop, take, and splitAt are unfailing (never >> call error). This patch changes the Data.List generic versions to >> behave the same way. At present, they call error on negative >> arguments. > > I had always just assumed that take and genericTake did the same > thing, so had never even realised this problem existed. I'd call this > a bug, that needs fixing. Maybe the bug is in 'drop', 'take' and 'splitAt' and it was intended to fix it in 'generic' variants. Is there a good reason why to ignore negative number arguments? It may hide bugs. From duncan.coutts at worc.ox.ac.uk Fri Aug 22 07:40:49 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Fri Aug 22 14:05:14 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> Message-ID: <1219405249.13639.369.camel@localhost> On Fri, 2008-08-22 at 11:46 +0100, Neil Mitchell wrote: > Hi > > > The Prelude functions drop, take, and splitAt are unfailing (never > > call error). This patch changes the Data.List generic versions to > > behave the same way. At present, they call error on negative > > arguments. > > I had always just assumed that take and genericTake did the same > thing, so had never even realised this problem existed. I'd call this > a bug, that needs fixing. 
I'd like to know the rationale for the difference in the first place, or hear from one of the spec authors that it was just an oversight. Duncan From alexander.dunlap at gmail.com Sat Aug 23 01:50:27 2008 From: alexander.dunlap at gmail.com (Alexander Dunlap) Date: Sat Aug 23 01:49:13 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> Message-ID: <57526e770808222250n359fe2f0j203745af8c951e9e@mail.gmail.com> On Fri, Aug 22, 2008 at 8:49 AM, Henning Thielemann wrote: > > On Fri, 22 Aug 2008, Neil Mitchell wrote: > >> Hi >> >>> The Prelude functions drop, take, and splitAt are unfailing (never >>> call error). This patch changes the Data.List generic versions to >>> behave the same way. At present, they call error on negative >>> arguments. >> >> I had always just assumed that take and genericTake did the same >> thing, so had never even realised this problem existed. I'd call this >> a bug, that needs fixing. > > Maybe the bug is in 'drop', 'take' and 'splitAt' and it was intended to fix > it in 'generic' variants. Is there a good reason why to ignore negative > number arguments? It may hide bugs. > I agree (although the Report doesn't). There's no reason for those functions to be called with negative arguments; all it will do is hide bugs.
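The two behaviours being debated in this thread can be put side by side. The definitions below are a sketch for illustration only: strictGenericTake is modeled on the Haskell 98 Report's genericTake as far as it can be reconstructed here (treat the exact clauses as an assumption), lenientGenericTake follows the Prelude's take, and both names are hypothetical rather than anything exported by Data.List.

```haskell
-- | Modeled on the Report-style definition: a negative count applied to a
-- non-empty list calls error, which is the behaviour ticket 2533 objects to.
strictGenericTake :: Integral i => i -> [a] -> [a]
strictGenericTake _ []     = []
strictGenericTake 0 _      = []
strictGenericTake n (x:xs)
  | n > 0     = x : strictGenericTake (n - 1) xs
  | otherwise = error "strictGenericTake: negative argument"

-- | Lenient variant that agrees with Prelude take: any count that is
-- zero or negative simply yields the empty list.
lenientGenericTake :: Integral i => i -> [a] -> [a]
lenientGenericTake n _ | n <= 0 = []
lenientGenericTake _ []         = []
lenientGenericTake n (x:xs)     = x : lenientGenericTake (n - 1) xs
```

With these definitions, lenientGenericTake (-1) "abc" and take (-1) "abc" both give the empty string, while strictGenericTake (-1) "abc" raises an error. Note that the strict version still returns [] for an empty list whatever the count, because the empty-list clause is tried first.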
Alex From ndmitchell at gmail.com Sat Aug 23 05:29:43 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Sat Aug 23 05:28:29 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <57526e770808222250n359fe2f0j203745af8c951e9e@mail.gmail.com> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <57526e770808222250n359fe2f0j203745af8c951e9e@mail.gmail.com> Message-ID: <404396ef0808230229p77f77a09t4d689dfc00468be4@mail.gmail.com> Hi >>> I had always just assumed that take and genericTake did the same >>> thing, so had never even realised this problem existed. I'd call this >>> a bug, that needs fixing. >> >> Maybe the bug is in 'drop', 'take' and 'splitAt' and it was intended to fix >> it in 'generic' variants. Is there a good reason why to ignore negative >> number arguments? It may hide bugs. Too late. There is code depending on this behaviour in the wild, and we can't break it without a really really really good reason. Changing undefined to a value is not too bad (some optimisations in ByteString do it automatically even!), but changing a value to undefined is bad. Thanks Neil From igloo at earth.li Sat Aug 23 07:01:49 2008 From: igloo at earth.li (Ian Lynagh) Date: Sat Aug 23 07:00:36 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: References: Message-ID: <20080823110149.GA31503@matrix.chaos.earth.li> On Thu, Aug 21, 2008 at 07:47:03PM -0700, Jim Apple wrote: > > The Prelude functions drop, take, and splitAt are unfailing (never > call error). This patch changes the Data.List generic versions to > behave the same way. At present, they call error on negative > arguments. 
Looks like a bug/inconsistency in H98 to me: http://haskell.org/onlinereport/list.html#sect17.7 says The prefix "generic" indicates an overloaded function that is a generalised version of a Prelude function. so I'd certainly expect the generic versions to behave the same if used at type ...Int... Malcolm, do you agree? Can this go in the H98 errata? Thanks Ian From igloo at earth.li Sat Aug 23 07:03:18 2008 From: igloo at earth.li (Ian Lynagh) Date: Sat Aug 23 07:02:04 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: References: Message-ID: <20080823110318.GB31503@matrix.chaos.earth.li> On Thu, Aug 21, 2008 at 08:31:10PM -0700, Jim Apple wrote: > BTW, where can I put tests for Data.List functions? > > http://darcs.haskell.org/testsuite/tests/libraries/Data/test_List.hs > > has not been updated in almost two years, and includes only two good tests. Despite the name, the best place is tests/ghc-regress/lib/Data.List/ in the testsuite repo. Thanks Ian From kahl at cas.mcmaster.ca Sat Aug 23 13:33:55 2008 From: kahl at cas.mcmaster.ca (kahl@cas.mcmaster.ca) Date: Sat Aug 23 13:33:23 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: (message from Henning Thielemann on Fri, 22 Aug 2008 17:49:22 +0200 (CEST)) References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> Message-ID: <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> Neil Mitchell wrote: > > > Hi > > > >> The Prelude functions drop, take, and splitAt are unfailing (never > >> call error). This patch changes the Data.List generic versions to > >> behave the same way. At present, they call error on negative > >> arguments. > > > > I had always just assumed that take and genericTake did the same > > thing, so had never even realised this problem existed. I'd call this > > a bug, that needs fixing. 
> > Maybe the bug is in 'drop', 'take' and 'splitAt' and it was intended to > fix it in 'generic' variants. Is there a good reason why to ignore > negative number arguments? It may hide bugs. A similar argument could be made against ``take 5 [] = []''. A different solution would be using Nat or Natural as arguments here --- then the conversion introduces an obvious place to check for errors. Wolfram From gwern0 at gmail.com Sat Aug 23 15:46:02 2008 From: gwern0 at gmail.com (Gwern Branwen) Date: Sat Aug 23 15:45:58 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> Message-ID: <20080823194602.GA11579@craft> On 2008.08.23 17:33:55 -0000, kahl@cas.mcmaster.ca scribbled 0.9K characters: > Neil Mitchell wrote: > > > > > Hi > > > > > >> The Prelude functions drop, take, and splitAt are unfailing (never > > >> call error). This patch changes the Data.List generic versions to > > >> behave the same way. At present, they call error on negative > > >> arguments. > > > > > > I had always just assumed that take and genericTake did the same > > > thing, so had never even realised this problem existed. I'd call this > > > a bug, that needs fixing. > > > > Maybe the bug is in 'drop', 'take' and 'splitAt' and it was intended to > > fix it in 'generic' variants. Is there a good reason why to ignore > > negative number arguments? It may hide bugs. > > A similar argument could be made against ``take 5 [] = []''. > > A different solution would be using Nat or Natural as arguments here --- > then the conversion introduces an obvious place to check for errors. > > Wolfram I've actually long wondered about this: why don't more functions use Nat where it'd make sense? 
It can't be because Nat is hard to define - I'd swear I've seen many definitions of Nat (if not dozens when you count all the type-level exercises which include one). -- gwern ISSSP SADT NSV Rachel HAMASMOIS & Lindows wire ASPIC clandestine From allbery at ece.cmu.edu Sat Aug 23 15:48:10 2008 From: allbery at ece.cmu.edu (Brandon S. Allbery KF8NH) Date: Sat Aug 23 15:46:58 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <20080823194602.GA11579@craft> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> Message-ID: <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu> On 2008 Aug 23, at 15:46, Gwern Branwen wrote: > I've actually long wondered about this: why don't more functions use > Nat where it'd make sense? It can't be because Nat is hard to define > - I'd swear I've seen many definitions of Nat (if not dozens when > you count all the type-level exercises which include one). Because naive definitions are dog-slow and fast definitions are anything but easy to use? -- brandon s.
allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH From leather at cs.uu.nl Sat Aug 23 19:51:32 2008 From: leather at cs.uu.nl (Sean Leather) Date: Sat Aug 23 19:50:17 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu> Message-ID: <3c6288ab0808231651t49241cd2j9f5013f52330b751@mail.gmail.com> Brandon S. Allbery wrote: > Gwern Branwen wrote: > >> I've actually long wondered about this: why don't more functions use Nat >> where it'd make sense? It can't be because Nat is hard to define - I'd swear >> I've seen many definitions of Nat (if not dozens when you count all the >> type-level exercises which include one). >> > > Because naive definitions are dog-slow and fast definitions are anything > but easy to use? > Can you examples of both naive definitions and fast definitions of Nat? I'm curious. Sean -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.haskell.org/pipermail/libraries/attachments/20080824/bccd2b27/attachment.htm

From kahl at cas.mcmaster.ca Sun Aug 24 00:25:36 2008
From: kahl at cas.mcmaster.ca (kahl@cas.mcmaster.ca)
Date: Sun Aug 24 00:24:59 2008
Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts
In-Reply-To: <3c6288ab0808231651t49241cd2j9f5013f52330b751@mail.gmail.com> (leather@cs.uu.nl)
References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu> <3c6288ab0808231651t49241cd2j9f5013f52330b751@mail.gmail.com>
Message-ID: <20080824042536.15948.qmail@schroeder.cas.mcmaster.ca>

 >
 > Can you examples of both naive definitions and fast definitions of Nat? I'm
 > curious.

Naive:

> data Natural' = Zero | Succ Natural'

Fast:

> type Nat = Word64

(or Word if you want to retrofit to Haskell-1.5, a.k.a. Haskell98 ;-).

> newtype Natural = N Integer -- to be exported abstractly
>
> instance Num Natural where
>   ...
>   N a - N b = let d = a - b in if d >= 0 then N d
>               else error "Illegal Natural subtraction"
>   -- one argument against Num ;-)
>
> subtract (N a) (N b) = let d = a - b in if d >= 0 then Just $ N d
>                        else Nothing

....

Wolfram

From gwern0 at gmail.com Sun Aug 24 01:06:58 2008
From: gwern0 at gmail.com (Gwern Branwen)
Date: Sun Aug 24 01:06:29 2008
Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts
In-Reply-To: <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu>
References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu>
Message-ID: <20080824050658.GB28982@craft>

On 2008.08.23 15:48:10 -0400, "Brandon S.
Allbery KF8NH" scribbled 0.5K characters: > On 2008 Aug 23, at 15:46, Gwern Branwen wrote: >> I've actually long wondered about this: why don't more functions use >> Nat where it'd make sense? It can't be because Nat is hard to define - >> I'd swear I've seen many definitions of Nat (if not dozens when you >> count all the type-level exercises which include one). > > Because naive definitions are dog-slow and fast definitions are anything > but easy to use? > > -- > brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com While I am but a mediocre Haskell programmer at best, I can't say I find that a satisfying explanation. When I read the GHC & fusion papers (among many many other fine papers relating to Haskell), I am impressed at the optimizations the authors managed to eek out despite the difficult conditions they labor under. With that in mind, I find it hard to accept that there is no approach which is fast and easy to use - no theorems or rewrite rules or library which provides it. When I consider it, it seems to me that it's not hard to define a Nat type which does checks for < 0 at runtime, so given the Olegian feats Haskell lends itself to, why is it not easy to staticly turn Nats into Ints or similarly performant types? -- gwern CESID Security Delta d DUVDEVAN SRI composition data-haven SONANGOL World -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://www.haskell.org/pipermail/libraries/attachments/20080824/15493410/attachment.bin From lemming at henning-thielemann.de Sun Aug 24 03:48:46 2008 From: lemming at henning-thielemann.de (Henning Thielemann) Date: Sun Aug 24 03:47:44 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <20080823194602.GA11579@craft> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> Message-ID: On Sat, 23 Aug 2008, Gwern Branwen wrote: >> A similar argument could be made against ``take 5 [] = []''. sure >> A different solution would be using Nat or Natural as arguments here --- >> then the conversion introduces an obvious place to check for errors. >> >> Wolfram > > I've actually long wondered about this: why don't more functions use Nat where it'd make sense? It can't be because Nat is hard to define - I'd swear I've seen many definitions of Nat (if not dozens when you count all the type-level exercises which include one). 
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/non-negative From gale at sefer.org Sun Aug 24 05:11:00 2008 From: gale at sefer.org (Yitzchak Gale) Date: Sun Aug 24 05:09:43 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <3c6288ab0808231651t49241cd2j9f5013f52330b751@mail.gmail.com> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu> <3c6288ab0808231651t49241cd2j9f5013f52330b751@mail.gmail.com> Message-ID: <2608b8a80808240211i13cb4048ud20f8e1d2dd97a2a@mail.gmail.com> Sean Leather wrote: > Can you examples of both naive definitions and fast definitions of Nat? I'm > curious. Besides naive or fast, there is also lazy. So for example, using lazy naturals, the expression genericLength x < genericLength y only forces enough of x and y to determine which is longer. Here is John Meacham's implementation: http://www.haskell.org/pipermail/haskell-cafe/2007-October/033213.html Regards, Yitz From igloo at earth.li Sun Aug 24 08:57:33 2008 From: igloo at earth.li (Ian Lynagh) Date: Sun Aug 24 08:56:15 2008 Subject: [Hs-Generics] Developing SYB, packaging issues In-Reply-To: <638ABD0A29C8884A91BC5FB5C349B1C32AE853E014@EA-EXMSG-C334.europe.corp.microsoft.com> References: <52f14b210808040449o4189511k1cbe6d430b6fb6da@mail.gmail.com> <3c6288ab0808050746p4b5bd85aqe2639318ef3a5585@mail.gmail.com> <20080805174009.GA1677@matrix.chaos.earth.li> <52f14b210808200523h37d5ae7em2cc5bf8a79df2b35@mail.gmail.com> <638ABD0A29C8884A91BC5FB5C349B1C32AE853E014@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: <20080824125733.GA6338@matrix.chaos.earth.li> On Thu, Aug 21, 2008 at 10:16:44AM +0100, Simon Peyton-Jones wrote: > > Yes, I think that independent development is part of the goal. 
The easiest way to achieve this is for SYB *not* to be a GHC "core package". Actually, base3-compat needs to re-export the SYB modules, as they were in base 3. So if base3-compat comes with GHC, then SYB needs to as well. It will still be possible to install newer versions of SYB, though. Thanks Ian From allbery at ece.cmu.edu Sun Aug 24 11:00:18 2008 From: allbery at ece.cmu.edu (Brandon S. Allbery KF8NH) Date: Sun Aug 24 10:59:04 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <20080824050658.GB28982@craft> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu> <20080824050658.GB28982@craft> Message-ID: <7F11EC66-8E5A-4E0C-BCF0-7B7BC47F910C@ece.cmu.edu> On 2008 Aug 24, at 1:06, Gwern Branwen wrote: > On 2008.08.23 15:48:10 -0400, "Brandon S. Allbery KF8NH" > scribbled 0.5K characters: >> On 2008 Aug 23, at 15:46, Gwern Branwen wrote: >>> I've actually long wondered about this: why don't more functions use >>> Nat where it'd make sense? It can't be because Nat is hard to >>> define - >>> I'd swear I've seen many definitions of Nat (if not dozens when you >>> count all the type-level exercises which include one). >> >> Because naive definitions are dog-slow and fast definitions are >> anything >> but easy to use? > While I am but a mediocre Haskell programmer at best, I can't say I > find that a satisfying explanation. When I read the GHC & fusion > papers (among many many other fine papers relating to Haskell), I am > impressed at the optimizations the authors managed to eek out > despite the difficult conditions they labor under. 
With that in
> mind, I find it hard to accept that there is no approach which is
> fast and easy to use - no theorems or rewrite rules or library which

There are fast ones that are easy to use; they use Template Haskell, which isn't H98.

As for why: it's not so much that it's hard to optimize, it's that Haskell doesn't lend itself *conveniently* to type-level programming (which this is), so you end up resorting to TH or having rather ugly types all over the place. (Optimizing the naive case... you'd have to ask the GHC folks, but I suspect while they could optimize for it, it'd be a sufficiently narrow optimization that they'd want to see enough use to justify it.)

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

From leather at cs.uu.nl Sun Aug 24 16:07:52 2008
From: leather at cs.uu.nl (Sean Leather)
Date: Sun Aug 24 16:06:33 2008
Subject: Interleave two lists
Message-ID: <3c6288ab0808241307m7e09a4f0m6b130e946922f6dd@mail.gmail.com>

Mark Jones gave several talks at Advanced Functional Programming 2008. In one of them, he presented approaches to enumerating the elements of various datatypes. I found several interesting things in it, but one function stuck out as being perhaps useful in general.

    infixr 5 |||

    (|||) :: [a] -> [a] -> [a]
    [] ||| ys = ys
    (x:xs) ||| ys = x : ys ||| xs

It interleaves the elements of two lists. It's defined exactly as (++) with the exception that the arguments are swapped for the recursive application. This works nicely when one wants to merge infinite lists. For example, suppose you want a list of the enumerable numbers with a balance of positive and negative:

    enums :: (Num a, Enum a) => [a]
    enums = [0..] ||| map negate [1..]

You can't use (++) here, because the left side never completes.

Does (|||) seem useful to others?
Is it already available in some other form (or in a library) of which I'm not aware? If yes and no are the answers, then I wonder if it's useful enough for Data.List (modulo any expected renaming). Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.haskell.org/pipermail/libraries/attachments/20080824/50e9ffb9/attachment-0001.htm From ahey at iee.org Sun Aug 24 17:17:41 2008 From: ahey at iee.org (Adrian Hey) Date: Sun Aug 24 17:16:24 2008 Subject: Interleave two lists In-Reply-To: <3c6288ab0808241307m7e09a4f0m6b130e946922f6dd@mail.gmail.com> References: <3c6288ab0808241307m7e09a4f0m6b130e946922f6dd@mail.gmail.com> Message-ID: <48B1CFF5.6080803@iee.org> Sean Leather wrote: > Mark Jones gave several talks at Advanced Functional Programming 2008. In > one of them, he presented approaches to enumerating the elements of various > datatypes. I found several interesting things in it, but one function stuck > out as being perhaps useful in general. > > infixr 5 ||| > > (|||) :: [a] -> [a] -> [a] > [] ||| ys = ys > (x:xs) ||| ys = x : ys ||| xs > > It interleaves the elements of two lists. It's defined exactly as (++) with > the exception that the arguments are swapped for the recursive application. > This works nicely when one wants to merge infinite lists. For example, > suppose you want a list of the enumerable numbers with a balance of positive > and negative: > > enums :: (Num a, Enum a) => [a] > enums = [0..] ||| map negate [1..] > > You can't use (++) here, because the left side never completes. > > Does (|||) seem useful to others? Is it already available in some other form > (or in a library) of which I'm not aware? If yes and no are the answers, > then I wonder if it's useful enough for Data.List (modulo any expected > renaming). It seems like it would be difficult to extend this definition (to interleave 3 or more lists) in a fair manner. You could probably do better using a revolving (Data.)sequence of lists. 
Just an idea..

Regards
--
Adrian Hey

From dave at zednenem.com Sun Aug 24 20:40:03 2008
From: dave at zednenem.com (David Menendez)
Date: Sun Aug 24 20:38:49 2008
Subject: Interleave two lists
In-Reply-To: <3c6288ab0808241307m7e09a4f0m6b130e946922f6dd@mail.gmail.com>
References: <3c6288ab0808241307m7e09a4f0m6b130e946922f6dd@mail.gmail.com>
Message-ID: <49a77b7a0808241740h7361c082q2f8485bf9d598806@mail.gmail.com>

2008/8/24 Sean Leather :
> Does (|||) seem useful to others? Is it already available in some other form
> (or in a library) of which I'm not aware? If yes and no are the answers,
> then I wonder if it's useful enough for Data.List (modulo any expected
> renaming).

There's a generalized version of (|||) called "interleave" defined in the paper "Backtracking, Interleaving, and Terminating Monad Transformers"

The logict package has an implementation

In short, logict provides a class MonadLogic whose members support interleave, among other functions:

    class (MonadPlus m) => MonadLogic m where
        msplit :: m a -> m (Maybe (a, m a))
        ...
        interleave :: m a -> m a -> m a
        interleave a b = msplit a >>= maybe b (\(x,a') -> return x `mplus` interleave b a')

    instance MonadLogic [] where
        msplit [] = [Nothing]
        msplit (x:xs) = [Just (x,xs)]

The main disadvantage interleave and (|||) have, compared to mplus and (++), is that they aren't associative.

--
Dave Menendez

From john at repetae.net Mon Aug 25 21:54:34 2008
From: john at repetae.net (John Meacham)
Date: Mon Aug 25 21:53:10 2008
Subject: Interleave two lists
In-Reply-To: <3c6288ab0808241307m7e09a4f0m6b130e946922f6dd@mail.gmail.com>
References: <3c6288ab0808241307m7e09a4f0m6b130e946922f6dd@mail.gmail.com>
Message-ID: <20080826015434.GA15616@sliver.repetae.net>

I have seen this operator spelled (/\/) in a few places. Usually in the definition of a non-space-leaking powerset generating function.

John

--
John Meacham - ?repetae.net?john?
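The non-associativity David Menendez mentions can be seen on small finite lists. What follows is a minimal sketch, not code from the thread beyond Sean Leather's definition of (|||); the names `leftNested` and `rightNested` are illustrative, and `enums` is specialised to Integer here for simplicity.

```haskell
infixr 5 |||

-- Interleave two lists by alternating their elements (Sean Leather's definition).
(|||) :: [a] -> [a] -> [a]
[]     ||| ys = ys
(x:xs) ||| ys = x : (ys ||| xs)

-- The integers in the order 0, -1, 1, -2, 2, ...
enums :: [Integer]
enums = [0 ..] ||| map negate [1 ..]

-- The same three lists, grouped two different ways (illustrative names).
leftNested, rightNested :: [Int]
leftNested  = ([1, 2] ||| [3, 4]) ||| [5, 6]   -- [1,5,3,6,2,4]
rightNested = [1, 2] ||| ([3, 4] ||| [5, 6])   -- [1,3,2,5,4,6]
```

The two groupings contain the same elements but in different orders, which is also why extending (|||) fairly to three or more lists takes more than nesting it.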
From ahey at iee.org Tue Aug 26 03:38:45 2008
From: ahey at iee.org (Adrian Hey)
Date: Tue Aug 26 03:37:23 2008
Subject: Performance horrors
Message-ID: <48B3B305.2040907@iee.org>

Hello Folks,

I was looking at the definitions of nub (and hence nubBy) recently in connection with a trac ticket and realised that this is O(n^2) complexity! Ouch!

I was going to say I sure hope nobody uses this in real programs, but I thought I'd better check first and I see that darcs seems to use it in a few places. Hmm..

How did we ever get stuff like this in the standard libs? I can only imagine this is a relic from the days when Haskell was used only for research or pedagogical purposes (not for real programs).

Seeing as practically all Eq instances are also Ord instances, at the very least we could have O(n*(log n)) definitions for ..

    nub :: Ord a => [a] -> [a]
    nub = nubBy compare

    nubBy :: (a -> a -> Ordering) -> [a] -> [a]
    nubBy cmp xs = -- something using an AVL tree perhaps.

..or even better using the generalised trie stuff Jamie Brandon has been working on.

Of course I'm not actually advocating this as it's probably bad policy to have a simple standard lib dependent on any complex non-standard lib. But if it just isn't possible to implement some functions with reasonable efficiency using simple definitions only then I think they really shouldn't be there at all. Instead we should make users "roll their own" using whatever complex non-standard lib they want.

So could nub and nubBy be deprecated and an appropriate health warning added to the Haddock?

In the meantime I think I'll put appropriate alternative definitions in the next AVL release and ask Jamie Brandon to consider doing the same in his generalised tries lib (GMap).

Also, looking at a few other functions there like union(By) and intersect(By) makes me quite suspicious. Maybe we need a thorough performance audit to get rid of implementations that are so insanely inefficient they should *never* be used.
Regards -- Adrian Hey From simonpj at microsoft.com Tue Aug 26 05:20:09 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Tue Aug 26 05:18:54 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <20080824050658.GB28982@craft> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu> <20080824050658.GB28982@craft> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE85E0A0D@EA-EXMSG-C334.europe.corp.microsoft.com> | >> I've actually long wondered about this: why don't more functions use | >> Nat where it'd make sense? It can't be because Nat is hard to define - | >> I'd swear I've seen many definitions of Nat (if not dozens when you | >> count all the type-level exercises which include one). | > | > Because naive definitions are dog-slow and fast definitions are anything | > but easy to use? I doubt that even GHC is going to optimise data Nat = Z | S Nat into data Nat = N Int# (with appropriate checks) anytime soon. I think the main reason that the latter (which can easily be implemented as a library) is not more widely used is that it's tiresomely incompatible with functions that produce Ints. Also perhaps if length :: [a] -> Nat, then computing the difference between two lengths (length xs - length ys) could produce a runtime error. But these are all matters of taste, software engineering, and inertia (legacy code). It'd be a worthwhile exercise to try making a version of the standard libs using Nat where it's appropriate, and see how convenient or otherwise it turns out to be. 
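The library-level Nat described above, and the awkwardness around subtraction, might look roughly like this. It is a minimal sketch under stated assumptions: the names `toNat`, `fromNat`, `minus`, `lengthNat` and `takeNat` are hypothetical, and a plain Int stands in for the unboxed Int# mentioned in the message.

```haskell
-- A non-negative Int; a real module would export the type abstractly
-- so that N cannot be applied to a negative number from outside.
newtype Nat = N Int
  deriving (Eq, Ord, Show)

-- Checked conversion: the one obvious place to reject negative values.
toNat :: Int -> Maybe Nat
toNat n
  | n >= 0    = Just (N n)
  | otherwise = Nothing

-- Going back to Int is always safe.
fromNat :: Nat -> Int
fromNat (N n) = n

-- Subtraction made total: Nothing when the result would be negative.
minus :: Nat -> Nat -> Maybe Nat
minus (N a) (N b) = toNat (a - b)

-- length with a Nat result, as in the thought experiment above.
lengthNat :: [a] -> Nat
lengthNat = N . length

-- A take that cannot be handed a negative count.
takeNat :: Nat -> [a] -> [a]
takeNat (N n) = take n
```

With this shape, `minus (lengthNat xs) (lengthNat ys)` returns a Maybe instead of failing at run time, at the cost of the conversions at every boundary with Int-producing functions.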
Simon From ndmitchell at gmail.com Tue Aug 26 06:26:17 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Tue Aug 26 06:24:53 2008 Subject: Performance horrors In-Reply-To: <48B3B305.2040907@iee.org> References: <48B3B305.2040907@iee.org> Message-ID: <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> Hi > I was looking at the definitions of nub (and hence nubBy) recently > in connection with a trac ticket and realised that this is O(n^2) > complexity! Ouch! Complexity theory plus the Eq context makes that inevitable. You can't have nub over Eq in anything less than O(n^2) > I was going to say I sure hope nobody uses this in real programs, > but I thought I'd better check first and I see that darcs seems to > use it in a few places. Hmm.. I use it all the time, its dead handy. There are 12 instances in Hoogle, for example. If profiling later shows it to be a problem, I'd fix it - but I can't ever actually remember that being the case. Premature optimisation is the root of all evil. > Seeing as practically all Eq instances are also Ord instances, at > the very least we could have O(n*(log n)) definitions for .. > > nub :: Ord a => [a] -> [a] > nub = nubBy compare > > nubBy :: (a -> a -> Ordering) -> [a] -> [a] > nubBy cmp xs ys = -- something using an AVL tree perhaps. > > ..or even better using the generalised trie stuff Jamie Brandon > has been working on. Having nubOrd is a great idea, I even proposed it previously, but people disagreed with me. Propose it and add it by all means. > So could nub and nubBy be deprecated and an appropriate health warning > added to the Haddock? No. They should say O(n^2) in the haddock documentation, but O(n^2) /= useless. 
Thanks Neil From Alistair.Bayley at invesco.com Tue Aug 26 07:09:05 2008 From: Alistair.Bayley at invesco.com (Bayley, Alistair) Date: Tue Aug 26 07:07:42 2008 Subject: Performance horrors In-Reply-To: <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> Message-ID: <125EACD0CAE4D24ABDB4D148C4593DA9049E9541@GBLONXMB02.corp.amvescap.net> > From: libraries-bounces@haskell.org > [mailto:libraries-bounces@haskell.org] On Behalf Of Neil Mitchell > > Complexity theory plus the Eq context makes that inevitable. You can't > have nub over Eq in anything less than O(n^2) > > > I was going to say I sure hope nobody uses this in real programs, > > but I thought I'd better check first and I see that darcs seems to > > use it in a few places. Hmm.. > > > nub :: Ord a => [a] -> [a] > > nub = nubBy compare > > > Having nubOrd is a great idea, I even proposed it previously, but > people disagreed with me. Propose it and add it by all means. The name is... well, pessimal might be a bit strong, but few programmers would think to look for something called "nub". Personally, when I first looked for it I expected uniq or unique (because that's what the unix utility that does the same thing is called). Distinct (from SQL) is another name that occurred to me, but never nub... it's not even a synonym for unique: http://thesaurus.reference.com/browse/unique See the redefinition of nub as uniq here (which I assume is because John didn't know about nub): http://hackage.haskell.org/packages/archive/MissingH/1.0.0/doc/html/Data -List-Utils.html The folklore (such as it is) for uniq is that it is trivially defined like so (for lists): > uniq = map head . group . sort and so perhaps is not worthy of library inclusion? BTW, is this a suitably performant definition, or would we benefit from a lower-level implementation? 
Alistair

From ndmitchell at gmail.com Tue Aug 26 07:13:04 2008
From: ndmitchell at gmail.com (Neil Mitchell)
Date: Tue Aug 26 07:11:40 2008
Subject: Performance horrors
In-Reply-To: <125EACD0CAE4D24ABDB4D148C4593DA9049E9541@GBLONXMB02.corp.amvescap.net>
References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <125EACD0CAE4D24ABDB4D148C4593DA9049E9541@GBLONXMB02.corp.amvescap.net>
Message-ID: <404396ef0808260413qe7297afq35553ab75455beef@mail.gmail.com>

Hi

> The folklore (such as it is) for uniq is that it is trivially defined
> like so (for lists):
>
>> uniq = map head . group . sort
>
> and so perhaps is not worthy of library inclusion? BTW, is this a
> suitably performant definition, or would we benefit from a lower-level
> implementation?

A much better definition would use a Data.Set, then you get laziness and the order of elements is not permuted. Having a sortNub as well is a good idea though.
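A Data.Set-based definition along those lines could be sketched as follows. This is one possible shape rather than code from the thread; the name `nubOrd` is borrowed from the earlier messages, the helper `go` is illustrative, and the containers package is assumed to be available.

```haskell
import qualified Data.Set as Set

-- Remove duplicates, keeping the first occurrence of each element.
-- Elements are emitted as soon as they are first seen, so the result
-- is produced lazily and in input order, with O(n log n) comparisons.
nubOrd :: Ord a => [a] -> [a]
nubOrd = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs                    -- already emitted
      | otherwise           = x : go (Set.insert x seen) xs -- first occurrence
```

Unlike `map head . group . sort`, this keeps elements in their original order and can return a prefix of the result for an infinite input, for example `take 5 (nubOrd [1 ..])`.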
Thanks Neil From ahey at iee.org Tue Aug 26 13:32:40 2008 From: ahey at iee.org (Adrian Hey) Date: Tue Aug 26 13:31:17 2008 Subject: Performance horrors In-Reply-To: <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> Message-ID: <48B43E38.8020808@iee.org> Neil Mitchell wrote: > There are 12 instances in > Hoogle, for example. If profiling later shows it to be a problem, I'd > fix it - but I can't ever actually remember that being the case. > Premature optimisation is the root of all evil. I'm sure most of the community would agree with you, but I have to say that if the consequence of this philosophy is horrors like this in the standard libs, it should come as no surprise that Haskell has a reputation for being "slow". What else is lurking there I wonder? What's really bad is that the terrible performance isn't even documented. It may be obvious, but it should still be documented. Has anybody even the remotest clue why darcs is (apparently) so slow? Maybe the community itself should share some of the blame for this. Like it wasn't obvious to me that the uses of nub I saw in darcs could rely on very short lists (<=1 element :-) > Having nubOrd is a great idea, I even proposed it previously, but > people disagreed with me. Propose it and add it by all means. Like I said, I'm not proposing it, as it doesn't seem to possible to implement it efficiently using anything else in the standard libs. You could do nubOrd (but not nubOrdBy) using Data.Set. But there are 2 problems.. 1- This not only introduces a cyclic dependency between modules, but also packages. I'm not sure how well various compilers and cabal would deal with this between them, but I'm not optimistic. 
2- Data.Set is not obviously the best underlying implementation (in fact it is obviously not the best underlying implementation, this and Data.Map etc really should be pensioned off to hackage along with the rest of the badly documented, unreliable, inefficient and unstable junk :-) So I still think they should be deprecated. It seems like the least bad option if we can agree that their use should be strongly discouraged. >> So could nub and nubBy be deprecated and an appropriate health warning >> added to the Haddock? > > No. They should say O(n^2) in the haddock documentation, but O(n^2) /= useless. I would say it is if there are many obvious O(n.(log n)) or better implementations that can be used in in 99%+ of cases. I mean so obvious that users can quite easily roll their own in 3 lines of code or less. Regards -- Adrian Hey From ndmitchell at gmail.com Tue Aug 26 14:11:40 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Tue Aug 26 14:10:24 2008 Subject: Performance horrors In-Reply-To: <48B43E38.8020808@iee.org> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> Message-ID: <404396ef0808261111q54c5b2cfjb003b5e5d25ada65@mail.gmail.com> Hi > Has anybody even the remotest clue why darcs is (apparently) so slow? > Maybe the community itself should share some of the blame for this. Like > it wasn't obvious to me that the uses of nub I saw in darcs could rely > on very short lists (<=1 element :-) I'd be very surprised if it has anything to do with nub! > You could do nubOrd (but not nubOrdBy) using Data.Set. You can do nubOrdBy, just use the data type: data Elem a = Elem (a -> a -> Ordering) a > 1- This not only introduces a cyclic dependency between modules, but > also packages. I'm not sure how well various compilers and cabal would > deal with this between them, but I'm not optimistic. Yep, that would be a pain. 
> 2- Data.Set is not obviously the best underlying implementation (in > fact it is obviously not the best underlying implementation, this and > Data.Map etc really should be pensioned off to hackage along with the > rest of the badly documented, unreliable, inefficient and unstable > junk :-) Data.Set is an interface, which you seem to think is not the fastest implementation of that interface. If you can expose the same interface, but improve the performance, and demonstrate that the performance is better, then go for it! I'd support such a library change proposal. > So I still think they should be deprecated. It seems like the least bad > option if we can agree that their use should be strongly discouraged. Nope. I love nub. It's a beautiful function, which does exactly what it says on the tin (albeit the tin is labelled in Latin). > I would say it is if there are many obvious O(n.(log n)) or better > implementations that can be used in in 99%+ of cases. I mean so > obvious that users can quite easily roll their own in 3 lines > of code or less. None of them are Eq a => [a] -> [a]. If designing things from scratch I'd say its fine to have a debate about whether you make nub require Eq or Ord, and then provide the other as an option. But that time has passed. Thanks Neil From jeremy at n-heptane.com Tue Aug 26 15:24:32 2008 From: jeremy at n-heptane.com (Jeremy Shaw) Date: Tue Aug 26 15:18:07 2008 Subject: Performance horrors In-Reply-To: <48B3B305.2040907@iee.org> References: <48B3B305.2040907@iee.org> Message-ID: <87bpzffukv.wl%jeremy@n-heptane.com> At Tue, 26 Aug 2008 08:38:45 +0100, Adrian Hey wrote: > I was looking at the definitions of nub (and hence nubBy) recently > in connection with a trac ticket and realised that this is O(n^2) > complexity! Ouch! Can we modify Data.List to include the big-O notation for all the functions similar to Data.Set, Data.Map, and bytestring? j. 
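Neil Mitchell's suggestion of wrapping each element together with its comparison function, so that Data.Set can serve nubOrdBy as well, could be sketched like this. The Eq and Ord instances and the helper `go` are assumptions added for illustration; only the `Elem` declaration itself comes from the thread.

```haskell
import qualified Data.Set as Set

-- An element paired with the comparison to use for it.
data Elem a = Elem (a -> a -> Ordering) a

-- Both instances defer to the stored comparison. This is only sound
-- because every Elem placed in one set carries the same function.
instance Eq (Elem a) where
  Elem cmp x == Elem _ y = cmp x y == EQ

instance Ord (Elem a) where
  compare (Elem cmp x) (Elem _ y) = cmp x y

-- Remove elements that compare EQ to an earlier one, preserving order.
nubOrdBy :: (a -> a -> Ordering) -> [a] -> [a]
nubOrdBy cmp = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | e `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert e seen) xs
      where
        e = Elem cmp x
```

For instance, `nubOrdBy (\a b -> compare (a `mod` 3) (b `mod` 3)) [1, 4, 2, 7, 3, 5]` keeps one representative per residue class, in order of first appearance.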
From jbapple+haskell-lib at gmail.com Tue Aug 26 15:45:26 2008 From: jbapple+haskell-lib at gmail.com (Jim Apple) Date: Tue Aug 26 15:43:59 2008 Subject: Historical question about type of Data.List.group Message-ID: Why isn't the type of group Eq a => [a] -> [(a,[a])] That matches more exactly what group does, and it's easy to see that functions like nubOrd = map fst . group . sort are clearly safe, whereas map head . group . sort is not. Jim From gwern0 at gmail.com Tue Aug 26 15:45:14 2008 From: gwern0 at gmail.com (Gwern Branwen) Date: Tue Aug 26 15:45:10 2008 Subject: Performance horrors In-Reply-To: <125EACD0CAE4D24ABDB4D148C4593DA9049E9541@GBLONXMB02.corp.amvescap.net> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <125EACD0CAE4D24ABDB4D148C4593DA9049E9541@GBLONXMB02.corp.amvescap.net> Message-ID: <20080826194514.GB11435@craft> On 2008.08.26 12:09:05 +0100, "Bayley, Alistair" scribbled 1.5K characters: > The name is... well, pessimal might be a bit strong, but few programmers > would think to look for something called "nub". Personally, when I first > looked for it I expected uniq or unique (because that's what the unix > utility that does the same thing is called). Distinct (from SQL) is > another name that occurred to me, but never nub... it's not even a > synonym for unique: http://thesaurus.reference.com/browse/unique > > See the redefinition of nub as uniq here (which I assume is because John > didn't know about nub): > http://hackage.haskell.org/packages/archive/MissingH/1.0.0/doc/html/Data > -List-Utils.html > > The folklore (such as it is) for uniq is that it is trivially defined > like so (for lists): > > > uniq = map head . group . sort > > and so perhaps is not worthy of library inclusion? BTW, is this a > suitably performant definition, or would we benefit from a lower-level > implementation? 
> > Alistair FWIW, when I tested out a number of nub/uniq definitions, the map head version performed the worst, much worse than the O(n^2) one everyone is complaining about. It's a neat definition, but not performant. -- gwern Iraq Hercules Bosnia Summercon Compsec 20 Albright EuroFed RDI encryption From waldmann at imn.htwk-leipzig.de Tue Aug 26 17:07:55 2008 From: waldmann at imn.htwk-leipzig.de (Johannes Waldmann) Date: Tue Aug 26 17:06:34 2008 Subject: Historical question about type of Data.List.group In-Reply-To: References: Message-ID: <48B470AB.4010102@imn.htwk-leipzig.de> Jim Apple wrote: > Why isn't the type of group :: Eq a => [a] -> [(a,[a])] You mean that the actual type Eq a => [a] -> [[a]] does not give the information that each element of the result is non-empty. Perhaps this word "non-empty" could be inserted in the specification: > The group function takes a list and returns a list of *non-empty* lists > such that the concatenation of the result is equal to the argument. [...] best regards, J.W.
From ahey at iee.org Tue Aug 26 17:52:57 2008 From: ahey at iee.org (Adrian Hey) Date: Tue Aug 26 17:51:35 2008 Subject: Performance horrors In-Reply-To: <20080826194514.GB11435@craft> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <125EACD0CAE4D24ABDB4D148C4593DA9049E9541@GBLONXMB02.corp.amvescap.net> <20080826194514.GB11435@craft> Message-ID: <48B47B39.2000309@iee.org> Gwern Branwen wrote: > FWIW, when I tested out a number of nub/uniq definitions, the map head version performed the worst, much worse than the O(n^2) one everyone is complaining about. It's a neat definition, but not performant. When did you try this? IIRC correctly even the old sort was O(n^2), but someone had the sense to replace it a while ago. With ghci now on my machine.. length $ map head $ group $ sort [1..100000] finishes "instantly", but.. length $ nub [1..100000] takes about 90 seconds. Regards -- Adrian Hey From john at repetae.net Tue Aug 26 19:00:38 2008 From: john at repetae.net (John Meacham) Date: Tue Aug 26 18:59:11 2008 Subject: Performance horrors In-Reply-To: <48B3B305.2040907@iee.org> References: <48B3B305.2040907@iee.org> Message-ID: <20080826230038.GD15616@sliver.repetae.net> nub has a couple advantages over the n log n version, the main one being that it only requires an 'Eq' constraint, not an 'Ord' one. another being that it is fully lazy, it produces results even for an infinite list. a third is that the results come out in the order they went in. That said, I have a 'snub' (sorted nub) routine I use pretty commonly as well defined in the standard way.
If you have something like

    setnub xs = f Set.empty xs
      where
        f _ [] = []
        f s (x:xs) = if x `Set.member` s then f s xs else x : f (Set.insert x s) xs

then you can use 'RULES' pragmas to replace nub with setnub when it is allowed. Though, I have reservations about using RULES to change the order of complexity. John -- John Meacham - ?repetae.net?john? From jbapple+haskell-lib at gmail.com Tue Aug 26 19:17:53 2008 From: jbapple+haskell-lib at gmail.com (Jim Apple) Date: Tue Aug 26 19:16:27 2008 Subject: Historical question about type of Data.List.group In-Reply-To: References: Message-ID: This is also the case for Data.List.insert; it always returns a non-empty list, and so could have the signature insert :: Ord a => a -> [a] -> (a,[a]) Jim From duncan.coutts at worc.ox.ac.uk Tue Aug 26 19:56:32 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Tue Aug 26 19:54:30 2008 Subject: Performance horrors In-Reply-To: <48B47B39.2000309@iee.org> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <125EACD0CAE4D24ABDB4D148C4593DA9049E9541@GBLONXMB02.corp.amvescap.net> <20080826194514.GB11435@craft> <48B47B39.2000309@iee.org> Message-ID: <1219794992.24846.84.camel@localhost> On Tue, 2008-08-26 at 22:52 +0100, Adrian Hey wrote: > Gwern Branwen wrote: > > FWIW, when I tested out a number of nub/uniq definitions, the map head version performed the worst, much worse than the O(n^2) one everyone is complaining about. It's a neat definition, but not performant. > > When did you try this? IIRC correctly even the old sort was O(n^2), but > someone had the sense to replace it a while ago. > > With ghci now on my machine.. > length $ map head $ group $ sort [1..100000] > finishes "instantly", but.. > length $ nub [1..100000] > takes about 90 seconds. Also, sorting followed by grouping is unnecessary extra work. Sorts that discard duplicates are usually simple modifications of the sort algorithm.
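The point about sorts that discard duplicates can be sketched as a bottom-up merge sort whose merge step keeps a single copy of equal elements. This is only an illustration under assumptions: mergeUniq and sortUniq are hypothetical names, not functions from Data.List, and no claim is made about how this performs against map head . group . sort.

```haskell
-- Hypothetical names; a sketch, not a library implementation.

-- Merge two sorted, duplicate-free lists, emitting equal elements once.
mergeUniq :: Ord a => [a] -> [a] -> [a]
mergeUniq [] ys = ys
mergeUniq xs [] = xs
mergeUniq xa@(x:xs) ya@(y:ys) =
  case compare x y of
    LT -> x : mergeUniq xs ya
    GT -> y : mergeUniq xa ys
    EQ -> x : mergeUniq xs ys   -- duplicate found: keep only one copy

-- Bottom-up merge sort built on mergeUniq: start from singleton lists
-- (trivially sorted and duplicate-free) and merge them pairwise.
sortUniq :: Ord a => [a] -> [a]
sortUniq = mergeAll . map (: [])
  where
    mergeAll []   = []
    mergeAll [xs] = xs
    mergeAll xss  = mergeAll (pairs xss)

    pairs (a:b:rest) = mergeUniq a b : pairs rest
    pairs rest       = rest
```

Because the output is sorted, this does not preserve input order and cannot work on an infinite list, which is the trade-off against nub raised elsewhere in the thread.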
Though as people have pointed out, nub is nice because it is lazy, so sorting is out. An ord based nub should accumulate previously seen values so that it can operate lazily too. Duncan From ahey at iee.org Tue Aug 26 20:30:34 2008 From: ahey at iee.org (Adrian Hey) Date: Tue Aug 26 20:29:11 2008 Subject: Performance horrors In-Reply-To: <20080826230038.GD15616@sliver.repetae.net> References: <48B3B305.2040907@iee.org> <20080826230038.GD15616@sliver.repetae.net> Message-ID: <48B4A02A.8080601@iee.org> John Meacham wrote: > nub has a couple advantages over the n log n version, the main one being > that it only requires an 'Eq' constraint, not an 'Ord' one. This is only an advantage for a tiny minority of types that have Eq but no Ord instances, such as TypeRep. But there's no good reason why even this cannot have an Ord instance AFAICS (though not a derived one obviously). > another > being that it is fully lazy, it produces results even for an infinite > list. As does the O(n*(log n)) AVL based nub I just wrote. > a third is that the results come out in the order they went in. As does the O(n*(log n)) AVL based nub I just wrote :-) Regards -- Adrian Hey From john at repetae.net Tue Aug 26 20:57:42 2008 From: john at repetae.net (John Meacham) Date: Tue Aug 26 20:56:15 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> Message-ID: <20080827005741.GK15616@sliver.repetae.net> On Fri, Aug 22, 2008 at 05:49:22PM +0200, Henning Thielemann wrote: >>> The Prelude functions drop, take, and splitAt are unfailing (never >>> call error). This patch changes the Data.List generic versions to >>> behave the same way. At present, they call error on negative >>> arguments. >> >> I had always just assumed that take and genericTake did the same >> thing, so had never even realised this problem existed.
I'd call this >> a bug, that needs fixing. > > Maybe the bug is in 'drop', 'take' and 'splitAt' and it was intended to > fix it in 'generic' variants. Is there a good reason why to ignore > negative number arguments? It may hide bugs. But is also makes the functions less useful and another source of non-completeness. I certainly always considered accepting negative numbers a feature of those functions and not an infelicity. Sometimes you want 'drop 1 xs' and other times you want 'tail xs' (or equivalent). John -- John Meacham - ?repetae.net?john? From lemming at henning-thielemann.de Wed Aug 27 03:19:29 2008 From: lemming at henning-thielemann.de (Henning Thielemann) Date: Wed Aug 27 03:18:03 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <20080827005741.GK15616@sliver.repetae.net> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080827005741.GK15616@sliver.repetae.net> Message-ID: On Tue, 26 Aug 2008, John Meacham wrote: > But is also makes the functions less useful and another source of > non-completeness. I certainly always considered accepting negative > numbers a feature of those functions and not an infelicity. Sometimes > you want 'drop 1 xs' and other times you want 'tail xs' (or equivalent). It's very confusing for readers of your programs, if you use 'drop 1' instead of 'tail'. The names 'drop' and 'tail' don't give the reader a hint, that 'drop' works for empty lists and 'tail' doesn't. 'drop 1' and 'tail' should behave identically for empty lists and a function with different behaviour should have a different name. 
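The behaviour under discussion can be made concrete with a small sketch. The Prelude facts used here follow the Haskell Report definitions of take, drop and splitAt; the helper names tailOrEmpty and safeTail are hypothetical and serve only to give the two behaviours distinct names.

```haskell
-- Hypothetical helper names, for illustration only.

-- Total: returns [] for the empty list, because drop never fails.
tailOrEmpty :: [a] -> [a]
tailOrEmpty = drop 1

-- Total alternative that makes the empty case explicit in the type,
-- instead of calling error the way Prelude's tail does on [].
safeTail :: [a] -> Maybe [a]
safeTail []     = Nothing
safeTail (_:xs) = Just xs
```

For negative counts the Prelude functions are likewise total: take (-1) xs is [], drop (-1) xs is xs, and splitAt (-1) xs is ([], xs).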
From ahey at iee.org Wed Aug 27 04:05:41 2008 From: ahey at iee.org (Adrian Hey) Date: Wed Aug 27 04:04:18 2008 Subject: Performance horrors In-Reply-To: <87bpzffukv.wl%jeremy@n-heptane.com> References: <48B3B305.2040907@iee.org> <87bpzffukv.wl%jeremy@n-heptane.com> Message-ID: <48B50AD5.5080507@iee.org> Jeremy Shaw wrote: > At Tue, 26 Aug 2008 08:38:45 +0100, > Adrian Hey wrote: > >> I was looking at the definitions of nub (and hence nubBy) recently >> in connection with a trac ticket and realised that this is O(n^2) >> complexity! Ouch! > > Can we modify Data.List to include the big-O notation for all the > functions similar to Data.Set, Data.Map, and bytestring? I'm sure this is possible. I have no idea how to do this though. Perhaps someone else can explain. But patches submitted by ordinary users (even bug fixes) tend to languish in obscurity for some reason. I don't even know where the source code is or if we're using darcs or git or what these days (but I dare say I could find out if I was sufficiently motivated :-).
Regards -- Adrian Hey From johan.tibell at gmail.com Wed Aug 27 08:57:43 2008 From: johan.tibell at gmail.com (Johan Tibell) Date: Wed Aug 27 08:56:14 2008 Subject: Performance horrors In-Reply-To: <404396ef0808261111q54c5b2cfjb003b5e5d25ada65@mail.gmail.com> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> <404396ef0808261111q54c5b2cfjb003b5e5d25ada65@mail.gmail.com> Message-ID: <90889fe70808270557j152a567ftcf0ae201d081749d@mail.gmail.com> On Tue, Aug 26, 2008 at 11:11 AM, Neil Mitchell wrote: >> 2- Data.Set is not obviously the best underlying implementation (in >> fact it is obviously not the best underlying implementation, this and >> Data.Map etc really should be pensioned off to hackage along with the >> rest of the badly documented, unreliable, inefficient and unstable >> junk :-) > > Data.Set is an interface, which you seem to think is not the fastest > implementation of that interface. If you can expose the same > interface, but improve the performance, and demonstrate that the > performance is better, then go for it! I'd support such a library > change proposal. The problem with Data.Set and other collection types is that they're *not* defined as interfaces. If Data.Set was defined using a type class and maybe some associated data types it would be easier to provide an alternative implementation. Cheers, Johan From dmhouse at gmail.com Wed Aug 27 11:48:41 2008 From: dmhouse at gmail.com (David House) Date: Wed Aug 27 11:47:12 2008 Subject: Performance horrors In-Reply-To: <48B4A02A.8080601@iee.org> References: <48B3B305.2040907@iee.org> <20080826230038.GD15616@sliver.repetae.net> <48B4A02A.8080601@iee.org> Message-ID: 2008/8/27 Adrian Hey : > As does the O(n*(log n)) AVL based nub I just wrote. Care to share? WDYT about using RULES to rewrite to nubOrd if an Ord context is available, as John Meacham mentioned? 
John: you said you were uneasy about changing the complexity of an algorithm using RULES, but this is exactly what list fusion does (albeit for space, not time). -- -David From jonathanccast at fastmail.fm Wed Aug 27 11:58:42 2008 From: jonathanccast at fastmail.fm (Jonathan Cast) Date: Wed Aug 27 12:00:52 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080827005741.GK15616@sliver.repetae.net> Message-ID: <1219852722.24685.3.camel@jcchost> On Wed, 2008-08-27 at 09:19 +0200, Henning Thielemann wrote: > On Tue, 26 Aug 2008, John Meacham wrote: > > > But is also makes the functions less useful and another source of > > non-completeness. I certainly always considered accepting negative > > numbers a feature of those functions and not an infelicity. Sometimes > > you want 'drop 1 xs' and other times you want 'tail xs' (or equivalent). > > It's very confusing for readers of your programs, if you use 'drop 1' > instead of 'tail'. The names 'drop' and 'tail' don't give the reader a > hint, that 'drop' works for empty lists and 'tail' doesn't. I doubt very much that the name `drop' means anything until you learn it. In particular, it's meaningless for native speakers of Chinese who haven't learned English, as well. And it's only slightly less meaningless (when you consider what it does) for native speakers of English. 
jcc From daveroundy at gmail.com Wed Aug 27 12:54:26 2008 From: daveroundy at gmail.com (David Roundy) Date: Wed Aug 27 12:52:57 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080827005741.GK15616@sliver.repetae.net> Message-ID: <117f2cc80808270954w62311818h6ad60b2422aafc44@mail.gmail.com> On Wed, Aug 27, 2008 at 3:19 AM, Henning Thielemann wrote: > On Tue, 26 Aug 2008, John Meacham wrote: >> But is also makes the functions less useful and another source of >> non-completeness. I certainly always considered accepting negative >> numbers a feature of those functions and not an infelicity. Sometimes >> you want 'drop 1 xs' and other times you want 'tail xs' (or equivalent). > > It's very confusing for readers of your programs, if you use 'drop 1' > instead of 'tail'. The names 'drop' and 'tail' don't give the reader a hint, > that 'drop' works for empty lists and 'tail' doesn't. 'drop 1' and 'tail' > should behave identically for empty lists and a function with different > behaviour should have a different name. Personally, I'd prefer to see tail dropped from the Prelude (not any time soon, of course). The fewer incomplete functions are in the Prelude, the better, and drop has very nice and easy-to-understand behavior. Folks who want to crash on empty lists should write their own or use pattern matching. David From john at repetae.net Wed Aug 27 16:00:44 2008 From: john at repetae.net (John Meacham) Date: Wed Aug 27 15:59:15 2008 Subject: Performance horrors In-Reply-To: References: <48B3B305.2040907@iee.org> <20080826230038.GD15616@sliver.repetae.net> <48B4A02A.8080601@iee.org> Message-ID: <20080827200044.GT15616@sliver.repetae.net> On Wed, Aug 27, 2008 at 04:48:41PM +0100, David House wrote: > 2008/8/27 Adrian Hey : > > As does the O(n*(log n)) AVL based nub I just wrote. > > Care to share? 
> > WDYT about using RULES to rewrite to nubOrd if an Ord context is > available, as John Meacham mentioned? > > John: you said you were uneasy about changing the complexity of an > algorithm using RULES, but this is exactly what list fusion does > (albeit for space, not time). Indeed. and I am a little uneasy about that too :) not that I don't think it is an excellent idea and reap the benefits of fast list operations.. O(n^2) to O(n log n) just feels like a bigger jump to me for some reason. I think in the end, the RULES are a good idea, but I personally try to make which one I am using explicit when possible. Hence making my ideas as to the time/space requirements for an operation to both the compiler and a future reader of the code. John -- John Meacham - ?repetae.net?john? From lane at downstairspeople.org Wed Aug 27 17:09:36 2008 From: lane at downstairspeople.org (Christopher Lane Hinson) Date: Wed Aug 27 17:08:08 2008 Subject: Performance horrors In-Reply-To: <20080827200044.GT15616@sliver.repetae.net> References: <48B3B305.2040907@iee.org> <20080826230038.GD15616@sliver.repetae.net> <48B4A02A.8080601@iee.org> <20080827200044.GT15616@sliver.repetae.net> Message-ID: >> WDYT about using RULES to rewrite to nubOrd if an Ord context is >> available, as John Meacham mentioned? >> >> John: you said you were uneasy about changing the complexity of an >> algorithm using RULES, but this is exactly what list fusion does >> (albeit for space, not time). > > Indeed. and I am a little uneasy about that too :) not that I don't > think it is an excellent idea and reap the benefits of fast list > operations.. O(n^2) to O(n log n) just feels like a bigger jump to me > for some reason. I think in the end, the RULES are a good idea, but I > personally try to make which one I am using explicit when possible. 
> Hence making my ideas as to the time/space requirements for an operation > to both the compiler and a future reader of the code Using RULES in this way could be a pessimization. I can't imagine a nubOrd that would ever be as performant as nub on very small data sets (two or three elements). --Lane From ahey at iee.org Wed Aug 27 17:23:17 2008 From: ahey at iee.org (Adrian Hey) Date: Wed Aug 27 17:21:52 2008 Subject: Performance horrors In-Reply-To: References: <48B3B305.2040907@iee.org> <20080826230038.GD15616@sliver.repetae.net> <48B4A02A.8080601@iee.org> <20080827200044.GT15616@sliver.repetae.net> Message-ID: <48B5C5C5.3060406@iee.org> Christopher Lane Hinson wrote: > >>> WDYT about using RULES to rewrite to nubOrd if an Ord context is >>> available, as John Meacham mentioned? >>> >>> John: you said you were uneasy about changing the complexity of an >>> algorithm using RULES, but this is exactly what list fusion does >>> (albeit for space, not time). >> >> Indeed. and I am a little uneasy about that too :) not that I don't >> think it is an excellent idea and reap the benefits of fast list >> operations.. O(n^2) to O(n log n) just feels like a bigger jump to me >> for some reason. I think in the end, the RULES are a good idea, but I >> personally try to make which one I am using explicit when possible. >> Hence making my ideas as to the time/space requirements for an operation >> to both the compiler and a future reader of the code > > Using RULES in this way could be a pessimization. I can't imagine a > nubOrd that would ever be as performant as nub on very small data sets > (two or three elements). Why not? comparison doesn't cost any more than equality testing and I think we can safely say that nubOrd will never require *more* comparisons than equality tests needed by nub Regards -- Adrian Hey From allbery at ece.cmu.edu Wed Aug 27 17:32:17 2008 From: allbery at ece.cmu.edu (Brandon S. 
Allbery KF8NH) Date: Wed Aug 27 17:30:52 2008 Subject: Performance horrors In-Reply-To: References: <48B3B305.2040907@iee.org> <20080826230038.GD15616@sliver.repetae.net> <48B4A02A.8080601@iee.org> <20080827200044.GT15616@sliver.repetae.net> Message-ID: On 2008 Aug 27, at 17:09, Christopher Lane Hinson wrote: >>> WDYT about using RULES to rewrite to nubOrd if an Ord context is >>> available, as John Meacham mentioned? >>> >>> John: you said you were uneasy about changing the complexity of an >>> algorithm using RULES, but this is exactly what list fusion does >>> (albeit for space, not time). >> >> Indeed. and I am a little uneasy about that too :) not that I don't >> think it is an excellent idea and reap the benefits of fast list >> operations.. O(n^2) to O(n log n) just feels like a bigger jump to me >> > Using RULES in this way could be a pessimization. I can't imagine a > nubOrd that would ever be as performant as nub on very small data > sets (two or three elements). I think if you're in a situation where the performance of nub on such a small dataset matters, you're already well into the realm of controlling everything manually to get performance. -- brandon s. 
allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH From ross at soi.city.ac.uk Wed Aug 27 18:51:21 2008 From: ross at soi.city.ac.uk (Ross Paterson) Date: Wed Aug 27 18:49:55 2008 Subject: Performance horrors In-Reply-To: <48B43E38.8020808@iee.org> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> Message-ID: <20080827225121.GA9532@soi.city.ac.uk> On Tue, Aug 26, 2008 at 06:32:40PM +0100, Adrian Hey wrote: > 2- Data.Set is not obviously the best underlying implementation (in > fact it is obviously not the best underlying implementation, this and > Data.Map etc really should be pensioned off to hackage along with the > rest of the badly documented, unreliable, inefficient and unstable > junk :-) Do you have benchmarks to quantify how bad Data.Set and Data.Map are? From dons at galois.com Wed Aug 27 18:53:25 2008 From: dons at galois.com (Don Stewart) Date: Wed Aug 27 18:51:52 2008 Subject: Performance horrors In-Reply-To: <20080827225121.GA9532@soi.city.ac.uk> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> <20080827225121.GA9532@soi.city.ac.uk> Message-ID: <20080827225325.GM7196@scytale.galois.com> ross: > On Tue, Aug 26, 2008 at 06:32:40PM +0100, Adrian Hey wrote: > > 2- Data.Set is not obviously the best underlying implementation (in > > fact it is obviously not the best underlying implementation, this and > > Data.Map etc really should be pensioned off to hackage along with the > > rest of the badly documented, unreliable, inefficient and unstable > > junk :-) > > Do you have benchmarks to quantify how bad Data.Set and Data.Map are? Yes, a clear replacement for just Data.Map , with numbers, even if prototyped, would be rather compelling. 
And not bogged down in too much other code. Numbers please, or at least a direction to work in! -- Don From ndmitchell at gmail.com Wed Aug 27 19:05:10 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Wed Aug 27 19:03:40 2008 Subject: Performance horrors In-Reply-To: <20080827225325.GM7196@scytale.galois.com> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> <20080827225121.GA9532@soi.city.ac.uk> <20080827225325.GM7196@scytale.galois.com> Message-ID: <404396ef0808271605w4bc88eech6422efeb8a4d6ecd@mail.gmail.com> Hi >> > Data.Map etc really should be pensioned off to hackage along with the >> > rest of the badly documented, unreliable, inefficient and unstable >> > junk :-) >> >> Do you have benchmarks to quantify how bad Data.Set and Data.Map are? I'm also curious about the badly documented, unreliable or unstable comments too - if any of those are true (and in my experience they aren't) then they should also be fixed. Inefficient is the least of the problems on that list. Thanks Neil From allbery at ece.cmu.edu Wed Aug 27 19:47:31 2008 From: allbery at ece.cmu.edu (Brandon S. Allbery KF8NH) Date: Wed Aug 27 19:46:08 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts In-Reply-To: <1219852722.24685.3.camel@jcchost> References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080827005741.GK15616@sliver.repetae.net> <1219852722.24685.3.camel@jcchost> Message-ID: On 2008 Aug 27, at 11:58, Jonathan Cast wrote: > On Wed, 2008-08-27 at 09:19 +0200, Henning Thielemann wrote: >> On Tue, 26 Aug 2008, John Meacham wrote: >>> But is also makes the functions less useful and another source of >>> non-completeness. I certainly always considered accepting negative >>> numbers a feature of those functions and not an infelicity. 
>>> Sometimes >>> you want 'drop 1 xs' and other times you want 'tail xs' (or >>> equivalent). >> >> It's very confusing for readers of your programs, if you use 'drop 1' >> instead of 'tail'. The names 'drop' and 'tail' don't give the >> reader a >> hint, that 'drop' works for empty lists and 'tail' doesn't. > > I doubt very much that the name `drop' means anything until you learn > it. In particular, it's meaningless for native speakers of Chinese > who > haven't learned English, as well. And it's only slightly less > meaningless (when you consider what it does) for native speakers of > English. And for that matter, even to me (an experienced programmer) my first (admittedly not entirely off base, just mostly) thought is the Forth/ PostScript "drop". -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH From lane at downstairspeople.org Wed Aug 27 21:38:18 2008 From: lane at downstairspeople.org (Christopher Lane Hinson) Date: Wed Aug 27 21:36:51 2008 Subject: Performance horrors In-Reply-To: <48B5C5C5.3060406@iee.org> References: <48B3B305.2040907@iee.org> <20080826230038.GD15616@sliver.repetae.net> <48B4A02A.8080601@iee.org> <20080827200044.GT15616@sliver.repetae.net> <48B5C5C5.3060406@iee.org> Message-ID: On Wed, 27 Aug 2008, Adrian Hey wrote: >> Using RULES in this way could be a pessimization. I can't imagine a nubOrd >> that would ever be as performant as nub on very small data sets (two or >> three elements). > > Why not? comparison doesn't cost any more than equality testing and I > think we can safely say that nubOrd will never require *more* > comparisons than equality tests needed by nub. 
I think that you will have to do extra comparisons to build whatever ordered data structure you use, and extra work to keep it balanced or you will risk falling back on O(n^2) behavior, and feed the garbage collector a little bit more with discarded intermediate nodes. Whether this is a real problem or not is for empirical testing assuming anyone cares that much. I actually really like the RULES idea. There seemed to be an intuition that changing the complexity of a function using RULES is a bad idea, and I was putting forward a possible basis for that intuition. A rule of thumb is that when you switch to an algorithm with a better complexity class, you pay at least a small price in the constant factor. --Lane From apfelmus at quantentunnel.de Thu Aug 28 03:49:45 2008 From: apfelmus at quantentunnel.de (apfelmus) Date: Thu Aug 28 03:48:27 2008 Subject: Interleave two lists In-Reply-To: <49a77b7a0808241740h7361c082q2f8485bf9d598806@mail.gmail.com> References: <3c6288ab0808241307m7e09a4f0m6b130e946922f6dd@mail.gmail.com> <49a77b7a0808241740h7361c082q2f8485bf9d598806@mail.gmail.com> Message-ID: David Menendez wrote: > > The main disadvantage interleave and (|||) have, compared to mplus and > (++), is that they aren't associative. In fact, no interleaving operator (|||) that works for infinite lists can be associative. Here's a proof. Every such operator corresponds to a pair of injective functions

    f,g : N -> N        N = natural numbers

which map the indexes of the elements of the left (f) and right (g) list to indexes in the result. Their images are disjoint and complement each other, i.e.

    ∅ = f(N) /\ g(N)    and    N = f(N) \/ g(N)

For example,

    (x:xs) ||| ys = x : ys ||| xs

corresponds to the pair (\n->2*n, \n->2*n+1) because the left list is mapped to the even positions and the right list ys is mapped to the odd positions. Now, consider interleaving three lists. The case

    as ||| (bs ||| cs)

maps the elements of as , bs and cs as follows into the result:

    as    f
    bs    g . f
    cs    g . g

In the other case

    (as ||| bs) ||| cs

the positions of the elements in the result lists are determined by applying the functions

    as    f . f
    bs    f . g
    cs    g

Hence, if (|||) were associative, we would have

    f     = f . f
    f . g = g . f
    g . g = g

and hence

    as    f
    bs    f . g
    cs    g

But f and g already fill the entire result, there would be no room for the bs in the result list, a contradiction. Thus, no (|||) that merges infinite lists and is associative exists. Regards, apfelmus From igloo at earth.li Thu Aug 28 07:12:32 2008 From: igloo at earth.li (Ian Lynagh) Date: Thu Aug 28 07:11:00 2008 Subject: The base library and GHC 6.10 Message-ID: <20080828111232.GA15998@matrix.chaos.earth.li> Hi all, We're trying to decide what to do with the base library for GHC 6.10, in terms of how much of it should be broken up into separate packages. Since the recent proposal about this, we may be rethinking what we want to do, and we would welcome your opinions. First, the motivation for splitting base up: It becomes possible to separately upgrade the parts, and makes it easier for different people to maintain different parts. It makes it easier to see what the hierarchy is, and to restructure the hierarchy, and work towards more of the code being shared between different Haskell implementations. Plus it means that people can't re-tangle the logically separate components, which is all too easy to do when you just have one huge package. It also means that packages are clearer about what they depend on. One possibility, which would be really cool, is to separate all the IO modules from the non-IO modules; between that and looking at the extensions used (e.g. TH and FFI) it would then be clear whether or not a library could do any IO. Of course, the Prelude is a hurdle for this goal.
Also, GHC's current plan for the base library: http://hackage.haskell.org/trac/ghc/wiki/DarcsConversion#Planforlibraries essentially means forking base (as nhc98 would continue to use base in a darcs repo, while GHC would use it from a git repo, and there are no plans for any merging between these repos). Therefore any code that is to be shared between the implementations needs to not be in base, so from that point of view it would be good to pull out as much as possible. The main argument /against/ splitting base up is that at some point the dependencies of packages need to be updated to reflect the changes. However, GHC 6.10 will come with a base version 3, as well as the new base version 4, so the transition should be much smoother than the base 2 -> base 3 transition. Now, on with the proposed splitting. In the below, LoC stands for "Lines of Code". First the easy bit: The Data.Generics hierarchy is going to have a separate maintainer, and I think that everyone is agreed that it should be pulled out into an "syb package". I'll treat this as not part of base from here on. The only thing still being debated here is whether the Data class itself should remain in base or not. Some people believe that it should remain in base, as it is desirable to have Data instances for as many types as possible, and because there is a resistance among library writers against adding dependencies. The counter argument is that there are many other classes that the same is true of (e.g. uniplate, syb-with-class, binary), and it does not scale to put all of these classes into base. Also, by requiring a dep to be added even for the classes that have historically been included in base, adding dependencies for the sake of providing instances may become more socially acceptable. Now, on with the splitting. We have System.Console.GetOpt (129 LoC, 1 module) This doesn't really fit in with anything else in base, so the proposal is to split it off into its own getopt package. 
I don't think there is much objection to this one. Next we have the Control.Monad.ST and Data.STRef hierarchies (120 LoC, 6 modules). The proposal is to put these into an st package. The low-level implementation is still in base (69 LoC in GHC.ST and GHC.STRef), so to some extent this is a false separation. On the other hand, nhc98 doesn't support ST, so splitting this package off gets us closer to all implementations exposing the same modules from base. Then we have the Control.Concurrent hierarchy (490 LoC, 6 modules), along with System.Timeout (39 LoC) and Data.Unique (32 LoC) (those modules depend on Control.Concurrent.*). The proposal is to put these into concurrent, timeout and unique packages respectively. Again, this is a false separation, with 698 LoC left behind in GHC.Conc; at some time we'd hope that this could either be moved down to ghc-prim, or make a new ghc-concurrent package for it, depending on how the dependencies work out. Again, nhc98 doesn't support concurrent or its dependencies, so this gets us closer to a consistent base interface. Splitting off the above 5 packages would leave 106 modules and 16621 LoC in base. About 5% of the LoC, and 12.5% of the modules, would be in the new packages. Thanks Ian From jpm at cs.uu.nl Thu Aug 28 07:25:25 2008 From: jpm at cs.uu.nl (José Pedro Magalhães) Date: Thu Aug 28 07:23:56 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080828111232.GA15998@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> Message-ID: <52f14b210808280425s257405f4w9b9bd3878fafdee1@mail.gmail.com> Hello, On Thu, Aug 28, 2008 at 13:12, Ian Lynagh wrote: > > First the easy bit: The Data.Generics hierarchy is going to have a > separate maintainer, and I think that everyone is agreed that it should > be pulled out into an "syb package". I'll treat this as not part of base > from here on.
> > The only thing still being debated here is whether the Data class itself > should remain in base or not. Some people believe that it should remain > in base, as it is desirable to have Data instances for as many types as > possible, and because there is a resistance among library writers > against adding dependencies. The counter argument is that there are many > other classes that the same is true of (e.g. uniplate, syb-with-class, > binary), and it does not scale to put all of these classes into base. > Also, by requiring a dep to be added even for the classes that have > historically been included in base, adding dependencies for the sake of > providing instances may become more socially acceptable. Is there a way not to have the Data class in base while still preserving the deriving mechanism? I think that one big reason for the popularity of SYB is not only the fact that it comes with GHC but also that you get support for generics on user-defined datatypes for "free". So if there is no way to have derivable Data with Data outside base, then I think Data should stay in base. Pedro -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.haskell.org/pipermail/libraries/attachments/20080828/2a5857bf/attachment-0001.htm From ndmitchell at gmail.com Thu Aug 28 07:28:18 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Thu Aug 28 07:26:47 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080828111232.GA15998@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> Message-ID: <404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com> Hi > The only thing still being debated here is whether the Data class itself > should remain in base or not. Some people believe that it should remain > in base, as it is desirable to have Data instances for as many types as > possible, and because there is a resistance among library writers > against adding dependencies. 
The counter argument is that there are many > other classes that the same is true of (e.g. uniplate, syb-with-class, > binary), and it does not scale to put all of these classes into base. My opinion is that Data should remain in base. Data is much lower level than other classes, and provides reflection and examination of Haskell values at runtime. You can layer uniplate and binary on top of Data, and the Derive tool's next release will layer an additional 20 classes on top of Data. In some ways Data is more primitive, and more powerful, than other classes. The rest of SYB is a traversal mechanism, which is much more of a library concern, so deserves to be split off. > Also, by requiring a dep to be added even for the classes that have > historically been included in base, adding dependencies for the sake of > providing instances may become more socially acceptable. Keeping dependencies short is good, and it will be hard to persuade most library authors (including me) otherwise. > Splitting off the above 5 packages would leave 106 modules and 16621 LoC > in base. About 5% of the LoC, and 12.5% of the modules, would be in the > new packages. The goal of exposing a consistent base seems a sensible one, so all those changes look good. 
Thanks Neil From bulat.ziganshin at gmail.com Thu Aug 28 07:42:40 2008 From: bulat.ziganshin at gmail.com (Bulat Ziganshin) Date: Thu Aug 28 07:41:32 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080828111232.GA15998@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> Message-ID: <1644730278.20080828154240@gmail.com> Hello Ian, Thursday, August 28, 2008, 3:12:32 PM, you wrote: > Again, this is a false separation, with 698 LoC left behind in GHC.Conc i propose to consider ghc.* as not the part of Base, but separate library (GhcPrim) which is bundled together with Base only due to technical limitations -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com From leather at cs.uu.nl Thu Aug 28 07:48:32 2008 From: leather at cs.uu.nl (Sean Leather) Date: Thu Aug 28 07:47:02 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080828111232.GA15998@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> Message-ID: <3c6288ab0808280448t1b3ff903w1810804f61e907a0@mail.gmail.com> On Thu, Aug 28, 2008 at 13:12, Ian Lynagh wrote: > Also, GHC's current plan for the base library: > http://hackage.haskell.org/trac/ghc/wiki/DarcsConversion#Planforlibraries > essentially means forking base (as nhc98 would continue to use base in a > darcs repo, while GHC would use it from a git repo, and there are no > plans for any merging between these repos). Therefore any code that is > to be shared between the implementations needs to not be in base, so > from that point of view it would be good to pull out as much as > possible. > ... > First the easy bit: The Data.Generics hierarchy is going to have a > separate maintainer, and I think that everyone is agreed that it should > be pulled out into an "syb package". I'll treat this as not part of base > from here on. > For what it's worth, I would be happy to see "syb" in a git repository. And, if I understood everything correctly, that's what you plan on doing. 
BTW, thanks for putting out these updates, Ian. They are very helpful. Sean From igloo at earth.li Thu Aug 28 07:55:26 2008 From: igloo at earth.li (Ian Lynagh) Date: Thu Aug 28 07:53:56 2008 Subject: The base library and GHC 6.10 In-Reply-To: <52f14b210808280425s257405f4w9b9bd3878fafdee1@mail.gmail.com> References: <20080828111232.GA15998@matrix.chaos.earth.li> <52f14b210808280425s257405f4w9b9bd3878fafdee1@mail.gmail.com> Message-ID: <20080828115526.GA28866@matrix.chaos.earth.li> On Thu, Aug 28, 2008 at 01:25:25PM +0200, José Pedro Magalhães wrote: > > Is there a way not to have the Data class in base while still preserving the > deriving mechanism? Yes, you can still have "deriving Data" if the class is in the syb package. Thanks Ian From igloo at earth.li Thu Aug 28 08:01:07 2008 From: igloo at earth.li (Ian Lynagh) Date: Thu Aug 28 07:59:37 2008 Subject: The base library and GHC 6.10 In-Reply-To: <1644730278.20080828154240@gmail.com> References: <20080828111232.GA15998@matrix.chaos.earth.li> <1644730278.20080828154240@gmail.com> Message-ID: <20080828120107.GB28866@matrix.chaos.earth.li> On Thu, Aug 28, 2008 at 03:42:40PM +0400, Bulat Ziganshin wrote: > > Thursday, August 28, 2008, 3:12:32 PM, you wrote: > > > Again, this is a false separation, with 698 LoC left behind in GHC.Conc > > i propose to consider ghc.* as not the part of Base, but separate > library (GhcPrim) which is bundled together with Base only due to > technical limitations Just for interest's sake: Of those 106 modules and 16621 LoC left in base, 33 modules and 8555 LoC are in GHC.* (so about 1/3 and 1/2). The hugs and nhc-specific modules are in other packages. There's a lot of implementation-specific stuff ifdef'ed in the "shared" modules too. 
Thanks Ian From ndmitchell at gmail.com Thu Aug 28 08:07:20 2008 From: ndmitchell at gmail.com (Neil Mitchell) Date: Thu Aug 28 08:05:49 2008 Subject: The base library and GHC 6.10 In-Reply-To: <1644730278.20080828154240@gmail.com> References: <20080828111232.GA15998@matrix.chaos.earth.li> <1644730278.20080828154240@gmail.com> Message-ID: <404396ef0808280507ha1dcd48h2d61b8e6d5f7ffab@mail.gmail.com> Hi >> Again, this is a false separation, with 698 LoC left behind in GHC.Conc > > i propose to consider ghc.* as not the part of Base, but separate > library (GhcPrim) which is bundled together with Base only due to > technical limitations This is in fact exactly what Hoogle 4 does, compare: http://haskell.org/hoogle/?hoogle=map+%2Bghc http://haskell.org/hoogle/?hoogle=map Thanks Neil From simonpj at microsoft.com Thu Aug 28 08:19:30 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Thu Aug 28 08:17:59 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080828115526.GA28866@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> <52f14b210808280425s257405f4w9b9bd3878fafdee1@mail.gmail.com> <20080828115526.GA28866@matrix.chaos.earth.li> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE8697F17@EA-EXMSG-C334.europe.corp.microsoft.com> | > Is there a way not to have the Data class in base while still preserving the | > deriving mechanism? | | Yes, you can still have "deriving Data" if the class is in the syb | package. If you were to change the methods in Data, the deriving stuff would have to change too. That is true but I agree with Neil: Data and Typeable are the basic foundation on which we may build a variety of reflection/introspection libraries, SYB among them. 
From simonpj at microsoft.com Thu Aug 28 08:20:58 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Thu Aug 28 08:19:26 2008 Subject: The base library and GHC 6.10 References: <20080828111232.GA15998@matrix.chaos.earth.li> <52f14b210808280425s257405f4w9b9bd3878fafdee1@mail.gmail.com> <20080828115526.GA28866@matrix.chaos.earth.li> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE8697F1B@EA-EXMSG-C334.europe.corp.microsoft.com> [Darn: somehow sent too early] | | > Is there a way not to have the Data class in base while still preserving the | | > deriving mechanism? | | | | Yes, you can still have "deriving Data" if the class is in the syb | | package. True, but if you were to change the methods in Data, the deriving stuff would have to change too. So putting Data in SYB would make it appear more separate than it truthfully is. Overall, I agree with Neil: Data and Typeable are the basic foundation on which we may build a variety of reflection/introspection libraries, SYB among them. Let's leave 'em in base. Simon From simonpj at microsoft.com Thu Aug 28 09:02:29 2008 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Thu Aug 28 09:01:00 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080828111232.GA15998@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> Message-ID: <638ABD0A29C8884A91BC5FB5C349B1C32AE8697F78@EA-EXMSG-C334.europe.corp.microsoft.com> | We're trying to decide what to do with the base library for GHC 6.10, in | terms of how much of it should be broken up into separate packages. | Since the recent proposal about this, we may be rethinking what we want | to do, and we would welcome your opinions. Thanks Ian. I found it helpful to number off the advantages and disadvantages so they are easy to refer to, so I enclose a slightly text-processed version of your message below. 
My thoughts * I find (D2), (D3), and (D4) -- see below -- quite strong reasons for maintaining the status quo * While (A1)-(A3) are advantages, I'm not sure they are powerful enough to want to disturb the status quo in the *short term* (ie before 6.10). * The exception is SYB, for which we have a willing and active maintainer, so (A1) is very strong. That isn't the case for any other package. So my suggestion would be: * for 6.10: split out SYB and nothing else * later: maybe more, let's see Simon =================== Text-processed version of Ian's message =================== We're trying to decide what to do with the base library for GHC 6.10. Specifically we want to work out how much of the current package "base" should be split into separate packages. Since the recent proposal about this (http://hackage.haskell.org/trac/ghc/ticket/1338), we may be rethinking what we want to do, and we would welcome your opinions. Motivation: why split up "base"? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A1. It becomes possible to separately upgrade the parts, and makes it easier for different people to maintain different parts. A2. It makes it easier to see what the hierarchy is, and to restructure the hierarchy, and work towards more of the code being shared between different Haskell implementations. Plus it means that people can't re-tangle the logically separate components, which is all too easy to do when you just have one huge package. A3. It also means that packages are clearer about what they depend on. One possibility, which would be really cool, is to separate all the IO modules from the non-IO modules; between that and looking at the extensions used (e.g. TH and FFI) it would then be clear whether or not a library could do any IO. Of course, the Prelude is a hurdle for this goal. 
Also, GHC's (still in flux) plan for the base library: http://hackage.haskell.org/trac/ghc/wiki/DarcsConversion#Planforlibraries essentially means forking base (as nhc98 would continue to use base in a darcs repo, while GHC would use it from a git repo, and there are no plans for any merging between these repos). Therefore any code that is to be shared between the implementations needs to not be in base, so from that point of view it would be good to pull out as much as possible. Why *not* split up "base"? ~~~~~~~~~~~~~~~~~~~~~~~~~~ D1. Splitting up base imposes costs on others. Specifically, the dependencies of packages need to be updated to reflect the changes. However, GHC 6.10 will come with a base version 3, as well as the new base version 4, so the transition should be much smoother than the base 2 -> base 3 transition. D2. It would be bad to make a change, and then make *another* change to the same thing. So anywhere there is doubt we should leave things unchanged. D3. Several people expressed reservations about a proliferation of packages containing only one module, or only a little code (less than 500 lines, say). D4. Splitting out a package whose *implementation* depends in an intimate way on "base" is a bit of a false separation. At one extreme a new package could simply re-export a bunch of types and functions from "base". If this is the case, none of A1-A3 hold. What I propose ~~~~~~~~~~~~~~ (In the below, LoC stands for "Lines of Code".) ----- SYB: generic programming ------- First the easy bit: The Data.Generics hierarchy is going to have a separate maintainer, and I think that everyone is agreed that it should be pulled out into an "syb package". I'll treat this as not part of base from here on. The only thing still being debated here is whether the Data class itself should remain in base or not. 
Some people believe that it should remain in base, as it is desirable to have Data instances for as many types as possible, and because there is a resistance among library writers against adding dependencies. The counter argument is that there are many other classes that the same is true of (e.g. uniplate, syb-with-class, binary), and it does not scale to put all of these classes into base. Also, by requiring a dep to be added even for the classes that have historically been included in base, adding dependencies for the sake of providing instances may become more socially acceptable. ----- GetOpt ------------ System.Console.GetOpt (129 LoC, 1 module) This doesn't really fit in with anything else in base, so I propose to split it off into its own getopt package. I don't think there is much objection to this one. [SLPJ: I am unconvinced.] ----- ST ---------------- Control.Monad.ST Data.STRef (120 LoC, 6 modules) hierarchies. I propose that we put these into an "st" package. The low-level implementation is still in base (69 LoC of in the GHC.ST and GHC.STRef), so to some extent this is a false separation (D4). On the other hand, nhc98 doesn't support ST, so splitting this package off gets us closer to all implementations exposing the same modules from base. ------ Concurrent -------- Control.Concurrent hierarchy (490 LoC, 6 modules) and System.Timeout (39 LoC) Data.Unique (32 LoC) (those latter modules depend on Control.Concurrent.*). I propose that we put these into "concurrent", "timeout" and "unique" packages respectively. Again, this is a false separation, with 698 LoC left behind in GHC.Conc; at some time we'd hope that this could either be moved down to ghc-prim, or make a new ghc-concurrent package for it, depending on how the dependencies work out. Again, nhc doesn't support concurrent or its dependencies, so this gets us closer to a consistent base interface. [SLPJ: I don't think we should split out concurrent yet. 
I'm pretty certain that we should not generate tiny new packages for "timeout" and "unique".] ------ Summary ------- Splitting off the above 5 packages would leave 106 modules and 16,621 LoC in base. About 5% of the LoC, and 12.5% of the modules, would be in the new packages. [SLPJ: the fact that the change is so small makes me think that A2, A3 are not being helpful. I think there is only a strong case for SYB, because of A1.] From kili at outback.escape.de Thu Aug 28 15:50:19 2008 From: kili at outback.escape.de (kili@outback.escape.de) Date: Thu Aug 28 15:53:36 2008 Subject: darcs patch: Unbreak the GHC build with older versions of gcc Message-ID: <200808281950.m7SJoJnR022852@petunia.outback.escape.de> Thu Aug 28 16:27:43 CEST 2008 kili@outback.escape.de * Unbreak the GHC build with older versions of gcc Stg.h must be included before HsBase.h, because the latter contains function definitions causing older versions of gcc (3.3.5 in my case) to bail out with "error: global register variable follows a function definition" on Regs.h, which is included by Stg.h. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/x-darcs-patch Size: 19744 bytes Desc: A darcs patch for your repository! Url : http://www.haskell.org/pipermail/libraries/attachments/20080828/889be082/attachment.bin From ahey at iee.org Fri Aug 29 02:07:10 2008 From: ahey at iee.org (Adrian Hey) Date: Fri Aug 29 02:05:43 2008 Subject: Performance horrors In-Reply-To: References: <48B3B305.2040907@iee.org> <20080826230038.GD15616@sliver.repetae.net> <48B4A02A.8080601@iee.org> Message-ID: <48B7920E.3020209@iee.org> David House wrote: > 2008/8/27 Adrian Hey : >> As does the O(n*(log n)) AVL based nub I just wrote. > > Care to share? Picking up on this thread again..:-) I put an updated AvlTree package in hackage. A bit premature, but what the heck. There's no test in the test suite for it but it seemed to work fine with a few tests in ghci. 
> WDYT about using RULES to rewrite to nubOrd if an Ord context is > available, as John Meacham mentioned? Probably won't work with mine as I called it nub :-) Other than that it seems like a reasonable idea. Regards -- Adrian Hey From ahey at iee.org Fri Aug 29 02:14:37 2008 From: ahey at iee.org (Adrian Hey) Date: Fri Aug 29 02:13:09 2008 Subject: Performance horrors In-Reply-To: <20080827225121.GA9532@soi.city.ac.uk> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> <20080827225121.GA9532@soi.city.ac.uk> Message-ID: <48B793CD.9050108@iee.org> Ross Paterson wrote: > On Tue, Aug 26, 2008 at 06:32:40PM +0100, Adrian Hey wrote: >> 2- Data.Set is not obviously the best underlying implementation (in >> fact it is obviously not the best underlying implementation, this and >> Data.Map etc really should be pensioned off to hackage along with the >> rest of the badly documented, unreliable, inefficient and unstable >> junk :-) > > Do you have benchmarks to quantify how bad Data.Set and Data.Map are? I was just trying to "yank Neils chain" a bit re. something he said a while ago. I don't think they're that bad really as long as you don't use union, intersection etc..(hedge algorithm seems bad). No, I don't have any recent benchmarks but I posted old ones a long time ago if you can find them. 
I can't :-) Regards -- Adrian Hey From dons at galois.com Fri Aug 29 02:19:36 2008 From: dons at galois.com (Don Stewart) Date: Fri Aug 29 02:18:02 2008 Subject: Performance horrors In-Reply-To: <48B793CD.9050108@iee.org> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> <20080827225121.GA9532@soi.city.ac.uk> <48B793CD.9050108@iee.org> Message-ID: <20080829061936.GB15559@scytale.galois.com> ahey: > Ross Paterson wrote: > >On Tue, Aug 26, 2008 at 06:32:40PM +0100, Adrian Hey wrote: > >>2- Data.Set is not obviously the best underlying implementation (in > >>fact it is obviously not the best underlying implementation, this and > >>Data.Map etc really should be pensioned off to hackage along with the > >>rest of the badly documented, unreliable, inefficient and unstable > >>junk :-) > > > >Do you have benchmarks to quantify how bad Data.Set and Data.Map are? > > I was just trying to "yank Neils chain" a bit re. something he said > a while ago. I don't think they're that bad really as long as you > don't use union, intersection etc..(hedge algorithm seems bad). > > No, I don't have any recent benchmarks but I posted old ones a long > time ago if you can find them. I can't :-) Given all the different experiments over the last couple of years, could you perhaps summarise your thoughts on where we should be looking for the next great Data.Map? 
-- Don From ahey at iee.org Fri Aug 29 02:24:31 2008 From: ahey at iee.org (Adrian Hey) Date: Fri Aug 29 02:23:01 2008 Subject: Performance horrors In-Reply-To: <20080827225325.GM7196@scytale.galois.com> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> <20080827225121.GA9532@soi.city.ac.uk> <20080827225325.GM7196@scytale.galois.com> Message-ID: <48B7961F.1030803@iee.org> Don Stewart wrote: > ross: >> On Tue, Aug 26, 2008 at 06:32:40PM +0100, Adrian Hey wrote: >>> 2- Data.Set is not obviously the best underlying implementation (in >>> fact it is obviously not the best underlying implementation, this and >>> Data.Map etc really should be pensioned off to hackage along with the >>> rest of the badly documented, unreliable, inefficient and unstable >>> junk :-) >> Do you have benchmarks to quantify how bad Data.Set and Data.Map are? > > Yes, a clear replacement for just Data.Map , with numbers, even if > prototyped, would be rather compelling. And not bogged down in too much > other code. > > Numbers please, or at least a direction to work in! Not sure what your asking for. I have Data.Map & Data.Set clones but decided not to publish them as I think folk should be migrating to Jamie Brandons GMap API (hopefully first release of that will be in Hackage soon). Jamie is also doing some benchmarking of various implementations of that API (one of them being AVL trees) and Data.Map. 
Regards -- Adrian Hey From dons at galois.com Fri Aug 29 02:30:49 2008 From: dons at galois.com (Don Stewart) Date: Fri Aug 29 02:29:12 2008 Subject: Performance horrors In-Reply-To: <48B7961F.1030803@iee.org> References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <48B43E38.8020808@iee.org> <20080827225121.GA9532@soi.city.ac.uk> <20080827225325.GM7196@scytale.galois.com> <48B7961F.1030803@iee.org> Message-ID: <20080829063049.GC15559@scytale.galois.com> ahey: > Don Stewart wrote: > >ross: > >>On Tue, Aug 26, 2008 at 06:32:40PM +0100, Adrian Hey wrote: > >>>2- Data.Set is not obviously the best underlying implementation (in > >>>fact it is obviously not the best underlying implementation, this and > >>>Data.Map etc really should be pensioned off to hackage along with the > >>>rest of the badly documented, unreliable, inefficient and unstable > >>>junk :-) > >>Do you have benchmarks to quantify how bad Data.Set and Data.Map are? > > > >Yes, a clear replacement for just Data.Map , with numbers, even if > >prototyped, would be rather compelling. And not bogged down in too much > >other code. > > > >Numbers please, or at least a direction to work in! > > Not sure what your asking for. I have Data.Map & Data.Set clones but > decided not to publish them as I think folk should be migrating to > Jamie Brandons GMap API (hopefully first release of that will be in > Hackage soon). Jamie is also doing some benchmarking of various > implementations of that API (one of them being AVL trees) and Data.Map. Ok. That's good to know. Benchmark numbers would be great to see! 
-- Don From cgibbard at gmail.com Fri Aug 29 02:56:09 2008 From: cgibbard at gmail.com (Cale Gibbard) Date: Fri Aug 29 02:54:35 2008 Subject: Performance horrors In-Reply-To: <48B3B305.2040907@iee.org> References: <48B3B305.2040907@iee.org> Message-ID: <89ca3d1f0808282356s2df91169q26cda608f715599f@mail.gmail.com> 2008/8/26 Adrian Hey : > Seeing as practically all Eq instances are also Ord instances, at > the very least we could have O(n*(log n)) definitions for .. > > nub :: Ord a => [a] -> [a] > nub = nubBy compare > > nubBy :: (a -> a -> Ordering) -> [a] -> [a] > nubBy cmp xs ys = -- something using an AVL tree perhaps. While I agree that it would be handy to have Ord-specific versions of these, I'd just like to reiterate/expand on something which Neil mentioned: map head . group . sort has the additional effect of: 1) Demanding the entire input list before it can produce even a single element, and 2) Sorting the result rather than keeping things in the order they first occurred in the input. A correct implementation of nub which made use of Ord would maintain (say) a Data.Set of elements already seen, as it traversed the list lazily, producing elements in the output as soon as new elements were seen in the input, and no later. This of course guarantees that you return them in the order that they're first seen as well. This is still O(n log n), but reduces correctly to O(k log k) when only k elements of the input are needed to get the desired number of elements of the resulting list. Please don't make nub any stricter! On a side note related to the request for inclusion of complexities, since Haskell is a lazy language, we really ought to have complexities written in terms of the demanded portion of the result. 
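For readers who want to see the lazy, order-preserving nub described above in concrete form, here is a minimal sketch. It assumes Data.Set from the containers package as the set of seen elements; the name nubOrd is borrowed from elsewhere in this thread and is not an existing library function at the time of this discussion.

```haskell
import qualified Data.Set as Set

-- | Order-preserving nub for types with an Ord instance.
-- Hypothetical name; a sketch of the idea, not a library definition.
-- Each element is emitted as soon as it is first seen, so only the
-- demanded prefix of the input is ever inspected.
nubOrd :: Ord a => [a] -> [a]
nubOrd = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs                     -- duplicate: skip it
      | otherwise           = x : go (Set.insert x seen) xs  -- new: emit, then remember
```

Because the output is produced incrementally, take 3 (nubOrd (cycle [3,1,3,2])) terminates and gives [3,1,2], whereas map head . group . sort would never return on an infinite input and would reorder a finite one.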
Of course, Data.Set and Data.Map are (structure) strict, so it doesn't affect them so much, but it would certainly be nice for the functions in Data.List to know the answer to "if the input is size n and I only demand k of the elements of the result, then what is the complexity?", especially for things like sort, where a lazy implementation can, for instance, make the head of the list available in just O(n) time. - Cale From duncan.coutts at worc.ox.ac.uk Fri Aug 29 09:49:11 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Fri Aug 29 10:30:50 2008 Subject: Performance horrors In-Reply-To: <89ca3d1f0808282356s2df91169q26cda608f715599f@mail.gmail.com> References: <48B3B305.2040907@iee.org> <89ca3d1f0808282356s2df91169q26cda608f715599f@mail.gmail.com> Message-ID: <1220017751.24846.262.camel@localhost> On Fri, 2008-08-29 at 02:56 -0400, Cale Gibbard wrote: > On a side note related to the request for inclusion of complexities, > since Haskell is a lazy language, we really ought to have complexities > written in terms of the demanded portion of the result. Of course, > Data.Set and Data.Map are (structure) strict, so it doesn't affect > them so much, but it would certainly be nice for the functions in > Data.List to know the answer to "if the input is size n and I only > demand k of the elements of the result, then what is the complexity?", > especially for things like sort, where a lazy implementation can, for > instance, make the head of the list available in just O(n) time. Yeah, that's fairly important, though it's quite subtle. For example we'd probably say if we demand k elements of the result of map then it costs proportional to k (though we're not accounting for the cost of the function application to each element), but technically that's true for cons or tail too, even though they only do O(1) work and do not change the 'potential'/'cost' of the remainder of the list. 
Probably we can gloss over the details for a simple indication in the docs, but just for fun here's a spanner: foldr (\x xs -> x `seq` (x:xs)) [] it's only O(1) to evaluate to WHNF. It adds O(n) work to the whole structure, but that property doesn't distinguish it from foldr (\x xs -> (x:xs)) [] the difference of course is that it identifies costs within the structure. So when a function demands the spine it also has to pay the cost of the elements up front. Duncan From wnoise at ofb.net Sat Aug 30 05:12:50 2008 From: wnoise at ofb.net (Aaron Denney) Date: Sat Aug 30 05:11:30 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080827005741.GK15616@sliver.repetae.net> <117f2cc80808270954w62311818h6ad60b2422aafc44@mail.gmail.com> Message-ID: On 2008-08-27, David Roundy wrote: > On Wed, Aug 27, 2008 at 3:19 AM, Henning Thielemann > wrote: >> It's very confusing for readers of your programs, if you use 'drop 1' >> instead of 'tail'. The names 'drop' and 'tail' don't give the reader a hint, >> that 'drop' works for empty lists and 'tail' doesn't. 'drop 1' and 'tail' >> should behave identically for empty lists and a function with different >> behaviour should have a different name. > > Personally, I'd prefer to see tail dropped from the Prelude (not any > time soon, of course). The fewer incomplete functions are in the > Prelude, the better, and drop has very nice and easy-to-understand > behavior. Folks who want to crash on empty lists should write their > own or use pattern matching. Ditto. It's one of the more noticeable and easily avoidable sources of crashes in my code. I'm not being serious, but of course, it could be defined using fail, which defaults to [] in the List monad... 
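To make the 'drop 1' versus 'tail' distinction discussed above concrete, here is a small sketch of two total alternatives to tail. The names safeTail and tailMay are hypothetical, chosen for this illustration only; neither is proposed in the thread.

```haskell
-- | Total replacement for 'tail': an empty list simply stays empty.
-- Hypothetical name; this is just 'drop 1' given a descriptive label.
safeTail :: [a] -> [a]
safeTail = drop 1

-- | Variant that reports emptiness in the result type instead of
-- crashing, for callers who need to tell the two cases apart.
tailMay :: [a] -> Maybe [a]
tailMay []     = Nothing
tailMay (_:xs) = Just xs
```

On a non-empty list both agree with tail; on the empty list safeTail returns [] and tailMay returns Nothing, where tail would raise a runtime error.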
-- Aaron Denney -><- From wnoise at ofb.net Sat Aug 30 07:23:52 2008 From: wnoise at ofb.net (Aaron Denney) Date: Sat Aug 30 07:22:31 2008 Subject: Historical question about type of Data.List.group References: Message-ID: On 2008-08-26, Jim Apple wrote: > Why isn't the type of group > > Eq a => [a] -> [(a,[a])] > > That matches more exactly what group does, and it's easy to see that > functions like > > nubOrd = map fst . group . sort > > are clearly safe, whereas > > map head . group . sort > > is not. I think the big thing is that while this is safer type, it's also much harder to use. Knowing I have a non-empty list is certainly useful information that can be reasoned about at the type level, but if I can't pass them easily to normal list functions, it's much less useful. Perhaps judicious use of typeclasses would make this easier, but I don't think those were in the original either. In any case the design space for a standard prelude that had the normal list type and a non-empty list type united in a sequence class is rather large, particularly if it's an open class that new sequence-like data types are expected to be an instance of. -- Aaron Denney -><- From lennart at augustsson.net Sat Aug 30 07:38:39 2008 From: lennart at augustsson.net (Lennart Augustsson) Date: Sat Aug 30 07:37:03 2008 Subject: Historical question about type of Data.List.group In-Reply-To: References: Message-ID: The current type is easier to use in general. And it was also what I needed back in the days when I added the function to the LML libraries. -- Lennart On Tue, Aug 26, 2008 at 8:45 PM, Jim Apple wrote: > Why isn't the type of group > > Eq a => [a] -> [(a,[a])] > > That matches more exactly what group does, and it's easy to see that > functions like > > nubOrd = map fst . group . sort > > are clearly safe, whereas > > map head . group . sort > > is not. 
> > Jim > _______________________________________________ > Libraries mailing list > Libraries@haskell.org > http://www.haskell.org/mailman/listinfo/libraries > From wnoise at ofb.net Sat Aug 30 07:59:00 2008 From: wnoise at ofb.net (Aaron Denney) Date: Sat Aug 30 07:57:33 2008 Subject: Performance horrors References: <48B3B305.2040907@iee.org> <404396ef0808260326v6eaf9e40s8ed25295a455ae2a@mail.gmail.com> <125EACD0CAE4D24ABDB4D148C4593DA9049E9541@GBLONXMB02.corp.amvescap.net> Message-ID: On 2008-08-26, Bayley, Alistair wrote: > The name is... well, pessimal might be a bit strong, but few programmers > would think to look for something called "nub". Personally, when I first > looked for it I expected uniq or unique (because that's what the unix > utility that does the same thing is called). Distinct (from SQL) is > another name that occurred to me, but never nub... it's not even a > synonym for unique: http://thesaurus.reference.com/browse/unique Right. It helps a bit to remember it if you think of it as acting on the list, rather than the elements -- it finds the heart of a list, as removing duplicate points in an argument reduces it to its core. -- Aaron Denney -><- From igloo at earth.li Sat Aug 30 08:01:34 2008 From: igloo at earth.li (Ian Lynagh) Date: Sat Aug 30 07:59:58 2008 Subject: The base library and GHC 6.10 In-Reply-To: <404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com> References: <20080828111232.GA15998@matrix.chaos.earth.li> <404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com> Message-ID: <20080830120134.GA3741@matrix.chaos.earth.li> On Thu, Aug 28, 2008 at 12:28:18PM +0100, Neil Mitchell wrote: > > > The only thing still being debated here is whether the Data class itself > > should remain in base or not. Some people believe that it should remain > > in base, as it is desirable to have Data instances for as many types as > > possible, and because there is a resistance among library writers > > against adding dependencies. 
The counter argument is that there are many > > other classes that the same is true of (e.g. uniplate, syb-with-class, > > binary), and it does not scale to put all of these classes into base. > > My opinion is that Data should remain in base. OK; so I guess that means the whole Data.Generics.Basics module should stay in base. Thanks Ian From wnoise at ofb.net Sat Aug 30 08:07:33 2008 From: wnoise at ofb.net (Aaron Denney) Date: Sat Aug 30 08:06:09 2008 Subject: 2533: Generic functions that take integral arguments should work the same way as their prelude counterparts References: <404396ef0808220346i5102defarb2e7a7f93408ba98@mail.gmail.com> <20080823173355.14895.qmail@schroeder.cas.mcmaster.ca> <20080823194602.GA11579@craft> <4509E002-69EF-4F0B-839E-0CF08ABBF18A@ece.cmu.edu> <20080824050658.GB28982@craft> <638ABD0A29C8884A91BC5FB5C349B1C32AE85E0A0D@EA-EXMSG-C334.europe.corp.microsoft.com> Message-ID: On 2008-08-26, Simon Peyton-Jones wrote: >| >> I've actually long wondered about this: why don't more functions use >| >> Nat where it'd make sense? It can't be because Nat is hard to define - >| >> I'd swear I've seen many definitions of Nat (if not dozens when you >| >> count all the type-level exercises which include one). >| > >| > Because naive definitions are dog-slow and fast definitions are anything >| > but easy to use? > > I doubt that even GHC is going to optimise > data Nat = Z | S Nat > into > data Nat = N Int# > (with appropriate checks) anytime soon. > > I think the main reason that the latter (which can easily be > implemented as a library) is not more widely used is that it's > tiresomely incompatible with functions that produce Ints. (Or Integers, or Integral a => a). Definitely. > Also perhaps if length :: [a] -> Nat, then computing the difference > between two lengths (length xs - length ys) could produce a runtime > error. It's not so crazy to think that the right thing to do for the subtraction of naturals overflowing is return 0. 
It fits in with the behaviour of drop and take, and arguably zip. There are natural near-homomorphisms in place between lists and numbers, and taking negatives as 0 seems to improve the matching a bit. -- Aaron Denney -><- From ross at soi.city.ac.uk Sun Aug 31 08:10:56 2008 From: ross at soi.city.ac.uk (Ross Paterson) Date: Sun Aug 31 08:09:17 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080830120134.GA3741@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> <404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com> <20080830120134.GA3741@matrix.chaos.earth.li> Message-ID: <20080831121055.GA5326@soi.city.ac.uk> On Sat, Aug 30, 2008 at 01:01:34PM +0100, Ian Lynagh wrote: > On Thu, Aug 28, 2008 at 12:28:18PM +0100, Neil Mitchell wrote: > > My opinion is that Data should remain in base. > > OK; so I guess that means the whole Data.Generics.Basics module should > stay in base. Should Data.Generics.Instances (an orphanage) be folded into Data.Generics.Basics? From igloo at earth.li Sun Aug 31 08:16:33 2008 From: igloo at earth.li (Ian Lynagh) Date: Sun Aug 31 08:14:53 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080831121055.GA5326@soi.city.ac.uk> References: <20080828111232.GA15998@matrix.chaos.earth.li> <404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com> <20080830120134.GA3741@matrix.chaos.earth.li> <20080831121055.GA5326@soi.city.ac.uk> Message-ID: <20080831121633.GA17919@matrix.chaos.earth.li> On Sun, Aug 31, 2008 at 01:10:56PM +0100, Ross Paterson wrote: > On Sat, Aug 30, 2008 at 01:01:34PM +0100, Ian Lynagh wrote: > > On Thu, Aug 28, 2008 at 12:28:18PM +0100, Neil Mitchell wrote: > > > My opinion is that Data should remain in base. > > > > OK; so I guess that means the whole Data.Generics.Basics module should > > stay in base. > > Should Data.Generics.Instances (an orphanage) be folded into > Data.Generics.Basics? Sounds good to me.
Thanks Ian From ross at soi.city.ac.uk Sun Aug 31 08:25:53 2008 From: ross at soi.city.ac.uk (Ross Paterson) Date: Sun Aug 31 08:24:14 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080831121633.GA17919@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> <404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com> <20080830120134.GA3741@matrix.chaos.earth.li> <20080831121055.GA5326@soi.city.ac.uk> <20080831121633.GA17919@matrix.chaos.earth.li> Message-ID: <20080831122553.GA5372@soi.city.ac.uk> On Sun, Aug 31, 2008 at 01:16:33PM +0100, Ian Lynagh wrote: > On Sun, Aug 31, 2008 at 01:10:56PM +0100, Ross Paterson wrote: > > On Sat, Aug 30, 2008 at 01:01:34PM +0100, Ian Lynagh wrote: > > > OK; so I guess that means the whole Data.Generics.Basics module should > > > stay in base. > > > > Should Data.Generics.Instances (an orphanage) be folded into > > Data.Generics.Basics? > > Sounds good to me. The name Data.Generics.Basics identifies it as the basic part of the generics library. If it's to be presented as a general class, perhaps the module should be renamed (with re-exports under the old names in syb). Data.Data? 
From igloo at earth.li Sun Aug 31 10:13:50 2008 From: igloo at earth.li (Ian Lynagh) Date: Sun Aug 31 10:12:10 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080831122553.GA5372@soi.city.ac.uk> References: <20080828111232.GA15998@matrix.chaos.earth.li> <404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com> <20080830120134.GA3741@matrix.chaos.earth.li> <20080831121055.GA5326@soi.city.ac.uk> <20080831121633.GA17919@matrix.chaos.earth.li> <20080831122553.GA5372@soi.city.ac.uk> Message-ID: <20080831141350.GA21092@matrix.chaos.earth.li> On Sun, Aug 31, 2008 at 01:25:53PM +0100, Ross Paterson wrote: > On Sun, Aug 31, 2008 at 01:16:33PM +0100, Ian Lynagh wrote: > > On Sun, Aug 31, 2008 at 01:10:56PM +0100, Ross Paterson wrote: > > > On Sat, Aug 30, 2008 at 01:01:34PM +0100, Ian Lynagh wrote: > > > > OK; so I guess that means the whole Data.Generics.Basics module should > > > > stay in base. > > > > > > Should Data.Generics.Instances (an orphanage) be folded into > > > Data.Generics.Basics? > > > > Sounds good to me. > > The name Data.Generics.Basics identifies it as the basic part of the > generics library. If it's to be presented as a general class, perhaps > the module should be renamed (with re-exports under the old names in syb). > Data.Data? If the old names are in syb then existing libraries need to change either their dependencies or their imports. We could put the old names in base, but deprecate them? 
Thanks Ian From bulat.ziganshin at gmail.com Sun Aug 31 10:55:36 2008 From: bulat.ziganshin at gmail.com (Bulat Ziganshin) Date: Sun Aug 31 11:02:15 2008 Subject: The base library and GHC 6.10 In-Reply-To: <20080831141350.GA21092@matrix.chaos.earth.li> References: <20080828111232.GA15998@matrix.chaos.earth.li> <404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com> <20080830120134.GA3741@matrix.chaos.earth.li> <20080831121055.GA5326@soi.city.ac.uk> <20080831121633.GA17919@matrix.chaos.earth.li> <20080831122553.GA5372@soi.city.ac.uk> <20080831141350.GA21092@matrix.chaos.earth.li> Message-ID: <371698682.20080831185536@gmail.com> Hello Ian, Sunday, August 31, 2008, 6:13:50 PM, you wrote: >> The name Data.Generics.Basics identifies it as the basic part of the >> generics library. If it's to be presented as a general class, perhaps >> the module should be renamed (with re-exports under the old names in syb). >> Data.Data? > If the old names are in syb then existing libraries need to change > either their dependencies or their imports. > We could put the old names in base, but deprecate them? i think that correct solution for all such cases is to provide compatibility reexport only in old base library. programs written with new base in mind should also accommodate changes in module names -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com From patperry at stanford.edu Sun Aug 31 13:28:32 2008 From: patperry at stanford.edu (Patrick Perry) Date: Sun Aug 31 13:26:58 2008 Subject: Bugfix for QuickCheck 1.1.0.0 In-Reply-To: <4E804BD8-22DB-4EDF-AC29-900908603650@stanford.edu> References: <4E804BD8-22DB-4EDF-AC29-900908603650@stanford.edu> Message-ID: <62E8F614-D0C2-46BA-AE22-FFE642C88585@stanford.edu> It's been a week and no one has objected. I'm interpreting your silence on the matter as consent. Could someone with commit access please add the patch and bump the version number for QuickCheck? If you haven't read the ticket, there is a bug in Quickcheck. 
Currently, this property: prop_f (f :: Double -> Int) = let x = f (-3.4) in x >= 0 || x < 0 will cause QuickCheck to hang. The attached patch fixes the problem. It changes this: variant :: Int -> Gen a -> Gen a variant v (Gen m) = Gen (\n r -> m n (rands r !! (v+1))) where rands r0 = r1 : rands r2 where (r1, r2) = split r0 to this: variant :: Int -> Gen a -> Gen a variant v (Gen m) = Gen (\n r -> m n (rands r v)) where rands r0 0 = r0 rands r0 n = let (r1,r2) = split r0 (n',s) = n `quotRem` 2 in case s of 0 -> rands r1 n' _ -> rands r2 n' Thanks in advance to whoever adds the fix. Patrick On Aug 21, 2008, at 9:20 PM, Patrick Perry wrote: > Hi everyone, > > I've put in a proposal that fixes a bug in QuickCheck 1.1.0.0. > > http://hackage.haskell.org/trac/ghc/ticket/2535 > > Thanks, > > > Patrick > > _______________________________________________ > Libraries mailing list > Libraries@haskell.org > http://www.haskell.org/mailman/listinfo/libraries From johan.tibell at gmail.com Sun Aug 31 13:53:11 2008 From: johan.tibell at gmail.com (Johan Tibell) Date: Sun Aug 31 13:51:29 2008 Subject: Bugfix for QuickCheck 1.1.0.0 In-Reply-To: <62E8F614-D0C2-46BA-AE22-FFE642C88585@stanford.edu> References: <4E804BD8-22DB-4EDF-AC29-900908603650@stanford.edu> <62E8F614-D0C2-46BA-AE22-FFE642C88585@stanford.edu> Message-ID: <90889fe70808311053h4e040b1k55c9e173822399df@mail.gmail.com> On Sun, Aug 31, 2008 at 10:28 AM, Patrick Perry wrote: > It's been a week and no one has objected. I'm interpreting your silence on > the matter as consent. Could someone with commit access please add the > patch and bump the version number for QuickCheck? > > If you haven't read the ticket, there is a bug in Quickcheck. Currently, > this property: > > prop_f (f :: Double -> Int) = let > x = f (-3.4) > in x >= 0 || x < 0 > > will cause QuickCheck to hang. The attached patch fixes the problem. It > changes this: > > variant :: Int -> Gen a -> Gen a > variant v (Gen m) = Gen (\n r -> m n (rands r !!
(v+1)) > where > rands r0 = r1 : rands r2 where (r1, r2) = split r0 > > to this: > > variant :: Int -> Gen a -> Gen a > variant v (Gen m) = Gen (\n r -> m n (rands r v)) > where > rands r0 0 = r0 > rands r0 n = let (r1,r2) = split r0 > (n',s) = n `quotRem` 2 > in case s of > 0 -> rands r1 n' > _ -> rands r2 n' Could you explain how this works? It's not entirely clear to me. Thanks. -- Johan From claus.reinke at talk21.com Sun Aug 31 14:11:11 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Sun Aug 31 14:09:50 2008 Subject: The base library and GHC 6.10 References: <20080828111232.GA15998@matrix.chaos.earth.li><404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com><20080830120134.GA3741@matrix.chaos.earth.li><20080831121055.GA5326@soi.city.ac.uk> <20080831121633.GA17919@matrix.chaos.earth.li> Message-ID: "Ian Lynagh" wrote in message news:20080831121633.GA17919@matrix.chaos.earth.li... > On Sun, Aug 31, 2008 at 01:10:56PM +0100, Ross Paterson wrote: >> On Sat, Aug 30, 2008 at 01:01:34PM +0100, Ian Lynagh wrote: >> > On Thu, Aug 28, 2008 at 12:28:18PM +0100, Neil Mitchell wrote: >> > > My opinion is that Data should remain in base. >> > >> > OK; so I guess that means the whole Data.Generics.Basics module should >> > stay in base. >> >> Should Data.Generics.Instances (an orphanage) be folded into >> Data.Generics.Basics? > > Sounds good to me. I just stumbled into this thread by accident. No, the instances should not be folded into Basics. For one, Basics seems likely to remain in base, and half of the instances are under dispute (see previous SYB threads that were copied here). Splitting the instances into standard and dubious prior to deprecating the latter was one of the motivations for moving SYB out of base (apart from Basics and deriving) and for seeking a maintainer for syb. 
Since José Pedro Magalhães has offered to take on ownership of the syb package, it would be appropriate to cc him on any discussions related to this, so that he is aware of all developments and possible conflicts. Claus From patperry at stanford.edu Sun Aug 31 14:19:00 2008 From: patperry at stanford.edu (Patrick Perry) Date: Sun Aug 31 14:17:21 2008 Subject: Bugfix for QuickCheck 1.1.0.0 In-Reply-To: <90889fe70808311053h4e040b1k55c9e173822399df@mail.gmail.com> References: <4E804BD8-22DB-4EDF-AC29-900908603650@stanford.edu> <62E8F614-D0C2-46BA-AE22-FFE642C88585@stanford.edu> <90889fe70808311053h4e040b1k55c9e173822399df@mail.gmail.com> Message-ID: "variant" takes a generator and an integer, and produces a generator. Here is how it works in the new version: First, n gets written in binary: e.g. n = 10011 Then, we replace "1" with "fst . split". and we replace "0" with "snd . split". n' = l . r . r . l . l where l = fst . split r = snd . split Then, we apply the resulting function to the generator g' = n' g0 The run time is O (log n). In the old version, "split" gets called n times, so the run time is O(n). For certain double values, variant gets called with n on the order of 2^30, hence the hang in QuickCheck. The new version of variant employs the same algorithm as in QuickCheck2. Patrick On Aug 31, 2008, at 10:53 AM, Johan Tibell wrote: > On Sun, Aug 31, 2008 at 10:28 AM, Patrick Perry > wrote: >> It's been a week and no one has objected. I'm interpreting your >> silence on >> the matter as consent. Could someone with commit access please add >> the >> patch and bump the version number for QuickCheck? >> >> If you haven't read the ticket, there is a bug in Quickcheck. >> Currently, >> this property: >> >> prop_f (f :: Double -> Int) = let >> x = f (-3.4) >> in x >= 0 || x < 0 >> >> will cause QuickCheck to hang. The attached patch fixes the >> problem. It >> changes this: >> >> variant :: Int -> Gen a -> Gen a >> variant v (Gen m) = Gen (\n r -> m n (rands r !! 
(v+1)) >> where >> rands r0 = r1 : rands r2 where (r1, r2) = split r0 >> >> to this: >> >> variant :: Int -> Gen a -> Gen a >> variant v (Gen m) = Gen (\n r -> m n (rands r v)) >> where >> rands r0 0 = r0 >> rands r0 n = let (r1,r2) = split r0 >> (n',s) = n `quotRem` 2 >> in case s of >> 0 -> rands r1 n' >> _ -> rands r2 n' > > Could you explain how this works? It's not entirely clear to me. > > Thanks. > > -- Johan From johan.tibell at gmail.com Sun Aug 31 14:48:15 2008 From: johan.tibell at gmail.com (Johan Tibell) Date: Sun Aug 31 14:46:33 2008 Subject: Bugfix for QuickCheck 1.1.0.0 In-Reply-To: References: <4E804BD8-22DB-4EDF-AC29-900908603650@stanford.edu> <62E8F614-D0C2-46BA-AE22-FFE642C88585@stanford.edu> <90889fe70808311053h4e040b1k55c9e173822399df@mail.gmail.com> Message-ID: <90889fe70808311148i40f48f8bi1cd8e053e62df066@mail.gmail.com> On Sun, Aug 31, 2008 at 11:19 AM, Patrick Perry wrote: > "variant" takes a generator and an integer, and produces a generator. Here > is how it works in the new version: > > First, n gets written in binary: > > e.g. > > n = 10011 > > Then, we replace "1" with "fst . split". and we replace "0" with "snd . > split". > > n' = l . r . r . l . l where l = fst . split > r = snd . split > > Then, we apply the resulting function to the generator > > g' = n' g0 > > > The run time is O (log n). > > In the old version, "split" gets called n times, so the run time is O(n). > For certain double values, variant gets called with n on the order of 2^30, > hence the hang in QuickCheck. The new version of variant employs the same > algorithm as in QuickCheck2. Thanks for the explanation! 
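The bit-by-bit walk in the quoted explanation can be traced with a small sketch. It assumes a toy generator (Trace and splitT, both hypothetical names) in place of the real StdGen and split, so that the path taken through the splits is visible. selectLog follows the shape of the patched rands and selectLin the shape of the old list-indexing version; neither is the QuickCheck source itself. In the patch as quoted, a remainder of 0 takes the first half of the split and any other remainder takes the second.

```haskell
module Main where

-- Toy stand-in for a splittable random generator (an assumption for
-- illustration; the real code uses StdGen and split from the random library).
-- It only records which half of each split was taken: 'L' first, 'R' second.
newtype Trace = Trace String
  deriving (Eq, Show)

splitT :: Trace -> (Trace, Trace)
splitT (Trace s) = (Trace (s ++ "L"), Trace (s ++ "R"))

-- Same shape as the patched rands: consume v one binary digit at a time,
-- lowest digit first, so roughly log2 v splits are performed.
selectLog :: Trace -> Int -> Trace
selectLog g 0 = g
selectLog g v =
  let (g1, g2)  = splitT g
      (v', bit) = v `quotRem` 2
  in if bit == 0 then selectLog g1 v' else selectLog g2 v'

-- Same shape as the old version: index into an infinite list of generators,
-- which goes roughly v splits deep before an element is returned.
selectLin :: Trace -> Int -> Trace
selectLin g v = rands g !! (v + 1)
  where
    rands r0 = r1 : rands r2
      where (r1, r2) = splitT r0

-- Number of splits that produced a given trace.
splitsUsed :: Trace -> Int
splitsUsed (Trace s) = length s

main :: IO ()
main = do
  print (selectLog (Trace "") 19)               -- prints Trace "RRLLR" (19 is 10011 in binary)
  print (splitsUsed (selectLog (Trace "") 19))  -- prints 5
  print (splitsUsed (selectLin (Trace "") 19))  -- prints 21
```

For v on the order of 2^30 the linear shape would need about a billion splits while the logarithmic one needs about thirty, which is consistent with the hang described in the thread.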
Cheers, Johan From igloo at earth.li Sun Aug 31 15:19:54 2008 From: igloo at earth.li (Ian Lynagh) Date: Sun Aug 31 15:18:14 2008 Subject: darcs patch: Unbreak the GHC build with older versions of gcc In-Reply-To: <200808281950.m7SJoJnR022852@petunia.outback.escape.de> References: <200808281950.m7SJoJnR022852@petunia.outback.escape.de> Message-ID: <20080831191954.GA32203@matrix.chaos.earth.li> On Thu, Aug 28, 2008 at 09:50:19PM +0200, kili@outback.escape.de wrote: > Thu Aug 28 16:27:43 CEST 2008 kili@outback.escape.de > * Unbreak the GHC build with older versions of gcc > > Stg.h must be included before HsBase.h, because the latter contains > function definitions causing older versions of gcc (3.3.5 in my > case) to bail out with "error: global register variable follows a > function definition" on Regs.h, which is included by Stg.h. > > hunk ./cbits/PrelIOUtils.c 8 > -#include "HsBase.h" > hunk ./cbits/PrelIOUtils.c 9 > +#include "HsBase.h" This breaks the build for me: Building ghc-bin-6.9... [1 of 1] Compiling Main ( Main.hs, dist-stage2/build/ghc/ghc-tmp/Main.o ) Linking dist-stage2/build/ghc/ghc ... /home/ian/ghc/darcs/ghc/libraries/base/dist/build/libHSbase-4.0.a(Internals.o):(.text+0x24): undefined reference to `__hscore_s_issock' /home/ian/ghc/darcs/ghc/libraries/base/dist/build/libHSbase-4.0.a(Internals.o): In function `s3qd_info': (.text+0x2435): undefined reference to `__hscore_s_issock' /home/ian/ghc/darcs/ghc/libraries/base/dist/build/libHSbase-4.0.a(Internals.o): In function `s3qB_info': (.text+0x26c8): undefined reference to `__hscore_s_issock' /home/ian/ghc/darcs/ghc/libraries/base/dist/build/libHSbase-4.0.a(Internals.o): In function `s3uh_info': (.text+0x61e6): undefined reference to `__hscore_s_issock' collect2: ld returned 1 exit status I haven't looked at why yet. 
Thanks Ian From ross at soi.city.ac.uk Sun Aug 31 17:44:48 2008 From: ross at soi.city.ac.uk (Ross Paterson) Date: Sun Aug 31 17:43:09 2008 Subject: The base library and GHC 6.10 In-Reply-To: References: <20080831121633.GA17919@matrix.chaos.earth.li> Message-ID: <20080831214447.GA20749@soi.city.ac.uk> On Sun, Aug 31, 2008 at 07:11:11PM +0100, Claus Reinke wrote: > > "Ian Lynagh" wrote in message news:20080831121633.GA17919@matrix.chaos.earth.li... > > On Sun, Aug 31, 2008 at 01:10:56PM +0100, Ross Paterson wrote: > >> On Sat, Aug 30, 2008 at 01:01:34PM +0100, Ian Lynagh wrote: > >> > OK; so I guess that means the whole Data.Generics.Basics module should > >> > stay in base. > >> > >> Should Data.Generics.Instances (an orphanage) be folded into > >> Data.Generics.Basics? > > > > Sounds good to me. > > I just stumbled into this thread by accident. No, the instances > should not be folded into Basics. For one, Basics seems likely > to remain in base, and half of the instances are under dispute > (see previous SYB threads that were copied here). Splitting > the instances into standard and dubious prior to deprecating > the latter was one of the motivations for moving SYB out of > base (apart from Basics and deriving) and for seeking a > maintainer for syb. > > Since José Pedro Magalhães has offered to take on ownership > of the syb package, it would be appropriate to cc him on any > discussions related to this, so that he is aware of all developments > and possible conflicts. Hmm. Of course it's not possible to deprecate instances, and there's only GHC bug #2356 to protect against instance clashes. It does seem a bit contradictory to argue that a class is so basic that it belongs in the core, but its instances for core types are unclear. Well at least the instances for [], tuples, Maybe, Either, Array and type constants could go with the Data class, I presume.
Presumably Complex could be given definitions of gfoldl and gmapT, and Ratio a definition of gfoldl, though perhaps not gmapT. From claus.reinke at talk21.com Sun Aug 31 18:02:59 2008 From: claus.reinke at talk21.com (Claus Reinke) Date: Sun Aug 31 18:01:31 2008 Subject: The base library and GHC 6.10 References: <20080828111232.GA15998@matrix.chaos.earth.li><404396ef0808280428j5f8ce984q904f5f0e2b0e61ee@mail.gmail.com><20080830120134.GA3741@matrix.chaos.earth.li><20080831121055.GA5326@soi.city.ac.uk><20080831121633.GA17919@matrix.chaos.earth.li> Message-ID: Leaving Data.Generics.Basics in base while moving Data.Generics.Instances to syb raises the interesting issue of dealing with the accidental re-exports of Data.Generics.Instances from various places. Here is that list again (*): $ find . -name '*hs' | grep -v _darcs | xargs grep -l 'Data.Generics' | grep -v Generics ./array/Data/Array.hs ./base/Data/Typeable.hs ./bytestring/Data/ByteString/Internal.hs ./bytestring/Data/ByteString/Lazy/Internal.hs ./bytestring/Data/ByteString/Unsafe.hs ./containers/Data/IntMap.hs ./containers/Data/IntSet.hs ./containers/Data/Map.hs ./containers/Data/Sequence.hs ./containers/Data/Set.hs ./containers/Data/Tree.hs ./haskell-src/Language/Haskell/Syntax.hs ./network/Network/URI.hs ./packedstring/Data/PackedString.hs ./template-haskell/Language/Haskell/TH/Quote.hs ./template-haskell/Language/Haskell/TH/Syntax.hs And here is a brief scan of what each of these is doing. References to 'standard' vs 'dubious' Data instances are wrt the suggested split in [1], with some possible refinements: - array: the Data instance for Array could be moved into array, avoiding the need for instance imports and syb dependency? 
- bytestring: uses deriving, which for Internal.hs depends on Data instances for Int [standard] and (ForeignPtr Word8) [dubious]; would need to depend on syb; and import both standard and dubious instances :-( perhaps Data instances for type constructors with phantom types should be re-classified into Standard, given that there are no data objects to be traversed?
- containers: IntMap.hs, IntSet.hs, Map.hs, Sequence.hs, Set.hs, Tree.hs define their own Data instances, or derive them in such a way that they do not need to import any instances :-)
- haskell-src: uses deriving, will need to depend on syb; depends almost exclusively on standard instances (the only exception I can see in a quick scan is Rational); perhaps this is an argument in favour of moving the Data instance for 'Ratio a' from Dubious to Standard: the parameter type is never meant to be traversed, and tainting every client of 'Ratio a' with the really bad instances is not a good idea. Opinions?
- network: uses deriving, will need to depend on syb; depends only on standard instances
- packedstring: defines its own instances, no need to import any
- template-haskell: uses deriving, roughly the same situation as for haskell-src?
Claus
[1] see the last page of http://www.haskell.org/pipermail/libraries/2008-July/010313.html
[2] http://www.cs.kent.ac.uk/~cr3/toolbox/haskell/#syb-utils
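For readers unfamiliar with the "define their own Data instances" case mentioned for containers, here is a minimal sketch of what a hand-written instance for an abstract type can look like. The type Bag and the helpers fromListBag, toListBag, bagDataType, fromListConstr and bumpInts are hypothetical names made up for illustration; the sketch assumes a current GHC, where Typeable instances are provided automatically and the Data class lives in Data.Data. It presents the value to generic traversals as an application of fromList to its elements, so the only instance it relies on is the one for lists.

```haskell
module Main where

import Data.Data
import Data.Maybe (fromMaybe)

-- Hypothetical abstract container; the representation is meant to stay hidden.
newtype Bag a = Bag [a]
  deriving (Show)

fromListBag :: [a] -> Bag a
fromListBag = Bag

toListBag :: Bag a -> [a]
toListBag (Bag xs) = xs

-- Hand-written instance: generic code sees a Bag as "fromList <elements>",
-- never the real constructor.
instance Data a => Data (Bag a) where
  gfoldl f z b  = z fromListBag `f` toListBag b
  toConstr _    = fromListConstr
  gunfold k z c = case constrIndex c of
    1 -> k (z fromListBag)
    _ -> error "Bag.gunfold: unexpected constructor"
  dataTypeOf _  = bagDataType

fromListConstr :: Constr
fromListConstr = mkConstr bagDataType "fromList" [] Prefix

bagDataType :: DataType
bagDataType = mkDataType "Bag" [fromListConstr]

-- A one-layer transformation: add 1 to every element if the immediate
-- child is a list of Int, and leave anything else unchanged.
bumpInts :: Data b => b -> b
bumpInts x = case cast x of
  Just xs -> fromMaybe x (cast (map (+ 1) (xs :: [Int])))
  Nothing -> x

main :: IO ()
main = print (gmapT bumpInts (fromListBag [1, 2, 3 :: Int]))  -- prints Bag [2,3,4]
```

A package that writes its instances this way needs only the Data class and the instances for the types it actually traverses, which is the situation described for containers and packedstring in the list above.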