From martine at danga.com Thu Apr 3 00:07:32 2008
From: martine at danga.com (Evan Martin)
Date: Thu Apr 3 00:03:28 2008
Subject: building a helper binary with cabal
In-Reply-To: <1206023396.7594.145.camel@localhost>
References: <3a6f89fc0803010002g29b98f08qbecdd95eaea13d14@mail.gmail.com>
	<1204370477.11558.101.camel@localhost>
	<3a6f89fc0803010917u2b0fc955n49bc052ea99f0643@mail.gmail.com>
	<3a6f89fc0803010940s6d51c40ftdfdcbb62b2dd21bb@mail.gmail.com>
	<1204481514.11558.111.camel@localhost>
	<3a6f89fc0803021434g13966616tbd73e1683cc90a06@mail.gmail.com>
	<1206023396.7594.145.camel@localhost>
Message-ID: <3a6f89fc0804022107q3d1be147kce66088841b35ff6@mail.gmail.com>

I got caught up in other things, so I'm also late to reply. Here's
the whole story now, so you don't have to reskim the archive.

My program has a helper executable that's built with gcc. I want to
install it alongside my Haskell binary.

I can write a post-copy hook like this:

> creplChildCopy :: Args -> CopyFlags -> PackageDescription
>                -> LocalBuildInfo -> IO ()
> creplChildCopy args flags desc buildinfo = do
>   print "copy hook"
>   let dirs = absoluteInstallDirs desc buildinfo (copyDest flags)
>   print ("copying child to ", libexecdir dirs </> creplChildName)
>   copyFileVerbose (copyVerbose flags) (creplChildPath buildinfo)
>                   (libexecdir dirs </> creplChildName)

And then run it like this:

$ ./Setup.lhs copy -v3
directory dist/doc/html/c-repl does exist: False
Creating /home/martine/.local/share/doc/c-repl-0.1 (and its parents)
copy LICENSE to /home/martine/.local/share/doc/c-repl-0.1/LICENSE
Installing: /home/martine/.local/bin
Creating /home/martine/.local/bin (and its parents)
copy dist/build/c-repl/c-repl to /home/martine/.local/bin/c-repl
"copy hook"
("copying child to ","/home/martine/.local/libexec/c-repl-child")
copy dist/build/c-repl-child to /home/martine/.local/libexec/c-repl-child
Setup.lhs: /home/martine/.local/libexec: copyFile: does not exist (No such file or directory)

That fails because this "libexec" dir
doesn't exist.

My questions are:
1) Is it my responsibility to create the libexec dir if it doesn't exist?
2) How can I get the "copy" phase to run as part of "install"?
3) Do you have advice on installed vs. uninstalled paths? Currently,
to facilitate development, when getting the path to that executable I
do something like this (where "Paths" is the module generated by
cabal):

findChildBinary :: IO (Either String FilePath)
findChildBinary = do
  let path = "dist/build/c-repl-child"
  ok1 <- isReadable path
  if ok1
    then return (Right path)
    else do
      libexecdir <- Paths_c_repl.getLibexecDir
      let path = libexecdir ++ "/c-repl-child"
      ok2 <- isReadable path
      if ok2
        then return (Right path)
        else return (throwError "can't find child executable")
  where isReadable path = do
          perms <- getPermissions path
          return $ readable perms
         `catch` \e -> return False

On Thu, Mar 20, 2008 at 7:29 AM, Duncan Coutts wrote:
> Sorry, this dropped off my to-reply-to list, did you get this figured
> out?
>
> Duncan
>
> On Sun, 2008-03-02 at 14:34 -0800, Evan Martin wrote:
> > On Sun, Mar 2, 2008 at 10:11 AM, Duncan Coutts
> > wrote:
> > > The install phase is really two phases, copy and register. The copy
> > > phase has the CopyDest param. The default install hook just runs the
> > > copy and register phases. So you probably want to override the copy hook
> > > and not the install one.
> >
> > It seems the default copy hook just runs the install hook, and that
> > the install hook doesn't run the copy one... ?
> >
> > http://haskell.org/ghc/docs/latest/html/libraries/Cabal/src/Distribution-Simple.html#simpleUserHooks
> > copyHook = \desc lbi _ f -> install desc lbi f, -- has correct
> > 'copy' behavior with params
> >
> > I'm sure I'm just missing something here, but my "postCopy" hook
> > doesn't seem to be running with "install -v3".
> >
> > > The hooks stuff is all really very confusing and unsatisfactory.
> >
> > I agree, but I can also appreciate how difficult it must be to design,
> > and can acknowledge that it may be the case that it really just needs
> > to be this complicated. Having used autoconf and friends before, one
> > thing I really prefer about this system is that there are a bazillion
> > different types, which helps prevent you from accidentally doing
> > something like putting an intermediate object in the source dir or
> > installing while ignoring the user's prefix.
> >

From qdunkan at gmail.com Thu Apr 3 14:01:08 2008
From: qdunkan at gmail.com (Evan Laforge)
Date: Thu Apr 3 13:57:04 2008
Subject: Data.Map.toDescList not exported?
Message-ID: <2518b95d0804031101r180bb726t3cd9e464581b26ab@mail.gmail.com>

It's in the source, marked as being O(n), but it's not in the export
list. Oversight? I'm using ghc-6.8.2.

I'm using (reverse.toAscList) but I think running 'head' on that is
going to be less efficient than on the real toDescList.

Thanks!

From gwern0 at gmail.com Mon Apr 7 11:50:54 2008
From: gwern0 at gmail.com (gwern0@gmail.com)
Date: Mon Apr 7 11:49:49 2008
Subject: darcs patch: fmt cabal (and 9 more)
Message-ID: <47fa4398.0cba5e0a.3adc.6f0b@mx.google.com>

Mon Apr 7 00:50:28 EDT 2008 gwern0@gmail.com
  * fmt cabal

Mon Apr 7 00:50:49 EDT 2008 gwern0@gmail.com
  * .cabal: +category

Mon Apr 7 00:53:00 EDT 2008 gwern0@gmail.com
  * .cabal: change homepage and links
  The original listed website, , seems to be quite dead. I've replaced
  it with a link to the haskell wiki and to the old chalmers site.

Mon Apr 7 00:57:49 EDT 2008 gwern0@gmail.com
  * improve README

Mon Apr 7 00:59:04 EDT 2008 gwern0@gmail.com
  * .cabal: add a real description

Mon Apr 7 01:00:18 EDT 2008 gwern0@gmail.com
  * .cabal: BSD4 -> BSD3 license change
  The 4-clause BSD includes the advertising clause, and is 4 clauses
  long. The provided LICENSE file doesn't seem to have an advertising
  clause for anyone, and seems to have only 3 clauses.
  Which is good, as people don't like the advertising BSD license - why
  it's deprecated.

Mon Apr 7 01:01:46 EDT 2008 gwern0@gmail.com
  * .cabal: +Bringert to copyright
  Per the license file, which mentions two copyright holders.

Mon Apr 7 01:05:34 EDT 2008 gwern0@gmail.com
  * -Wall Monadic.hs

Mon Apr 7 11:46:44 EDT 2008 gwern0@gmail.com
  * some misc partial -Wall

Mon Apr 7 11:50:03 EDT 2008 gwern0@gmail.com
  * .cabal: don't forget the examples
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/x-darcs-patch
Size: 23565 bytes
Desc: A darcs patch for your repository!
Url : http://www.haskell.org/pipermail/libraries/attachments/20080407/830a06cf/attachment-0001.bin

From dons at galois.com Mon Apr 7 16:24:54 2008
From: dons at galois.com (Don Stewart)
Date: Mon Apr 7 16:20:42 2008
Subject: [Haskell] making inits less strict
In-Reply-To:
References:
Message-ID: <20080407202454.GF28281@scytale.galois.com>

john.tromp:
> The standard definition of inits:
>
> inits [] = [[]]
> inits (x:xs) = [[]] ++ map (x:) (inits xs)
>
> is unnecessarily strict, evaluating its argument
> before yielding the initial [] of the result.
> An improved version is:
>
> inits l = [] : case l of [] -> []
>                          (x:xs) -> map (x:) (inits xs)
>
> This allows one to define for instance
> nats = map length (inits nats)
> which loops for the standard definition.
>

Can you forward this to the libraries@haskell.org list, and file a
proposal to replace the current definition?

    http://haskell.org/haskellwiki/Library_submissions

We noticed this while implementing lazy bytestrings, which had a
similar issue, and fixing it is cheap enough.
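
A minimal sketch of the lazier definition quoted above, written under
the assumed names lazyInits and nats (chosen here only to avoid
clashing with Data.List.inits; they are not library exports). It
yields the leading [] before it inspects its argument, which is what
lets the self-referential nats produce values:

```haskell
-- Sketch only: 'lazyInits' and 'nats' are illustrative names.

-- Emit the initial [] first, and only then pattern match on the input.
lazyInits :: [a] -> [[a]]
lazyInits l = [] : case l of
  []     -> []
  (x:xs) -> map (x:) (lazyInits xs)

-- Each element is the length of the prefix of 'nats' built so far.
nats :: [Int]
nats = map length (lazyInits nats)
```

Loading this in GHCi and evaluating `take 5 nats` is one way to check
that the definition is productive.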
-- Don

From johan.tibell at gmail.com Sun Apr 13 07:59:45 2008
From: johan.tibell at gmail.com (Johan Tibell)
Date: Sun Apr 13 07:55:07 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
Message-ID: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>

Good day hackers,

The Python community have been successful in standardizing an
interface between web server and applications or frameworks resulting
in users having more control over their web stack by being able to
pick frameworks independently from web servers, and vice versa. I
propose we try to do the same for Haskell. I've written half a draft
for a Haskell version of Python's PEP 333 [1]. If you're interested in
taking part in this effort please read through the Python spec first
(as it is way more complete and you can understand this proposal
better by reading it, I've skipped some important issues in my first
draft) and then go read the Haskell spec [2]. I'm particularly
interested in feedback regarding:

* Doing it this way won't work as it violates HTTP/CGI spec part X, Y
and Z (the Python spec takes lots of things from the CGI spec
including naming and semantics).
* My server/framework could never provide/be run under this interface.
* This interface has bad performance by design.
* Using a different set of data types would work better.

The spec needs to be extended to cover all the corners of HTTP. Some
parts need to be motivated better. It is easier for me to motivate
things if people would tell me what parts are badly motivated.

Note: I'm open to a complete rewrite if needed. I'm not wedded to the
current design and/or wording. In fact parts of the wording are
borrowed from the Python spec. The parts with bad grammar are all
mine.

1. http://www.python.org/dev/peps/pep-0333/
2.
http://www.haskell.org/haskellwiki/WebApplicationInterface

-- Johan

From agl at imperialviolet.org Sun Apr 13 19:06:43 2008
From: agl at imperialviolet.org (Adam Langley)
Date: Sun Apr 13 19:02:04 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
Message-ID: <396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>

On Sun, Apr 13, 2008 at 4:59 AM, Johan Tibell wrote:
> * Using a different set of data types would work better.

Given that this is Haskell, I'd suggest more types ;)

HTTP headers aren't just strings and, at the risk of tooting my own
horn, I'll point to the Headers structure in [1]. Likewise, URLs have
lots of structure that should just be handled in one place [2]

[1] http://darcs.imperialviolet.org/darcsweb.cgi?r=network-minihttp;a=headblob;f=/Network/MiniHTTP/Marshal.hs
[2] http://darcs.imperialviolet.org/darcsweb.cgi?r=network-minihttp;a=headblob;f=/Network/MiniHTTP/URL.hs

AGL

--
Adam Langley agl@imperialviolet.org http://www.imperialviolet.org

From dm.maillists at gmail.com Sun Apr 13 21:08:56 2008
From: dm.maillists at gmail.com (Daniel McAllansmith)
Date: Sun Apr 13 21:04:32 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To: <396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
	<396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
Message-ID: <200804141308.56697.dm.maillists@gmail.com>

On Mon, 14 Apr 2008 11:06:43 Adam Langley wrote:
> On Sun, Apr 13, 2008 at 4:59 AM, Johan Tibell wrote:
> > * Using a different set of data types would work better.
>
> Given that this is Haskell, I'd suggest more types ;)
>
> HTTP headers aren't just strings and, at the risk of tooting my own
> horn, I'll point to the Headers structure in [1].

And it could go further. The use of a given header is often valid
only in certain requests or responses. Perhaps sprinkling some phantom
types or type classes around could represent that.

Daniel

From cdsmith at twu.net Sun Apr 13 21:32:07 2008
From: cdsmith at twu.net (Chris Smith)
Date: Sun Apr 13 21:27:45 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
	<396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
Message-ID:

On Sun, 13 Apr 2008 16:06:43 -0700, Adam Langley wrote:
> On Sun, Apr 13, 2008 at 4:59 AM, Johan Tibell
> wrote:
>> * Using a different set of data types would work better.
>
> Given that this is Haskell, I'd suggest more types ;)
>
> HTTP headers aren't just strings and, at the risk of tooting my own
> horn, I'll point to the Headers structure in [1].

Wait, I'm not sure I agree here. How are headers not just strings? By
assuming that, are we guaranteeing that anything using this interface
cannot respond gracefully to a client that writes malformed headers?

Another perspective: there is unnecessary variation there in how
interfaces are represented. If I'm looking for a header, and I know its
name as a string, how do I look for it? Well, apparently it's either a
named field (if it's known to the interface) or in the "other" section
(if not). So I write a gigantic case analysis? But then suppose the
interface is updated later to include some headers that popped up
unofficially but then are standardized in a future RFC. (This is not too
odd; lots of REST web services invent new headers every day, many of
which do things that make sense outside of the particular application.)
Does old code that handled these headers stop working, just because it
was looking in the "other" section, but now needs to check a field
dedicated to that header?

> Likewise, URLs have
> lots of structure that should just be handled in one place [2]

This I do agree with.

--
Chris Smith

From s.clover at gmail.com Sun Apr 13 22:43:33 2008
From: s.clover at gmail.com (Sterling Clover)
Date: Sun Apr 13 22:38:55 2008
Subject: [web-devel] RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
Message-ID: <368E0883-A910-4FFD-937F-A6A1D7F88E42@gmail.com>

In a sense, the CGIT interface provided by Network.CGI already is a
sort of halfway implementation of what we're discussing, no? I'd be
interested in approaching this from the other way -- specifying
exactly what CGIT doesn't provide and therefore what folks want to
see.

As far as I can tell, the main issue with CGIT is that it doesn't
handle streaming/resource issues very well. The main innovation I see
provided here is the enumerator interface, which is a very nice and
flexible approach to I/O and provides a way to handle comet cleanly to
boot.

Since the application type as proposed is
Env -> IO (Code, Headers, ResponseEnumerator), what we're really
getting is almost an equiv. (modulo enumerators) of unwrapping
CGIT IO CGIResponse with a run function. So what we lose is the
ability for all our nicely named record accessors and functions to be
shared across frameworks -- i.e. the flexibility a monad transformer
*does* provide. So my question is if we can somehow preserve that with
an appropriate typeclass.
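
One possible reading of the "appropriate typeclass" question, as a
rough sketch only: every name below (Env, MonadEnv, askEnv, getVar,
PureApp, IOApp, greeting) is hypothetical and taken neither from
Network.CGI nor from the draft spec. The point is that an accessor
such as getVar is written once against the class and can then run
over a pure carrier or an IO-based one:

```haskell
-- Hypothetical sketch: none of these names come from Network.CGI or the draft.
type Env = [(String, String)]

-- Read access to the request environment, independent of the stack.
class Monad m => MonadEnv m where
  askEnv :: m Env

-- A shared accessor, written once against the class.
getVar :: MonadEnv m => String -> m (Maybe String)
getVar name = fmap (lookup name) askEnv

-- A pure carrier (a hand-rolled reader), e.g. for tests.
newtype PureApp a = PureApp { runPureApp :: Env -> a }

instance Functor PureApp where
  fmap f (PureApp g) = PureApp (f . g)
instance Applicative PureApp where
  pure = PureApp . const
  PureApp f <*> PureApp g = PureApp (\e -> f e (g e))
instance Monad PureApp where
  PureApp g >>= k = PureApp (\e -> runPureApp (k (g e)) e)
instance MonadEnv PureApp where
  askEnv = PureApp id

-- An IO-based carrier, closer to what a server would run.
newtype IOApp a = IOApp { runIOApp :: Env -> IO a }

instance Functor IOApp where
  fmap f (IOApp g) = IOApp (fmap f . g)
instance Applicative IOApp where
  pure = IOApp . const . return
  IOApp f <*> IOApp g = IOApp (\e -> f e <*> g e)
instance Monad IOApp where
  IOApp g >>= k = IOApp (\e -> g e >>= \a -> runIOApp (k a) e)
instance MonadEnv IOApp where
  askEnv = IOApp return

-- Application code that does not care which carrier it runs in.
greeting :: MonadEnv m => m String
greeting = do
  method <- getVar "REQUEST_METHOD"
  return (maybe "no method" ("method: " ++) method)
```

The same greeting can be run with runPureApp or runIOApp; whether a
class this small is enough for real frameworks is left open here.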
I'd ideally like to see this engineered in two parts -- a "cgit-like"
typeclass interface that allows access to the environment but is
agnostic as to response type, so that comet-style and other apps that
take special advantage of enumerators can be built on top of it as
well as apps that simply perform lazy writes; and the lower-level
enumerator interface. This ideally would let the higher-level
interface be built over any stack at all (i.e. STM-based as well, or
even a pure stack), while the lower level interface that calls it is
some glue of the given constant type in the IO monad. This would be of
great help to hvac.

There's also the fact that this could be designed ground-up with
greater bytestring use, but that doesn't seem immense to me.

Outside of this, I'm not quite sure what else CGIT lacks. I'm with
Chris Smith's arguments as to the headers question, and it seems to me
that dicts are best done using MVar-style primitives. I'm a bit at sea
as to why the queryString is here just represented as a bytestring --
is it seriously an issue that some apps may want to use it other than
in the standard parsed way? Is the idea here that lib functions would
fill in and be shared among frameworks? On the other hand, separating
GET and POST vars is a good idea, and it's a shame that CGIT doesn't
allow this. The openness here seems in part based on the desire to
keep different forms of file upload handling available. However, the
work that oleg did with regards to CGI also seems promising -- i.e.,
rather than using an enumerator, simply taking advantage of laziness
to unpack the input stream into a lazy dictionary.

Regards,
S.

On Apr 13, 2008, at 7:59 AM, Johan Tibell wrote:

> Good day hackers,
>
> The Python community have been successful in standardizing an
> interface between web server and applications or frameworks resulting
> in users having more control over their web stack by being able to
> pick frameworks independently from web servers, and vice versa.
> I propose we try to do the same for Haskell. I've written half a draft
> for a Haskell version of Python's PEP 333 [1]. If you're interested in
> taking part in this effort please read through the Python spec first
> (as it is way more complete and you can understand this proposal
> better by reading it, I've skipped some important issues in my first
> draft) and then go read the Haskell spec [2]. I'm particularly
> interested in feedback regarding:
>
> * Doing it this way won't work as it violates HTTP/CGI spec part X, Y
> and Z (the Python spec takes lots of things from the CGI spec
> including naming and semantics).
> * My server/framework could never provide/be run under this interface.
> * This interface has bad performance by design.
> * Using a different set of data types would work better.
>
> The spec needs to be extended to cover all the corners of HTTP. Some
> parts need to be motivated better. It is easier for me to motivate
> things if people would tell me what parts are badly motivated.
>
> Note: I'm open to a complete rewrite if needed. I'm not wedded to the
> current design and/or wording. In fact parts of the wording are
> borrowed from the Python spec. The parts with bad grammar are all
> mine.
>
> 1. http://www.python.org/dev/peps/pep-0333/
> 2.
http://www.haskell.org/haskellwiki/WebApplicationInterface
>
> -- Johan
> _______________________________________________
> web-devel mailing list
> web-devel@haskell.org
> http://www.haskell.org/mailman/listinfo/web-devel

From john at repetae.net Sun Apr 13 22:55:12 2008
From: john at repetae.net (John Meacham)
Date: Sun Apr 13 22:50:34 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
Message-ID: <20080414025512.GJ25982@sliver.repetae.net>

Haskell works fine with both the FastCGI and SCGI protocols (There are
libraries floating around for both), I have found them much nicer than
any mod_* web server plugin in general.

John

--
John Meacham - ⑆repetae.net⑆john⑈

From agl at imperialviolet.org Sun Apr 13 23:27:35 2008
From: agl at imperialviolet.org (Adam Langley)
Date: Sun Apr 13 23:22:57 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To:
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
	<396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
Message-ID: <396556a20804132027p6d00e5bdw89518241a39a6ee3@mail.gmail.com>

On Sun, Apr 13, 2008 at 6:32 PM, Chris Smith wrote:
> Does old code that handled these headers stop working, just because it
> was looking in the "other" section, but now needs to check a field
> dedicated to that header?

Yes, but it would be very sad if we couldn't do common header parsing
because of this.

I'd suggest that all the headers given in RFC 2616 be parsed and
nothing else. That leaves the question of how we would handle the
addition of any extra ones in the future. Firstly, packages could
depend on a given version of this interface and we declare that the
set of handled headers doesn't change within a major version.

Better would be some static assertion that the interface doesn't
handle some set of headers. Maybe there's a type trick to do this, but
I can't think of one, so we might have to settle for a non static:

checkUnparsedHeaders :: [String] -> IO ()

Which can be put in 'main' (or equivalent) and can call error if
there's a mismatch.

AGL

--
Adam Langley agl@imperialviolet.org http://www.imperialviolet.org

From mj at mjclement.com Sun Apr 13 23:31:54 2008
From: mj at mjclement.com (Michaeljohn Clement)
Date: Sun Apr 13 23:27:17 2008
Subject: [Haskell-cafe] RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
Message-ID: <4802D02A.9070203@mjclement.com>

I am very interested in this work.

One thing missing is support for all HTTP methods, not just those in
RFC 2616. As-is, something like WebDAV cannot be implemented.

That probably means requestMethod should be a (Byte)String.

What about something like sendfile(2) available on some platforms?

To allow the server to make use of such optimizations, how about an
optional alternative to the enumerator?

Perhaps the app can optionally pass a path or an open fd back to the
server in place of an enumerator, which allows servers to do any
kind of buffering optimizations, etc, that they may know about, as
well as using any platform-specific optimizations like sendfile.
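
A minimal sketch of the "path in place of an enumerator" idea. The
Enumerator synonym follows the type given in the draft; ResponseBody,
BodyEnumerator, BodyFile and sendBody are hypothetical names invented
for this illustration. The BodyFile branch shows only a portable
fallback (read the file and write it out); a real server would be
free to call sendfile there instead:

```haskell
{-# LANGUAGE RankNTypes #-}
import qualified Data.ByteString.Char8 as B

-- Enumerator type as given in the draft spec.
type Enumerator = forall a. (a -> B.ByteString -> IO (Either a a)) -> a -> IO a

-- Hypothetical: the application returns either an enumerator or a
-- file path that the server may transmit however it sees fit.
data ResponseBody
  = BodyEnumerator Enumerator
  | BodyFile FilePath

-- Server-side fallback that works without sendfile. The first
-- argument stands in for "write this chunk to the client socket".
sendBody :: (B.ByteString -> IO ()) -> ResponseBody -> IO ()
sendBody write (BodyEnumerator enum) =
  -- Push every chunk to the client and always ask for more.
  enum (\() chunk -> write chunk >> return (Right ())) ()
sendBody write (BodyFile path) =
  -- A server with sendfile(2) could hand 'path' to the kernel here.
  B.readFile path >>= write
```

With this shape an application that only serves static files never
constructs an enumerator at all, which is the optimization
opportunity the message above asks for.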

--
Michaeljohn Clement

From mj at mjclement.com Sun Apr 13 23:50:48 2008
From: mj at mjclement.com (Michaeljohn Clement)
Date: Sun Apr 13 23:46:14 2008
Subject: [Haskell-cafe] Re: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To: <396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
	<396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
Message-ID: <4802D498.7000303@mjclement.com>

Adam Langley wrote:
> On Sun, Apr 13, 2008 at 4:59 AM, Johan Tibell wrote:
>> * Using a different set of data types would work better.
>
> Given that this is Haskell, I'd suggest more types ;)
>
> HTTP headers aren't just strings and, at the risk of tooting my own
> horn, I'll point to the Headers structure in [1].

That is one of the things I don't like about Network.HTTP, which also
enumerates header fields. It is inconvenient to have to look up the
names in the data type, when the standard field names are already
known, and it makes using non-RFC2616 headers less convenient.

Automatic parsing of header fields also makes unusual usage
inconvenient, (for example the Range header support in [1] is a
profile of RFC2616.)

I think those kinds of features belong in frameworks; they will be
more of an annoyance than a help to anyone that is writing to the
WSGI layer.

> Likewise, URLs have
> lots of structure that should just be handled in one place [2]

Yes, I think they should be parsed to the level of granularity
specified by RFC 2616 (i.e. scheme, host, port, path, query string)
and anything more (like parsing query strings) should be handled by
frameworks.
>
> [1] http://darcs.imperialviolet.org/darcsweb.cgi?r=network-minihttp;a=headblob;f=/Network/MiniHTTP/Marshal.hs
> [2] http://darcs.imperialviolet.org/darcsweb.cgi?r=network-minihttp;a=headblob;f=/Network/MiniHTTP/URL.hs
>
>
> AGL
>

--
Michaeljohn Clement

From dm.maillists at gmail.com Mon Apr 14 01:17:24 2008
From: dm.maillists at gmail.com (Daniel McAllansmith)
Date: Mon Apr 14 01:13:09 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To:
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
	<396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
Message-ID: <200804141717.24442.dm.maillists@gmail.com>

On Mon, 14 Apr 2008 13:32:07 Chris Smith wrote:
> On Sun, 13 Apr 2008 16:06:43 -0700, Adam Langley wrote:
> > On Sun, Apr 13, 2008 at 4:59 AM, Johan Tibell
> >
> > wrote:
> >> * Using a different set of data types would work better.
> >
> > Given that this is Haskell, I'd suggest more types ;)
> >
> > HTTP headers aren't just strings and, at the risk of tooting my own
> > horn, I'll point to the Headers structure in [1].
>
> Wait, I'm not sure I agree here. How are headers not just strings?

Headers, at least their values, aren't strings. The specification says
so. I think headers should be represented by something more specific
than a string.

> By
> assuming that, are we guaranteeing that anything using this interface
> cannot respond gracefully to a client that writes malformed headers?

Having explicit types for headers doesn't preclude trying to handle
messages with malformed headers.

Soldiering on in the face of malformed messages as a general strategy
is pretty dubious in my opinion. In the specific cases where you've
determined it is necessary you want to be able to register a
work-around parser for that section of the message, and be able to
tell that it has been used.

A decent framework can supply a catalogue of commonly required
work-arounds.
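
A small sketch of what "register a work-around parser ... and be able
to tell that it has been used" might look like. All names here
(Provenance, HeaderParser, parseWith, contentLength, leadingDigits)
are hypothetical, and the "leading-digits" work-around is an invented
example rather than a known client quirk:

```haskell
import Data.Char (isDigit)

-- Records which parser produced a value, so callers can tell when a
-- work-around (identified by name) was needed.
data Provenance = Strict | WorkAround String
  deriving (Eq, Show)

type HeaderParser a = String -> Maybe a

-- Try the strict parser first, then each registered work-around in order.
parseWith :: HeaderParser a -> [(String, HeaderParser a)] -> String
          -> Maybe (a, Provenance)
parseWith strict workArounds raw =
  case strict raw of
    Just v  -> Just (v, Strict)
    Nothing ->
      case [ (v, WorkAround name) | (name, p) <- workArounds, Just v <- [p raw] ] of
        (hit:_) -> Just hit
        []      -> Nothing

-- Strict Content-Length: one or more digits and nothing else.
contentLength :: HeaderParser Integer
contentLength s
  | not (null s) && all isDigit s = Just (read s)
  | otherwise                     = Nothing

-- Invented work-around: accept leading digits, ignore trailing junk.
leadingDigits :: HeaderParser Integer
leadingDigits s = case span isDigit s of
  ("", _) -> Nothing
  (ds, _) -> Just (read ds)
```

A call such as
parseWith contentLength [("leading-digits", leadingDigits)] value
returns the parsed number together with a tag saying whether the
strict parser or a named work-around accepted it.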
>
> Another perspective: there is unnecessary variation there in how
> interfaces are represented. If I'm looking for a header, and I know its
> name as a string, how do I look for it? Well, apparently it's either a
> named field (if it's known to the interface) or in the "other" section
> (if not). So I write a gigantic case analysis? But then suppose the
> interface is updated later to include some headers that popped up
> unofficially but then are standardized in a future RFC. (This is not too
> odd; lots of REST web services invent new headers every day, many of
> which do things that make sense outside of the particular application.)
> Does old code that handled these headers stop working, just because it
> was looking in the "other" section, but now needs to check a field
> dedicated to that header?

I don't like the idea of having a fixed enumeration of methods or
headers. You need to be able to define new methods and headers at
will, and ideally have the usage of headers constrained to valid
contexts.

This suggests to me type classes that establish a 'can occur in'
relationship between request/response, method and a given
general/request/response/entity header.

By importing new method or header data type, appropriate type class
instances and registering an appropriate message parser extension you
can mix and match which headers and methods you support. GET and HEAD
are the only ones that MUST be supported after all.

Daniel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20080414/a485fb4d/attachment.htm

From daniel.yokomizo at gmail.com Mon Apr 14 07:54:07 2008
From: daniel.yokomizo at gmail.com (Daniel Yokomizo)
Date: Mon Apr 14 07:49:31 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To: <396556a20804132027p6d00e5bdw89518241a39a6ee3@mail.gmail.com>
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
	<396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
	<396556a20804132027p6d00e5bdw89518241a39a6ee3@mail.gmail.com>
Message-ID:

On Mon, Apr 14, 2008 at 3:27 AM, Adam Langley wrote:
> On Sun, Apr 13, 2008 at 6:32 PM, Chris Smith wrote:
> > Does old code that handled these headers stop working, just because it
> > was looking in the "other" section, but now needs to check a field
> > dedicated to that header?
>
> Yes, but it would be very sad if we couldn't do common header parsing
> because of this.
>
> I'd suggest that all the headers given in RFC 2616 be parsed and
> nothing else.

Both request and response accept any entity headers and 7.1 (of RFC
2616) says that a valid entity header is an extension header, which
can be any kind of header.

> That leaves the question of how we would handle the
> addition of any extra ones in the future. Firstly, packages could
> depend on a given version of this interface and we declare that the
> set of handled headers doesn't change within a major version.
>
> Better would be some static assertion that the interface doesn't
> handle some set of headers. Maybe there's a type trick to do this, but
> I can't think of one, so we might have to settle for a non static:
>
> checkUnparsedHeaders :: [String] -> IO ()
>
> Which can be put in 'main' (or equivalent) and can call error if
> there's a mismatch.

Most of the time a Header makes sense in some scenarios and doesn't
in others, so package-level checking is too coarse grained.

IMHO it would be better to create a two layered approach. The bottom
layer handles the request as a bunch of strings, just checks for
structural correctness (i.e. break the headers by line and such)
without checking if the headers are correct. The top layer provides a
bunch of parser combinators to validate, parse and sanitize the
request so a library can create its own contract:

newtype Contract e a = Contract (HttpRequest -> e a)

contract :: Contract Maybe MyRequest
contract = do
    pragma <- parseHeader "Pragma" (\header -> ...)
    ...
    return $ MyRequest pragma ...

main = do
    request <- readHttpRequest
    sanitized <- enforce contract request
    ...

Such an approach would be more flexible and extensible. Later other
packages could provide specialized combinators for other RFCs. HTTP is
regularly extended, in RFCs and by private parties experimenting
before writing an RFC; it would be bad if the primary Haskell library
for HTTP didn't support this behavior. Also it's important to notice
that the HTTP spec defines things to be mostly orthogonal, so most of
the headers stand on their own and can be used in combination with
many methods and other headers; every once in a while someone finds a
combination that makes sense and wasn't thought of before.

> AGL
>
> --
> Adam Langley agl@imperialviolet.org http://www.imperialviolet.org

Best regards,
Daniel Yokomizo.
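
The two-layer contract idea in the message above leaves its details
elided, so what follows is only one way it could be filled in, under
stated assumptions: HttpRequest is reduced to a list of raw header
pairs, runContract stands in for the unspecified enforce, the Monad
instance is the usual reader-style one, and the MyRequest fields and
both header parsers are invented for illustration:

```haskell
import Data.Char (toLower)

-- Bottom layer (assumed shape): headers are still raw strings.
newtype HttpRequest = HttpRequest { rawHeaders :: [(String, String)] }

-- Top layer: a contract maps a raw request to a validated value in
-- some effect 'e' (with Maybe, Nothing means the contract is violated).
newtype Contract e a = Contract { runContract :: HttpRequest -> e a }

instance Functor e => Functor (Contract e) where
  fmap f (Contract g) = Contract (fmap f . g)
instance Applicative e => Applicative (Contract e) where
  pure = Contract . const . pure
  Contract f <*> Contract g = Contract (\r -> f r <*> g r)
instance Monad e => Monad (Contract e) where
  Contract g >>= k = Contract (\r -> g r >>= \a -> runContract (k a) r)

-- Find a header by case-insensitive name and parse its value.
-- A missing header counts as a violation in this sketch.
parseHeader :: String -> (String -> Maybe a) -> Contract Maybe a
parseHeader name parse = Contract $ \req ->
  lookup (map toLower name)
         [ (map toLower k, v) | (k, v) <- rawHeaders req ] >>= parse

-- Invented application-level request type.
data MyRequest = MyRequest { noCache :: Bool, hostName :: String }
  deriving (Eq, Show)

contract :: Contract Maybe MyRequest
contract = do
  pragma <- parseHeader "Pragma" (\v -> Just (map toLower v == "no-cache"))
  host   <- parseHeader "Host" (\v -> if null v then Nothing else Just v)
  return (MyRequest pragma host)
```

Applying runContract contract to a request yields Just a MyRequest
when both headers are present and acceptable, and Nothing otherwise;
swapping Maybe for Either would let a contract report which header
failed.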

From agl at imperialviolet.org Mon Apr 14 12:14:02 2008
From: agl at imperialviolet.org (Adam Langley)
Date: Mon Apr 14 12:09:20 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To:
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
	<396556a20804131606h35c0cd75p2076fd20bdd67582@mail.gmail.com>
	<396556a20804132027p6d00e5bdw89518241a39a6ee3@mail.gmail.com>
Message-ID: <396556a20804140914o7e71f8cam73421444bc775a23@mail.gmail.com>

On Mon, Apr 14, 2008 at 4:54 AM, Daniel Yokomizo wrote:
> Both request and response accept any entity headers and 7.1 (of RFC
> 2616) says that a valid entity header is an extension header, which
> can be any kind of header.

I wasn't suggesting that other headers be dropped, just that they
remain as strings.

> IMHO it would be better to create a two layered approach. The bottom
> layer handles the request as a bunch of strings, just checks for
> structural correctness (i.e. break the headers by line and such)
> without checking if the headers are correct. The top layer provides a
> bunch of parser combinators to validate, parse and sanitize the
> request so a library can create its own contract:

Ok, I think I'm convinced by this argument. I'd hope that a standard
set of header parsers be defined, and that an application which only
cares about 2616 headers can call a single function to parse them
all, but I no longer advocate that the base interface use parsed
forms of headers.

Also, parsing URLs seems to be pretty uncontroversial (maybe parsing
key, value pairs from the path, maybe not)

AGL

--
Adam Langley agl@imperialviolet.org http://www.imperialviolet.org

From johan.tibell at gmail.com Fri Apr 18 17:19:04 2008
From: johan.tibell at gmail.com (Johan Tibell)
Date: Fri Apr 18 17:20:13 2008
Subject: RFC: A standardized interface between web servers and applications or frameworks (ala WSGI)
In-Reply-To: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
References: <90889fe70804130459u25bf182dvf4bec5ca69d6112e@mail.gmail.com>
Message-ID: <90889fe70804181419t2036319bhbf9800d44db8d551@mail.gmail.com>

First, apologies for not responding earlier. I spent my week at a
conference in Austria. Second, thanks for all the feedback! I thought
I'd go through some of my thoughts on the issues raised.

Just to try to reiterate the goals of this effort:

* To provide a common, no frills interface between web servers and
applications or frameworks to increase choice for application
developers.
* To make that interface easy enough to implement so current web
servers and frameworks will implement it. This is crucial for it being
adopted.
* Avoid design decisions that would limit the number of frameworks
that can use the interface. One example of a limiting decision would
be one that limits the maximal possible performance by using e.g.
inefficient data types.

I'll try to start with what seems to be the easier issues.

sendfile(2) support
===================

I would like to see this supported in the interface. I didn't include
it in the first draft as I didn't have a good idea of where to put it.
One idea would be to add the following field to the Environment
record:

sendfile :: Maybe (FD -> IO ())

Possibly with additional parameters as needed.
The reason that sendfile needs to be included in the environment instead of just a binding to the C function is that the Socket used for the connection is hidden from the application side and its use is abstracted by the input and output enumerators.

The other suggested solution (to return either an Enumerator or a file descriptor) might work better. I just wanted to communicate that I think it should be included.

Extension HTTP methods
======================

I did have extension methods in mind when I wrote the draft but didn't include them. I see two possible options.

1. Change the HTTP method enumeration to:

    data Method = Get | ... | ExtensionMethod ByteString

2. Treat all methods as bytestrings:

    type Method = ByteString

This treatment touches on the discussion on typing further down in this email. I still haven't thought enough about the consequences (if indeed there are any of any importance) of the two approaches.

The Enumerator type
===================

To recap, I proposed the following type for the Enumerator abstraction:

    type Enumerator = forall a. (a -> ByteString -> IO (Either a a)) -> a -> IO a

The IO monad is a must both in the return type of the Enumerator function and in the iteratee function (i.e. the first parameter of the enumerator). IO in the return type of the enumerator is a must since the server must perform I/O (i.e. reading from the client socket) to provide the input and the application might need to perform I/O to create the response.

The appearance of the IO monad in the iteratee functions is an optimization. It makes it possible for the server or application to act immediately when a chunk of data is received. This saves memory when large files are being sent as they can be written to disk/network immediately instead of being cached in memory.

There are some different design (and possibly performance) trade-offs that could be made.
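As a concrete reading of the Enumerator type recapped above, here is a minimal sketch, not part of the draft itself. The names enumChunks, totalLength and lengthUpTo are hypothetical, the chunks come from an in-memory list rather than a socket, and the convention that Left means "stop" and Right means "continue" is an assumption, since the draft text quoted here does not spell it out.

```haskell
{-# LANGUAGE RankNTypes #-}

import qualified Data.ByteString.Char8 as B

-- The Enumerator type as recapped in the draft.
type Enumerator =
  forall a. (a -> B.ByteString -> IO (Either a a)) -> a -> IO a

-- Hypothetical enumerator over an in-memory list of chunks; a real
-- server would read the chunks from the client socket instead.
-- Assumption: Left means "stop early", Right means "continue".
enumChunks :: [B.ByteString] -> Enumerator
enumChunks chunks iteratee = go chunks
  where
    go []       acc = return acc
    go (c : cs) acc = do
      step <- iteratee acc c
      case step of
        Left done  -> return done
        Right next -> go cs next

-- Iteratee that counts bytes and never stops early.
totalLength :: Enumerator -> IO Int
totalLength enum =
  enum (\n chunk -> return (Right (n + B.length chunk))) 0

-- Iteratee that stops once at least `limit` bytes have been seen.
lengthUpTo :: Int -> Enumerator -> IO Int
lengthUpTo limit enum = enum step 0
  where
    step n chunk =
      let n' = n + B.length chunk
      in return (if n' >= limit then Left n' else Right n')
```

The iteratee sees one chunk at a time together with the accumulator, which is why the type reads like a left fold with an early exit.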
The current enumerator type can be viewed as an unrolled State monad suggesting that it would be possible to change the type to:

    type Enumerator = forall t. MonadTrans t => (ByteString -> t IO (Either a a)) -> t IO a

which is a more general type allowing for an arbitrary monad stack. Some arguments against doing this:

* The unrolled state version is analogous to a left fold (and can indeed be seen as one) and should thus be familiar to all Haskell programmers.
* A, possibly unfounded, worry I have is that it might be hard to optimize away the extra abstraction layer, putting a performance tax on all applications, whether they use the extra flexibility or not.

It would be great if any of the Takusen authors (or Oleg since he wrote the enumerator paper) could comment on this.

Note: I haven't thought this one through. It was suggested to me on #haskell and I thought I should at least bring it up.

Extra environment variables
===========================

I've intended all along to include a field for remaining, optional extra pieces of information taken from e.g. the web server, the shell, etc. I haven't come up with a good name for this field but the idea is to add another field to the Environment:

    data Environment = Environment
        { ...
        , extraEnvironment :: [(ByteString, ByteString)]
        }

Typing and data types
=====================

Most discussions seem to, perhaps unsurprisingly, have centered around the use of data types and typing in general. Let me start by giving an assumption I've used when writing this draft:

Existing frameworks already have internal representations of the request URL, headers, etc. Changing these would be costly. Even if this was done I don't think it is possible to pick any one type that all frameworks could use to represent an HTTP request or even parts of a request. Different frameworks need different types.

Let me as an example use the Last-Modified header field. Assume we used named record fields for all headers:

    data Headers = Headers
        { ...
        , lastModified :: Maybe ???
        }

There is no type we could use for ??? that would be useful for all frameworks. There are several possible DateTime types with different design trade-offs. On the other hand, all frameworks likely already have a function to convert raw bytes to whatever internal representation is used in that particular framework.

Trying to provide more structured types than bytestrings appears to have two drawbacks:

1. It adds boilerplate type conversion code. No benefit is gained by the extra typing. Defining more types in the WAI interface adds complexity to the interface.
2. It adds an unnecessary performance penalty.

My suggestion is this: We use a minimal number of types in the interface and leave it up to higher levels to add these.

Summary
=======

I suggest that the overall design principle should be this: Give a data type (e.g. Environment) with a minimal amount of structure corresponding to the one given in the HTTP RFC plus some extra optional environment provided by the web server and the environment (e.g. shell) in which it is run.

I suggest we leave the raw bytestrings in the interface as interpreting them is best done by the framework. This also lends itself to an efficient implementation as the bytestrings in the environment could just be substrings (an O(1) operation) of the raw input read from the socket.

Let me make a slight reservation here and say that I might want to split the URL into two parts (e.g. SCRIPT_NAME and PATH_INFO, like in CGI and WSGI). The reason for doing this is that it makes it much easier to nest applications by having each layer consuming one part of the URL and leave the rest to the nested application. For example, consider the task of writing a URL dispatcher that picks different applications depending on the URL prefix:

    storeApp, adminApp, urlMap :: Application

    urlMap = mkUrlMap [("/admin", adminApp)
                      ,("/store", storeApp)
                      ]

    serve :: Application -> IO ()  -- Provided by the web server.
main = serve urlMap When a request for /store/items/1 reaches the URL mapper application it consumes the initial prefix (and puts it in scriptName) and leaves the remaining URL part in pathInfo. adminApp or storeApp can then use what's left in pathInfo to do further dispatching (to a handler function for example). Phew, this turned into a longer email than I thought. If I forgot to respond to any points raised please don't be afraid to raise them again. From duncan.coutts at worc.ox.ac.uk Sun Apr 20 16:22:56 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Sun Apr 20 16:17:53 2008 Subject: Specifying dependencies on Haskell code Message-ID: <1208722976.5960.101.camel@dell.linuxdev.us.dell.com> All, In the initial discussions on a common architecture for building applications and libraries one of the goals was to reduce or eliminate untracked dependencies. The aim being that you could reliably deploy a package from one machine to another. We settled on a fairly traditional model, where one specifies the names and versions of packages of Haskell code. An obvious alternative model is embodied in ghc --make and in autoconf style systems where you look in the environment not for packages but rather for specific modules or functions. Both models have passionate advocates. There are of course advantages and disadvantages to each. Both models seem to get implemented as reactions having the other model inflicted on the author. For example the current Cabal model of package names and versions was a reaction to the perceived problem of untracked dependencies with the ghc --make system. One could see implementations such as searchpath and franchise as reactions in the opposite direction. The advantages and disadvantages of specifying dependencies on module names vs package names and versions are mostly inverses. Module name clashes between packages are problematic with one system and not a problem with the other. 
Moving modules between packages is not a problem for one system and a massive pain for the other. The fact is that both module name and package name + version are being used as proxies to represent some vague combination of required Haskell interface and implementation thereof. Sometimes people intend only to specify an interface and sometimes people really want to specify (partial) semantics (eg to require a version of something including some bug fix / semantic change). In this situation the package version is being used to specify an implementation as a proxy for semantics. Neither are very good ways of identifying an interface or implementation/semantics. Modules do move from one package to another without fundamentally changing. Modules do change interface and semantics without changing name. There is no guarantee about the relationship between a package's version and its interface or semantics though there are some conventions. Another view would be to try and identify the requirements about dependent code more accurately. For example to view modules as functors and look at what interface they require of the modules they import. Then we can say that they depend on any module that provides a superset of that interface. It doesn't help with semantics of course. Dependencies like these are not so compact and easy to write down. I don't have any point here exactly, except that there is no obvious solution. I guess I'd like to provoke a bit of a discussion on this, though hopefully not just rehashing known issues. In particular if people have any ideas about how we could improve either model to address their weak points then that'd be well worth discussing. For example the package versioning policy attempts to tighten the relationship between a package version and changes in its interface and semantics. It still does not help at all with modules moving between packages. 
Duncan From dons at galois.com Sun Apr 20 18:09:33 2008 From: dons at galois.com (Don Stewart) Date: Sun Apr 20 18:04:38 2008 Subject: Announce: bytestring 0.9.1.0 Message-ID: <20080420220933.GA29143@scytale.galois.com> Hey all, I'm pleased to announce a new major release of bytestring, the efficient string library for Haskell, suitable for high-performance scenarios. This release is primarily an (incremental) performance improvement release, though with some notable significant improvements, along with long term test coverage and quality control changes. Highlights: * a long term performance bug with Ord instances, involving very small strings, and Data.Map has been squashed. * everything's a little faster -- shootout problems showed a 1-5% speedup just by switching to the new library. Thanks goes to the Hac4 Haskell Hackathon organisers, in Gothenburg, Sweden, where the majority of this work to create this release took place. Key changes: * Data.Map short key performance greatly improved: - 'words Map' running time: 6.310s bytestring 0.9.0.1 1.071s bytestring 0.9.1.0 * Uses cheaper unsafeDupablePerformIO for allocation. - tail recursive tight loops (fixes obscure stack overflow) * Generally faster: - Shootout sum-file: 1.218s to 1.190s - Shooout fasta: 9.210s to 8.811s * 4-5x faster small substring search (breakSubstring/findSubstring/isInfixOf). * Extensive QuickCheck coverage reporting and improvements: - http://code.haskell.org/~dons/tests/bytestring/hpc_index.html Get the code: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bytestring Note that if you upgrade to the new bytestring release, older packages built against previous releases will still require the old bytestring package. For best results, rebuild any bytestring-depending packages against the new library only. 
Cheers, Don From gwern0 at gmail.com Sun Apr 20 18:48:28 2008 From: gwern0 at gmail.com (Gwern Branwen) Date: Sun Apr 20 18:46:34 2008 Subject: [Haskell-cafe] Announce: bytestring 0.9.1.0 In-Reply-To: <20080420220933.GA29143@scytale.galois.com> References: <20080420220933.GA29143@scytale.galois.com> Message-ID: <20080420224828.GA7794@localhost> On 2008.04.20 15:09:33 -0700, Don Stewart scribbled 1.7K characters: > > Hey all, > > I'm pleased to announce a new major release of bytestring, the efficient > string library for Haskell, suitable for high-performance scenarios. > > This release is primarily an (incremental) performance improvement > release, though with some notable significant improvements, along with > long term test coverage and quality control changes. > > Highlights: > > * a long term performance bug with Ord instances, involving very > small strings, and Data.Map has been squashed. > > * everything's a little faster -- shootout problems showed a 1-5% > speedup just by switching to the new library. > > Thanks goes to the Hac4 Haskell Hackathon organisers, in Gothenburg, > Sweden, where the majority of this work to create this release took place. > > Key changes: > > * Data.Map short key performance greatly improved: > - 'words Map' running time: > 6.310s bytestring 0.9.0.1 > 1.071s bytestring 0.9.1.0 > > * Uses cheaper unsafeDupablePerformIO for allocation. > - tail recursive tight loops (fixes obscure stack overflow) > > * Generally faster: > - Shootout sum-file: 1.218s to 1.190s > - Shooout fasta: 9.210s to 8.811s > > * 4-5x faster small substring search (breakSubstring/findSubstring/isInfixOf). 
> > * Extensive QuickCheck coverage reporting and improvements: > - http://code.haskell.org/~dons/tests/bytestring/hpc_index.html > > Get the code: > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bytestring > > Note that if you upgrade to the new bytestring release, older packages > built against previous releases will still require the old bytestring > package. For best results, rebuild any bytestring-depending packages > against the new library only. > > Cheers, > Don That's all good news; will this release of ByteString be used for GHC 6.8.3? I'm a little tired of linking everything against 0.9.0.1 just so I can use Yi (since GHC/the-GHC-API links against it). :) -- gwern NSDM USP Edens SAS kibo quarter NSES Gamma MP5k threat -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://www.haskell.org/pipermail/libraries/attachments/20080420/6928d11e/attachment.bin From fox at ucw.cz Mon Apr 21 03:38:42 2008 From: fox at ucw.cz (Milan Straka) Date: Mon Apr 21 03:33:41 2008 Subject: Data.Map improvement patch Message-ID: <20080421073842.GA24983@atrey.karlin.mff.cuni.cz> Hello, Recently I was going through the Data.Set implementation and I was taken aback by setting the parameter \delta to 4. The implementation is based on Adams' bounded trees and uses ratio \alpha=2, in which case the paper clearly states \delta must be at least 4.646 (which is written in the source file Set.hs several lines above the assignment \delta=4; Map.hs uses \delta=5). So I read that Adams' paper again and found out, that the analysis described there contains two major mistakes [it is good to read the analysis from the paper now if you want to follow, pages 26 to 31]. 1) The analysis assumes that just before the balancing rotation, the heavier subtree of the unbalanced vertex was inserted to, i.e. the balance is ``off by 1''. 
But when the lighter subtree is deleted from, the balance is ``off by \delta'', which is worse. When taken into account, it results in equation $\delta\le{n\over\alpha}+1$, which must hold for each positive integer n. 2) The cases of empty subtrees are not analysed. When considered, it follows that for \alpha=2 a single rotation can restore balance only when \delta<5. [tree with 1 left vertex and 5 right, delete the left one and voila...] Right now I was puzzled why both Data.Set and Data.Map implementations seem to work. *) Data.Set: The constraint from the paper \delta >= 4.646 comes from the inequality $(1/\alpha) >= {\delta(1+{1\over n})+1\over\delta^2-1}$, which must hold for each positive integer n. The worst case is n=1, which is chosen in the paper. But when using n=2, the resulting constraint is \delta>3.792. The remaining case n=1 can be solved by examining all possible situations for \alpha<=2 and \delta=4, which all reveal to be correctly balanced using a~single rotation. Also the new constraint $\delta\le{n\over\alpha}+1$ can be satisfied for \alpha=2 and \delta=4 by examining small cases separately. The thing is the inequalities from the paper do not contain the floor function. Sure, for general parameters it is difficult to handle, but for the specific parameters the results are much better when truncation is taken into account. *) Data.Map: The constraint \delta<5 for \alpha=2 is not possible to change without changing the algorithm. But it turns out that the Data.Map implementation performs the balancing rotations not when one subtree is more than \delta times heavier than the other as described in the paper, but when one subtree is at least \delta times heavier. So the Data.Map implementation behaves like the analysed implementation with \delta just a bit smaller than 5. Ok, so both 4 and 5 work for the implementation of Data.Map and Data.Set. Which one is better? This parameter affects the height of a tree. 
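The effect of \delta on the worst-case height can be estimated numerically. The following is a minimal sketch under one assumption: in the worst case every step down the tree keeps at most a \delta/(\delta+1) fraction of the nodes, so the height is roughly log_{(\delta+1)/\delta} n. The function name heightFactor is hypothetical and only expresses that height as a multiple of log_2 n.

```haskell
-- Worst-case height of a weight-balanced tree, expressed as a
-- multiple of log_2 n, assuming each level keeps at most a
-- delta/(delta+1) fraction of the nodes (an assumption of this sketch).
heightFactor :: Double -> Double
heightFactor delta = 1 / logBase 2 ((delta + 1) / delta)

-- Factors for the two parameter choices under discussion.
factorDelta4, factorDelta5 :: Double
factorDelta4 = heightFactor 4
factorDelta5 = heightFactor 5
```

A smaller \delta gives a smaller factor, i.e. a shallower worst-case tree, at the price of more frequent rotations.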
With \delta=4 the worst case is 1:4, so the height is log_{5/4}n ~ 3.1 log_2 n. With \delta=5 the height is log_{6/5}n ~ 3.8 log_2 n.

In Haskell, a lot of operations must copy a whole path in a tree -- so a lower tree means less time and memory needed. There are more rotations, but we have to construct new nodes anyway, so one would expect a minor advantage of \delta=4 over \delta=5.

It is difficult to measure these things in Haskell for me (just adding a SPECIALIZE or increasing/decreasing heap size can change order of execution time), but anyway, here are times of 50000 sequential inserts, then 800000 lookups and 50000 deletes (the source is attached for the freaks):

Data.Set with \delta=4    180  300  72
Data.Set with \delta=5    208  304  76
Data.Map with \delta=4    212  328  84
Data.Map with \delta=5    256  324  88
IntSet                     56  356  48
IntMap                     56  360  48

So I would suggest changing "delta = 5" to "delta = 4" in the Data.Map.hs and leaving "delta = 4" in the Data.Set.hs file. The darcs patch is included :)

Any comments are welcome,
Milan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: data.map.4.bench.tgz
Type: application/x-gtar
Size: 56140 bytes
Desc: not available
Url : http://www.haskell.org/pipermail/libraries/attachments/20080421/8f3cd4e7/data.map.4.bench-0001.gtar
-------------- next part --------------
A non-text attachment was scrubbed...
Name: data.map.4.patch.gz Type: application/octet-stream Size: 22468 bytes Desc: not available Url : http://www.haskell.org/pipermail/libraries/attachments/20080421/8f3cd4e7/data.map.4.patch-0001.obj From duncan.coutts at worc.ox.ac.uk Mon Apr 21 05:26:16 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Mon Apr 21 05:21:08 2008 Subject: Data.Map improvement patch In-Reply-To: <20080421073842.GA24983@atrey.karlin.mff.cuni.cz> References: <20080421073842.GA24983@atrey.karlin.mff.cuni.cz> Message-ID: <1208769976.5960.156.camel@dell.linuxdev.us.dell.com> On Mon, 2008-04-21 at 09:38 +0200, Milan Straka wrote: > It is difficult to measure these things in Haskell for me (just adding a > SPECIALIZE of increasing/decreasing heap size can change order of execution > time), but anyway, here are times of 50000 sequential inserts, then 800000 > lookups and 50000 deletes (the source is attached for the freaks): > Data.Set with \delta=4 180 300 72 > Data.Set with \delta=5 208 304 76 > Data.Map with \delta=4 212 328 84 > Data.Map with \delta=5 256 324 88 > IntSet 56 356 48 > IntMap 56 360 48 Sorry, I'm confused, what are the columns here exactly? 
Duncan

From fox at ucw.cz Mon Apr 21 07:29:12 2008
From: fox at ucw.cz (Milan Straka)
Date: Mon Apr 21 07:24:07 2008
Subject: Data.Map improvement patch
In-Reply-To: <1208769976.5960.156.camel@dell.linuxdev.us.dell.com>
References: <20080421073842.GA24983@atrey.karlin.mff.cuni.cz> <1208769976.5960.156.camel@dell.linuxdev.us.dell.com>
Message-ID: <20080421112912.GA6105@atrey.karlin.mff.cuni.cz>

Hello,

>
> On Mon, 2008-04-21 at 09:38 +0200, Milan Straka wrote:
>
> > It is difficult to measure these things in Haskell for me (just adding a
> > SPECIALIZE of increasing/decreasing heap size can change order of execution
> > time), but anyway, here are times of 50000 sequential inserts, then 800000
> > lookups and 50000 deletes (the source is attached for the freaks):
> > Data.Set with \delta=4    180  300  72
> > Data.Set with \delta=5    208  304  76
> > Data.Map with \delta=4    212  328  84
> > Data.Map with \delta=5    256  324  88
> > IntSet                     56  356  48
> > IntMap                     56  360  48
>
> Sorry, I'm confused, what are the columns here exactly?

Sorry, I should have made myself more clear. The first column is the time of 50000 sequential inserts ([1..50000]), the second column is the time of 800000 lookups [bigger number because they are fast, 800k=16*50k], the third column is the time of 50000 sequential deletes ([1..50000]). Everything is measured in ms.

One can think of a test with random sequence -- the source can be modified easily to support it. The results are (in the same formatting)

Random seq; Data.Set with \delta=4    300  636  256
Random seq; Data.Set with \delta=5    300  644  256
Random seq; Data.Map with \delta=4    372  692  268
Random seq; Data.Map with \delta=5    376  700  272
Random seq; IntSet                    216  572  168
Random seq; IntMap                    228  584  176

As one should expect, they are inconclusive. That's because when the data inserted are truly random, the rebalancing is not really needed. (What is more interesting is the time bloat of all tests.
I think it might be caused by being able to inline [1..50000] better than random list... Maybe a list fusion? I do not really know.) Hello, Milan PS: Here is a diff of the random test: --- Test.hs 2008-04-21 09:33:55.000000000 +0200 +++ TestNew.hs 2008-04-21 13:17:21.000000000 +0200 @@ -8,10 +8,17 @@ import System.CPUTime +import System.Random force_list list = sum list {-# SPECIALIZE force_list::[Int]->Int #-} +gen_rnd n = rnd (mkStdGen 42) (Map4.fromList [(b,0)|b<-[1..n]]) where + rnd g m | Map4.null m = [] + rnd g m = let n = Map4.size m + (i,g') = randomR (0,n-1) g + in fst (Map4.elemAt i m) : rnd g' (Map4.deleteAt i m) + test::Int->c->(Int->c->c)->(Int->c->Bool)->(Int->c->c)->(c->[Int])->IO () @@ -20,23 +27,23 @@ test n cs insert find delete tolist = do - let ls=[1..n] + let ls=gen_rnd n ts<-force_list ls `seq` getCPUTime From dons at galois.com Mon Apr 21 14:11:19 2008 From: dons at galois.com (Don Stewart) Date: Mon Apr 21 14:06:27 2008 Subject: ANNOUNCE: Galois web libraries for Haskell released Message-ID: <20080421181119.GB8601@scytale.galois.com> Galois, Inc. is pleased to announce the open source release of a suite of web programming libraries for Haskell! The following libraries are available, providing support for a wide range of Haskell web programming scenarios: * json JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999. This library provides a validating parser and pretty printer for converting between Haskell values and JSON. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/json * xml A simple, lightweight XML parser/generator. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/xml * utf8-string A UTF8 layer for IO and Strings. 
The utf8-string package provides operations for encoding UTF8 strings to Word8 lists and back, and for reading and writing UTF8 without truncation. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/utf8-string * selenium Haskell bindings to communicate with a Selenium Remote Control server. This package makes it possible to use Haskell to write test scripts that exercise web applications through a web browser. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/selenium * curl libcurl is a client-side URL transfer library, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, LDAP, LDAPS and FILE. libcurl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, Kerberos4), file transfer resume, http proxy tunneling and more! This package provides a Haskell binding to libcurl. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/curl * sqlite Haskell binding to sqlite3 , a light, fast database. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/sqlite * feed Interfacing with RSS (v 0.9x, 2.x, 1.0) and Atom feeds http://hackage.haskell.org/cgi-bin/hackage-scripts/package/feed * mime Haskell support for working with MIME types. http://hackage.haskell.org/cgi-bin/hackage-scripts/package/mime Together these fill in a big chunk of the web programming stack for Haskell. Get the code! You can find all the cabalised packages on hackage.haskell.org. About: Galois researches, designs and develops high assurance technologies for security-critical systems, networks and applications. We use Haskell as a primary development tool for producing robust components for a diverse range of clients. Web-based technologies are increasingly important in this area, and we believe Haskell has a key role to play in the production of reliable, secure web software. 
The culture of correctness Haskell encourages is ideally suited to web programming, where issues of security, authentication, privacy and protection of resources abound. In particular, Haskell's type system makes possible strong static guarantees about access to resources, critical to building reliable web applications. We hope that the release of this suite of libraries to the community will push further the adoption of Haskell in the domain of web programming. This release brought to you by: Iavor Diatchki Trevor Elliott Sigbjorn Finne Andy Gill Eric Mertens Isaac Potoczny-Jones Don Stewart Aaron Tomb From bertram.felgenhauer at googlemail.com Mon Apr 21 17:16:23 2008 From: bertram.felgenhauer at googlemail.com (Bertram Felgenhauer) Date: Mon Apr 21 17:11:23 2008 Subject: proposal: reimplement Data.Graph.Inductive.Query.Dominators Message-ID: <20080421211623.GA4360@zombie.inf.tu-dresden.de> Hi, (this is Trac ticket #2227, http://hackage.haskell.org/trac/ghc/ticket/2227) As pointed out at http://www.haskell.org/pipermail/haskell-cafe/2008-April/041739.html ff., Data.Graph.Inductive.Query.Dominators.dom is buggy. Furthermore, it's slow, so instead of submitting the quick fix from that thread, I've rewritten the module from scratch using a more efficient algorithm. The algorithm works by calculating the immediate dominators of the graph nodes first, so the patch also adds a function that returns those. It should be handy for flow graph analysis. Deadline for discussion ... is a week sufficient? That would be April 28th. 
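The relationship between immediate dominators and full dominator sets mentioned in the proposal can be sketched as follows. This is not the patch's code and does not use the fgl API: the name dominators and the representation of the result as a Data.Map from each node to its immediate dominator are assumptions made for illustration, and the map is assumed to form a tree rooted at a node with no entry.

```haskell
import qualified Data.Map as M

-- Hypothetical representation: each non-root node maps to its
-- immediate dominator; the root has no entry.
type IDom = M.Map Int Int

-- All dominators of a node, obtained by walking up the chain of
-- immediate dominators until the root is reached.
-- Assumption: the map is acyclic, otherwise this does not terminate.
dominators :: IDom -> Int -> [Int]
dominators idom n = n : maybe [] (dominators idom) (M.lookup n idom)

-- Small example flow graph: 1 is the root, 2 is dominated by 1,
-- and both 3 and 4 have 2 as their immediate dominator.
exampleIDom :: IDom
exampleIDom = M.fromList [(2, 1), (3, 2), (4, 2)]
```

With this representation, computing immediate dominators once is enough to answer every "which nodes dominate n" query by a short walk towards the root.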
regards,
Bertram

From dons at galois.com Mon Apr 21 18:35:44 2008
From: dons at galois.com (Don Stewart)
Date: Mon Apr 21 18:30:41 2008
Subject: proposal: reimplement Data.Graph.Inductive.Query.Dominators
In-Reply-To: <20080421211623.GA4360@zombie.inf.tu-dresden.de>
References: <20080421211623.GA4360@zombie.inf.tu-dresden.de>
Message-ID: <20080421223544.GJ8601@scytale.galois.com>

bertram.felgenhauer:
> Hi,
>
> (this is Trac ticket #2227,
> http://hackage.haskell.org/trac/ghc/ticket/2227)
>
> As pointed out at
> http://www.haskell.org/pipermail/haskell-cafe/2008-April/041739.html
> ff., Data.Graph.Inductive.Query.Dominators.dom is buggy. Furthermore,
> it's slow, so instead of submitting the quick fix from that thread,
> I've rewritten the module from scratch using a more efficient algorithm.
>
> The algorithm works by calculating the immediate dominators of the graph
> nodes first, so the patch also adds a function that returns those. It
> should be handy for flow graph analysis.
>
> Deadline for discussion ... is a week sufficient?
> That would be April 28th.
>

fgl is maintained by Martin Erwig; for these kinds of non-core libraries it's easier/faster to submit the patch directly to the maintainer.

If the maintainer times out (possible here), we can declare it orphaned, and it's free game.
-- Don From ross at soi.city.ac.uk Mon Apr 21 18:42:00 2008 From: ross at soi.city.ac.uk (Ross Paterson) Date: Mon Apr 21 18:36:58 2008 Subject: proposal: reimplement Data.Graph.Inductive.Query.Dominators In-Reply-To: <20080421223544.GJ8601@scytale.galois.com> References: <20080421211623.GA4360@zombie.inf.tu-dresden.de> <20080421223544.GJ8601@scytale.galois.com> Message-ID: <20080421224200.GA6255@soi.city.ac.uk> On Mon, Apr 21, 2008 at 03:35:44PM -0700, Don Stewart wrote: > bertram.felgenhauer: > > (this is Trac ticket #2227, > > http://hackage.haskell.org/trac/ghc/ticket/2227) > > fgl is maintained by Martin Erwig, for these kind of non-core libraries > its easier/faster to submit the patch directly to the maintainer. Yes, library proposal tickets are for centrally maintained packages, i.e. those with a Maintainer field of libraries@haskell.org. From bertram.felgenhauer at googlemail.com Mon Apr 21 19:25:13 2008 From: bertram.felgenhauer at googlemail.com (Bertram Felgenhauer) Date: Mon Apr 21 19:20:14 2008 Subject: proposal: reimplement Data.Graph.Inductive.Query.Dominators In-Reply-To: <20080421223544.GJ8601@scytale.galois.com> References: <20080421211623.GA4360@zombie.inf.tu-dresden.de> <20080421223544.GJ8601@scytale.galois.com> Message-ID: <20080421232513.GC4360@zombie.inf.tu-dresden.de> Don Stewart wrote: > fgl is maintained by Martin Erwig, for these kind of non-core libraries > its easier/faster to submit the patch directly to the maintainer. Well, the fgl version that I had in mind is hosted at http://darcs.haskell.org/packages/fgl it's an extra (but not, granted, a core) library in ghc, and the darcs repository lists libraries@haskell.org as the address to send patches to. Interestingly, at least part of the bug that prompted the discussion is not present in the version from Martin Erwig's homepage. On the other hand, that version does not work with ghc 6.8.2, due to MArray changes. 
> If the maintainer times out (possible here), we can declare it orphaned, > and its free game. Fair enough. Bertram From ahey at iee.org Mon Apr 21 19:27:50 2008 From: ahey at iee.org (Adrian Hey) Date: Mon Apr 21 19:22:45 2008 Subject: Data.Map improvement patch In-Reply-To: <20080421073842.GA24983@atrey.karlin.mff.cuni.cz> References: <20080421073842.GA24983@atrey.karlin.mff.cuni.cz> Message-ID: <480D22F6.3050301@iee.org> Milan Straka wrote: > Any comments are welcome, Maybe just trash the lot and use AVL trees instead? :-) Regards -- Adrian Hey From bertram.felgenhauer at googlemail.com Mon Apr 21 20:25:59 2008 From: bertram.felgenhauer at googlemail.com (Bertram Felgenhauer) Date: Mon Apr 21 20:20:54 2008 Subject: proposal: reimplement Data.Graph.Inductive.Query.Dominators In-Reply-To: <20080421232513.GC4360@zombie.inf.tu-dresden.de> References: <20080421211623.GA4360@zombie.inf.tu-dresden.de> <20080421223544.GJ8601@scytale.galois.com> <20080421232513.GC4360@zombie.inf.tu-dresden.de> Message-ID: <30e962d20804211725x47db5f7awaa49eb1a4e87035@mail.gmail.com> I wrote: > Interestingly, at least part of the bug that prompted the discussion > is not present in the version from Martin Erwig's homepage. Never mind. The bug is present, I just remembered it wrong. > > If the maintainer times out (possible here), we can declare it orphaned, > > and its free game. > > Fair enough. I've contacted Martin Erwig now, and left a comment to that effect on the Trac ticket. regards, Bertram From jgoerzen at complete.org Tue Apr 22 09:20:09 2008 From: jgoerzen at complete.org (John Goerzen) Date: Tue Apr 22 09:15:19 2008 Subject: [Haskell-cafe] ANNOUNCE: Galois web libraries for Haskell released In-Reply-To: <20080421181119.GB8601@scytale.galois.com> References: <20080421181119.GB8601@scytale.galois.com> Message-ID: <200804220820.10015.jgoerzen@complete.org> On Mon April 21 2008 1:11:19 pm Don Stewart wrote: > Galois, Inc. 
is pleased to announce the open source release of a suite of > web programming libraries for Haskell! Lots of cool stuff here! A few questions: > * xml > A simple, lightweight XML parser/generator. > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/xml Can you describe how this compares to HaXml? Were there deficiencies in HaXml? > * sqlite > Haskell binding to sqlite3 , a light, fast > database. > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/sqlite Similar questions here regarding HDBC. Did HDBC (and HDBC-sqlite3) not address some need? > * feed > Interfacing with RSS (v 0.9x, 2.x, 1.0) and Atom feeds > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/feed Sweet. Might have to refactor hpodder to use this. > > * mime > Haskell support for working with MIME types. > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/mime FWIW, I have some similar but slightly different functions in MissingH. http://software.complete.org/static/missingh/doc//MissingH/Data-MIME-Types.html hsemail and WASH both also have some stuff in this area. Probably not as nice as yours though. -- John From dons at galois.com Tue Apr 22 13:20:34 2008 From: dons at galois.com (Don Stewart) Date: Tue Apr 22 13:15:56 2008 Subject: [Haskell-cafe] ANNOUNCE: Galois web libraries for Haskell released In-Reply-To: <200804220820.10015.jgoerzen@complete.org> References: <20080421181119.GB8601@scytale.galois.com> <200804220820.10015.jgoerzen@complete.org> Message-ID: <20080422172034.GA27147@scytale.galois.com> jgoerzen: > On Mon April 21 2008 1:11:19 pm Don Stewart wrote: > > Galois, Inc. is pleased to announce the open source release of a suite of > > web programming libraries for Haskell! > > Lots of cool stuff here! A few questions: > > > * xml > > A simple, lightweight XML parser/generator. > > > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/xml > > Can you describe how this compares to HaXml? Were there deficiencies in > HaXml? 
Much smaller, fewer dependencies. I think of it as the "tagsoup" of xml parsers. > > * sqlite > > Haskell binding to sqlite3 , a light, fast > > database. > > > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/sqlite > > Similar questions here regarding HDBC. Did HDBC (and HDBC-sqlite3) not > address some need? Yes, we needed full, low-level access to sqlite for some unusual use cases. For high level stuff, HDBC and Takusen are nicer. > > * feed > > Interfacing with RSS (v 0.9x, 2.x, 1.0) and Atom feeds > > > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/feed > > Sweet. Might have to refactor hpodder to use this. > > > > > * mime > > Haskell support for working with MIME types. > > > > http://hackage.haskell.org/cgi-bin/hackage-scripts/package/mime > > FWIW, I have some similar but slightly different functions in MissingH. > > http://software.complete.org/static/missingh/doc//MissingH/Data-MIME-Types.html > > hsemail and WASH both also have some stuff in this area. Probably not as > nice as yours though. We should probably bundle up a bunch of the small mime libs into a single package at some point. -- Don From jgoerzen at complete.org Tue Apr 22 13:32:50 2008 From: jgoerzen at complete.org (John Goerzen) Date: Tue Apr 22 13:27:58 2008 Subject: [Haskell-cafe] ANNOUNCE: Galois web libraries for Haskell released In-Reply-To: <20080422172034.GA27147@scytale.galois.com> References: <20080421181119.GB8601@scytale.galois.com> <200804220820.10015.jgoerzen@complete.org> <20080422172034.GA27147@scytale.galois.com> Message-ID: <200804221232.51444.jgoerzen@complete.org> On Tue April 22 2008 12:20:34 pm Don Stewart wrote: > Yes, we needed full, low-level access to sqlite for some unusual use > cases. For high level stuff, HDBC and Takusen are nicer. Can you elaborate on these use cases? I would like to either add support for them to HDBC-sqlite3, or perhaps make HDBC-sqlite3 a wrapper around your library. 
-- John From dons at galois.com Tue Apr 22 13:35:10 2008 From: dons at galois.com (Don Stewart) Date: Tue Apr 22 13:30:34 2008 Subject: [Haskell-cafe] ANNOUNCE: Galois web libraries for Haskell released In-Reply-To: <200804221232.51444.jgoerzen@complete.org> References: <20080421181119.GB8601@scytale.galois.com> <200804220820.10015.jgoerzen@complete.org> <20080422172034.GA27147@scytale.galois.com> <200804221232.51444.jgoerzen@complete.org> Message-ID: <20080422173510.GC27147@scytale.galois.com> jgoerzen: > On Tue April 22 2008 12:20:34 pm Don Stewart wrote: > > > Yes, we needed full, low-level access to sqlite for some unusual use > > cases. For high level stuff, HDBC and Takusen are nicer. > > Can you elaborate on these use cases? I would like to either add support for > them to HDBC-sqlite3, or perhaps make HDBC-sqlite3 a wrapper around your > library. Strange hardware. So we needed to be able to monkey around somewhat. It might make sense to wrap our sqlite3 binding with HDBC-sqlite3 though, so you don't need to maintain your own sqlite binding. -- Don From marlowsd at gmail.com Tue Apr 22 18:19:36 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Tue Apr 22 18:14:37 2008 Subject: Proposal: overhaul System.Process Message-ID: <480E6478.5040209@gmail.com> I've made some improvements to System.Process that I'd like to get feedback on. Everything so far is backwards compatible in the sense that I've only added to the API - everything that was there before is still available, with the same semantics (except where bugs have been fixed). 
Haddock for the proposed new System.Process:

http://darcs.haskell.org/~simonmar/process/System-Process.html

Ticket:

http://hackage.haskell.org/trac/ghc/ticket/2233

Discussion period: 4 weeks (20 May)

Summary of changes:

Tue Apr 22 15:02:16 PDT 2008 Simon Marlow
  * Overhaul System.Process

  - fix #1780: pipes created by runInteractiveProcess are set
    close-on-exec by default

  - add a new, more general, form of process creation: createProcess
    Each of stdin, stdout and stderr may individually be taken
    from existing Handles or attached to new pipes. Also it
    has a nicer API.

  - add readProcess from Don Stewart's newpopen package. This
    function behaves like C's popen().

  - Move System.Cmd.{system,rawSystem} into System.Process. Later
    we can deprecate System.Cmd.

  - Don't use O_NONBLOCK for pipes, as it can confuse the process
    attached to the pipe (requires a fix to GHC.Handle in the base
    package).

  - move the tests from the GHC testsuite into the package itself,
    and add a couple more

  - bump the version to 2.0

From ndmitchell at gmail.com Tue Apr 22 18:29:21 2008
From: ndmitchell at gmail.com (Neil Mitchell)
Date: Tue Apr 22 18:24:12 2008
Subject: Proposal: overhaul System.Process
In-Reply-To: <480E6478.5040209@gmail.com>
References: <480E6478.5040209@gmail.com>
Message-ID: <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com>

Hi

> I've made some improvements to System.Process that I'd like to get feedback
> on.

It looks a lot nicer! I may be able to stop my standard trick of
system "cmd > stdout.txt 2> stderr.txt" then readFile.

The only function I was a bit concerned with was readProcess:

readProcess :: FilePath -> [String] -> String -> IO (Either ExitCode String)

I would have thought (ExitCode,String) was more appropriate. This
interface means that readProcess cannot be lazy, as it must have the
ExitCode before it generates the Right. Additionally, it's probably
quite important to have the output if something fails.
I'd also like clarification if the result string is the stdout handle, or both stdout and stderr - I can see arguments for both variants, so perhaps both could be provided? Thanks Neil From bos at serpentine.com Tue Apr 22 18:35:42 2008 From: bos at serpentine.com (Bryan O'Sullivan) Date: Tue Apr 22 18:30:33 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> Message-ID: <480E683E.10703@serpentine.com> Neil Mitchell wrote: > I would have thought (ExitCode,String) was more appropriate. Yes, definitely. What happens to stderr with this function, by the way? Is it tied to stdout (probably the right thing to do), or to /dev/null, or is it closed (eek!)? The haddock should make that clear. It would be useful if there was a readProcess variant that gave back a String each for stdout and stderr. References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> Message-ID: <480E6A87.4030702@gmail.com> Bryan O'Sullivan wrote: > Neil Mitchell wrote: > >> I would have thought (ExitCode,String) was more appropriate. > > Yes, definitely. Good point. Although I'm not sure I'm keen on readProcess being lazy (but you can have a lazy variant if you want). > What happens to stderr with this function, by the way? > Is it tied to stdout (probably the right thing to do), or to /dev/null, > or is it closed (eek!)? None of the above :) Currently it's inherited from the parent. Unfortunately it's not easy to tie stderr and stdout to the same pipe - createProcess can't do that, and readProcess is defined in terms of it. > It would be useful if there was a readProcess variant that gave back a > String each for stdout and stderr. Would it be reasonable for that to be the only variant? 
Cheers,
Simon

From duncan.coutts at worc.ox.ac.uk Tue Apr 22 18:49:42 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Tue Apr 22 18:44:36 2008
Subject: Proposal: overhaul System.Process
In-Reply-To: <480E683E.10703@serpentine.com>
References: <480E6478.5040209@gmail.com>
 <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com>
 <480E683E.10703@serpentine.com>
Message-ID: <1208904582.27748.117.camel@localhost>

On Tue, 2008-04-22 at 15:35 -0700, Bryan O'Sullivan wrote:
> Neil Mitchell wrote:
>
> > I would have thought (ExitCode,String) was more appropriate.
>
> Yes, definitely.

Yes, I mentioned this to Don previously when he published his popen
code. I think he agreed.

Duncan

From dons at galois.com Tue Apr 22 18:52:17 2008
From: dons at galois.com (Don Stewart)
Date: Tue Apr 22 18:47:12 2008
Subject: Proposal: overhaul System.Process
In-Reply-To: <1208904582.27748.117.camel@localhost>
References: <480E6478.5040209@gmail.com>
 <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com>
 <480E683E.10703@serpentine.com>
 <1208904582.27748.117.camel@localhost>
Message-ID: <20080422225217.GJ27147@scytale.galois.com>

duncan.coutts:
>
> On Tue, 2008-04-22 at 15:35 -0700, Bryan O'Sullivan wrote:
> > Neil Mitchell wrote:
> >
> > > I would have thought (ExitCode,String) was more appropriate.
> >
> > Yes, definitely.
>
> Yes, I mentioned this to Don previously when he published his popen
> code. I think he agreed.
>
> Duncan

I'd changed, but not pushed out, process-light:

    --
    -- | readProcess forks an external process, reads its standard output
    -- strictly, blocking until the process terminates, and returns either the output
    -- string, or, in the case of non-zero exit status, an error code, and
    -- any output.
    --
    -- Output is returned strictly, so this is not suitable for
    -- interactive applications.
    --
    -- Users of this library should compile with -threaded if they
    -- want other Haskell threads to keep running while waiting on
    -- the result of readProcess.
    --
    -- > > readProcess "date" [] []
    -- > Right "Thu Feb 7 10:03:39 PST 2008\n"
    --
    -- The arguments are:
    --
    -- * The command to run, which must be in the $PATH, or an absolute path
    --
    -- * A list of separate command line arguments to the program
    --
    -- * A string to pass on the standard input to the program.
    --
    readProcess :: FilePath                 -- ^ command to run
                -> [String]                 -- ^ any arguments
                -> String                   -- ^ standard input
                -> IO (Either (ExitCode,String) String) -- ^ either the stdout, or an exitcode and any output

    readProcess cmd args input = C.handle (return . handler) $ do
        (inh,outh,errh,pid) <- runInteractiveProcess cmd args Nothing Nothing
        output  <- hGetContents outh
        outMVar <- newEmptyMVar
        forkIO $ (C.evaluate (length output) >> putMVar outMVar ())
        when (not (null input)) $ hPutStr inh input
        takeMVar outMVar
        ex <- C.catch (waitForProcess pid) (\_e -> return ExitSuccess)
        hClose outh
        hClose inh  -- done with stdin
        hClose errh -- ignore stderr

        return $ case ex of
            ExitSuccess   -> Right output
            ExitFailure _ -> Left (ex, output)

      where
        handler (C.ExitException e) = Left (e,"")
        handler e                   = Left (ExitFailure 1, show e)

From ndmitchell at gmail.com Tue Apr 22 18:54:58 2008
From: ndmitchell at gmail.com (Neil Mitchell)
Date: Tue Apr 22 18:49:50 2008
Subject: Proposal: overhaul System.Process
In-Reply-To: <480E6A87.4030702@gmail.com>
References: <480E6478.5040209@gmail.com>
 <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com>
 <480E683E.10703@serpentine.com> <480E6A87.4030702@gmail.com>
Message-ID: <404396ef0804221554x68e08082i46cd487f4cc8ff52@mail.gmail.com>

Hi

> > What happens to stderr with this function, by the way?
> > Is it tied to stdout (probably the right thing to do), or to /dev/null,
> > or is it closed (eek!)?
> >
>
> None of the above :) Currently it's inherited from the parent.
> Unfortunately it's not easy to tie stderr and stdout to the same pipe -
> createProcess can't do that, and readProcess is defined in terms of it.

It would be useful if they were tied, but not essential - it's still a
big improvement over the current situation.

> > It would be useful if there was a readProcess variant that gave back a
> > String each for stdout and stderr.
>
> Would it be reasonable for that to be the only variant?

If you are implementing this function strictly, then that should be
sufficient. If it were lazy you'd probably want three variants:

1) Only return stdout, and dump stderr onto the normal stderr.
2) Return both stdout and stderr separately.
3) Tie stdout and stderr.

I guess people who want laziness can implement it themselves directly,
taking care to get whatever laziness it is that they want.

Thanks

Neil

From duncan.coutts at worc.ox.ac.uk Tue Apr 22 19:01:24 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Tue Apr 22 18:56:17 2008
Subject: Proposal: overhaul System.Process
In-Reply-To: <480E6478.5040209@gmail.com>
References: <480E6478.5040209@gmail.com>
Message-ID: <1208905284.27748.126.camel@localhost>

On Tue, 2008-04-22 at 15:19 -0700, Simon Marlow wrote:
> I've made some improvements to System.Process that I'd like to get
> feedback on. Everything so far is backwards compatible in the sense
> that I've only added to the API - everything that was there before is
> still available, with the same semantics (except where bugs have been
> fixed).
>
> Haddock for the proposed new System.Process:
>
> http://darcs.haskell.org/~simonmar/process/System-Process.html

Looks good.

> Summary of changes:
>
> Tue Apr 22 15:02:16 PDT 2008 Simon Marlow
>   * Overhaul System.Process
>
>   - fix #1780: pipes created by runInteractiveProcess are set
>     close-on-exec by default
>
>   - add a new, more general, form of process creation: createProcess
>     Each of stdin, stdout and stderr may individually be taken
>     from existing Handles or attached to new pipes. Also it
>     has a nicer API.

Yay!

>   - add readProcess from Don Stewart's newpopen package. This
>     function behaves like C's popen().

I'll double check that we can use this in Cabal where we currently have
to implement something similar using #ifdef, doing it differently for
ghc vs nhc/hugs due to different compilers implementing different apis
and the ghc api not being usable without pre-emptive threads (iirc).

Our current function is

:: FilePath -> [String] -> IO (String, ExitCode)

So that connects stdin to /dev/null, I expect we can implement that in
terms of the new createProcess.

>   - Move System.Cmd.{system,rawSystem} into System.Process. Later
>     we can deprecate System.Cmd.

Do you suppose we can rename the system/rawSystem given that we're
already moving them from one module to another?

Just off the top of my head, how about "runShellCommand" & "runProgram",
better suggestions welcome.

>   - Don't use O_NONBLOCK for pipes, as it can confuse the process
>     attached to the pipe (requires a fix to GHC.Handle in the base
>     package).
>
>   - move the tests from the GHC testsuite into the package itself,
>     and add a couple more
>
>   - bump the version to 2.0

From duncan.coutts at worc.ox.ac.uk Tue Apr 22 19:07:12 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Tue Apr 22 19:02:06 2008
Subject: Proposal: overhaul System.Process
In-Reply-To: <480E6A87.4030702@gmail.com>
References: <480E6478.5040209@gmail.com>
 <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com>
 <480E683E.10703@serpentine.com> <480E6A87.4030702@gmail.com>
Message-ID: <1208905632.27748.128.camel@localhost>

On Tue, 2008-04-22 at 15:45 -0700, Simon Marlow wrote:

> > What happens to stderr with this function, by the way?
> > Is it tied to stdout (probably the right thing to do), or to /dev/null,
> > or is it closed (eek!)?
>
> None of the above :) Currently it's inherited from the parent.
> Unfortunately it's not easy to tie stderr and stdout to the same pipe -
> createProcess can't do that, and readProcess is defined in terms of it.

I don't understand the restriction.
What if we just pass stdout as the handle to use for stdout and stderr. The types say that's possible, so what would go wrong? Duncan From duncan.coutts at worc.ox.ac.uk Tue Apr 22 19:08:43 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Tue Apr 22 19:03:38 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <20080422225217.GJ27147@scytale.galois.com> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> Message-ID: <1208905723.27748.131.camel@localhost> On Tue, 2008-04-22 at 15:52 -0700, Don Stewart wrote: > duncan.coutts: > > > > I would have thought (ExitCode,String) was more appropriate. > > > > > > Yes, definitely. > > > > Yes, I mentioned this to Don previously when he published his popen > > code. I think he agreed. > I'd changed, but not pushed out, process-light: > readProcess :: FilePath -- ^ command to run > -> [String] -- ^ any arguments > -> String -- ^ standard input > -> IO (Either (ExitCode,String) String) -- ^ either the stdout, or an exitcode and any output You don't need the Either. ExitCode already covers the case when the process terminates successfully. Duncan From dons at galois.com Tue Apr 22 19:09:56 2008 From: dons at galois.com (Don Stewart) Date: Tue Apr 22 19:04:50 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <1208905723.27748.131.camel@localhost> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> Message-ID: <20080422230956.GK27147@scytale.galois.com> duncan.coutts: > > On Tue, 2008-04-22 at 15:52 -0700, Don Stewart wrote: > > duncan.coutts: > > > > > > I would have thought (ExitCode,String) was more appropriate. 
> > > > > > > > Yes, definitely. > > > > > > Yes, I mentioned this to Don previously when he published his popen > > > code. I think he agreed. > > > I'd changed, but not pushed out, process-light: > > > readProcess :: FilePath -- ^ command to run > > -> [String] -- ^ any arguments > > -> String -- ^ standard input > > -> IO (Either (ExitCode,String) String) -- ^ either the stdout, or an exitcode and any output > > You don't need the Either. ExitCode already covers the case when the > process terminates successfully. > But we want to force people to check the failure case. Just returning the tuple doesn't help there. -- Don From marlowsd at gmail.com Tue Apr 22 19:25:14 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Tue Apr 22 19:20:07 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <1208905632.27748.128.camel@localhost> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <480E6A87.4030702@gmail.com> <1208905632.27748.128.camel@localhost> Message-ID: <480E73DA.8010805@gmail.com> Duncan Coutts wrote: > On Tue, 2008-04-22 at 15:45 -0700, Simon Marlow wrote: > >>> What happens to stderr with this function, by the way? >>> Is it tied to stdout (probably the right thing to do), or to /dev/null, >>> or is it closed (eek!)? >> None of the above :) Currently it's inherited from the parent. >> Unfortunately it's not easy to tie stderr and stdout to the same pipe - >> createProcess can't do that, and readProcess is defined in terms of it. > > I don't understand the restriction. What if we just pass stdout as the > handle to use for stdout and stderr. The types say that's possible, so > what would go wrong? Yes that's possible, but what you can't do is create a new pipe and attach both stdout and stderr to it. 
Cheers, Simon From duncan.coutts at worc.ox.ac.uk Tue Apr 22 19:28:44 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Tue Apr 22 19:23:37 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <20080422230956.GK27147@scytale.galois.com> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> Message-ID: <1208906924.27748.148.camel@localhost> On Tue, 2008-04-22 at 16:09 -0700, Don Stewart wrote: > duncan.coutts: > > > readProcess :: FilePath -- ^ command to run > > > -> [String] -- ^ any arguments > > > -> String -- ^ standard input > > > -> IO (Either (ExitCode,String) String) -- ^ either the stdout, or an exitcode and any output > > > > You don't need the Either. ExitCode already covers the case when the > > process terminates successfully. > > > > But we want to force people to check the failure case. Just returning > the tuple doesn't help there. In Cabal we have two versions: usualConvenientVersion :: FilePath -> [String] -> IO String moreGeneralVersion :: FilePath -> [String] -> IO (String, ExitCode) The first version - that we expect to use most often - just throws an exception if the exit code is non-0. In our experience this is almost always the right thing to do. There is only one place in Cabal where we expect the command to fail but we need the output anyway. We previously had all our process functions return an ExitCode and they were routinely ignored. I suppose the fact that with readProcess people will be interested in the result does help the situation. 
Duncan From marlowsd at gmail.com Tue Apr 22 19:35:22 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Tue Apr 22 19:30:12 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <1208905284.27748.126.camel@localhost> References: <480E6478.5040209@gmail.com> <1208905284.27748.126.camel@localhost> Message-ID: <480E763A.6080201@gmail.com> Duncan Coutts wrote: > Do you suppose we can rename the system/rawSystem given that we're > already moving them from one module to another? > > Just off the top of my head, how about "runShellCommand" & "runProgram", > better suggestions welcome. Well, ideally we'd do a complete renaming sweep, e.g. runProcess should be spawnProcess (or just removed entirely), then we could use runProcess for what is currently called rawSystem. But I've got enough flak for changing APIs in the past so I wimped out this time :-) runShellCommand for system is not good, because we already have runCommand which is the same except that it doesn't wait for completion. Something like runProcessAndWait would make sense, perhaps, but that's a mouthful. Cheers, Simon From john at repetae.net Tue Apr 22 22:41:36 2008 From: john at repetae.net (John Meacham) Date: Tue Apr 22 22:36:25 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <20080422230956.GK27147@scytale.galois.com> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> Message-ID: <20080423024136.GU17560@sliver.repetae.net> On Tue, Apr 22, 2008 at 04:09:56PM -0700, Don Stewart wrote: > duncan.coutts: > > > > On Tue, 2008-04-22 at 15:52 -0700, Don Stewart wrote: > > > duncan.coutts: > > > > > > > > I would have thought (ExitCode,String) was more appropriate. > > > > > > > > > > Yes, definitely. 
> > > > > > > > Yes, I mentioned this to Don previously when he published his popen > > > > code. I think he agreed. > > > > > I'd changed, but not pushed out, process-light: > > > > > readProcess :: FilePath -- ^ command to run > > > -> [String] -- ^ any arguments > > > -> String -- ^ standard input > > > -> IO (Either (ExitCode,String) String) -- ^ either the stdout, or an exitcode and any output > > > > You don't need the Either. ExitCode already covers the case when the > > process terminates successfully. > > > > But we want to force people to check the failure case. Just returning > the tuple doesn't help there. But it is much more elegant, cleaner code, and more in line with the underlying semantics. A general library API shouldn't force complexity on its users. Also, it has two redundant cases (Left (ExitSuccess,out)) and (Right out), which is a far worse bug in an API. Also, it means you can't have lazy output. since you won't know the error code until the process has finished completely. John -- John Meacham - ?repetae.net?john? From dons at galois.com Tue Apr 22 22:45:38 2008 From: dons at galois.com (Don Stewart) Date: Tue Apr 22 22:40:30 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <20080423024136.GU17560@sliver.repetae.net> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> Message-ID: <20080423024538.GA18247@scytale.galois.com> john: > On Tue, Apr 22, 2008 at 04:09:56PM -0700, Don Stewart wrote: > > duncan.coutts: > > > > > > On Tue, 2008-04-22 at 15:52 -0700, Don Stewart wrote: > > > > duncan.coutts: > > > > > > > > > > I would have thought (ExitCode,String) was more appropriate. > > > > > > > > > > > > Yes, definitely. 
> > > > > > > > > > Yes, I mentioned this to Don previously when he published his popen > > > > > code. I think he agreed. > > > > > > > I'd changed, but not pushed out, process-light: > > > > > > > readProcess :: FilePath -- ^ command to run > > > > -> [String] -- ^ any arguments > > > > -> String -- ^ standard input > > > > -> IO (Either (ExitCode,String) String) -- ^ either the stdout, or an exitcode and any output > > > > > > You don't need the Either. ExitCode already covers the case when the > > > process terminates successfully. > > > > > > > But we want to force people to check the failure case. Just returning > > the tuple doesn't help there. > > But it is much more elegant, cleaner code, and more in line with the > underlying semantics. A general library API shouldn't force > complexity on its users. Also, it has two redundant cases (Left > (ExitSuccess,out)) and (Right out), which is a far worse bug in an API. > > Also, it means you can't have lazy output. since you won't know the > error code until the process has finished completely. > Yeah, this was originally written for lambdabot, where lazy output just wasn't an option -- and possibly dangerous. Finding a good type that encourages the kind of "correctness" approach to handling errors that we like in Haskell would be good though -- if we can improve safety, cheaply, let's do it! 
-- Don From john at repetae.net Tue Apr 22 22:54:04 2008 From: john at repetae.net (John Meacham) Date: Tue Apr 22 22:48:54 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <20080423024538.GA18247@scytale.galois.com> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> Message-ID: <20080423025404.GV17560@sliver.repetae.net> On Tue, Apr 22, 2008 at 07:45:38PM -0700, Don Stewart wrote: > > But it is much more elegant, cleaner code, and more in line with the > > underlying semantics. A general library API shouldn't force > > complexity on its users. Also, it has two redundant cases (Left > > (ExitSuccess,out)) and (Right out), which is a far worse bug in an API. > > > > Also, it means you can't have lazy output. since you won't know the > > error code until the process has finished completely. > > > > Yeah, this was originally written for lambdabot, where lazy output just > wasn't an option -- and possibly dangerous. > > Finding a good type that encourages the kind of "correctness" approach > to handling errors that we like in Haskell would be good though -- > if we can improve safety, cheaply, let's do it! It seems more verbose and ambiguous than safe, because now you have to look up the documentation to figure out what the difference between (Left (ExitSuccess,s)) and (Right s) is, taking up precious, precious mindspace to remembering it and introducing another place a bug can be introduced. Code clarity does a lot more for correctness (and debugability) than dubious measures to improve some idea of safety. John -- John Meacham - ?repetae.net?john? 
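
The redundancy John describes can be made concrete with a small sketch. The type synonyms and the names toTuple and fromTuple below are hypothetical and appear nowhere in the thread; only ExitCode comes from the standard System.Exit module. The sketch shows that collapsing the Either shape into the tuple shape sends two different values to the same result.

```haskell
import System.Exit (ExitCode (..))

-- Result shape from the process-light draft quoted in this thread.
type EitherResult = Either (ExitCode, String) String

-- Result shape suggested by Neil, Bryan, Duncan and John.
type TupleResult = (ExitCode, String)

-- Collapse the Either shape into the tuple shape. Both Right out and
-- Left (ExitSuccess, out) land on (ExitSuccess, out), which is the
-- redundancy pointed out above.
toTuple :: EitherResult -> TupleResult
toTuple (Right out)        = (ExitSuccess, out)
toTuple (Left (code, out)) = (code, out)

-- Go back the other way, always picking Right for a successful exit,
-- so Left (ExitSuccess, _) is never produced.
fromTuple :: TupleResult -> EitherResult
fromTuple (ExitSuccess, out) = Right out
fromTuple (code, out)        = Left (code, out)
```

Under these assumptions the tuple has exactly one value per outcome, whereas a caller matching on the Either shape still has to decide what Left (ExitSuccess, out) would mean.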
From bos at serpentine.com Tue Apr 22 23:40:09 2008 From: bos at serpentine.com (Bryan O'Sullivan) Date: Tue Apr 22 23:34:59 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <1208906924.27748.148.camel@localhost> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <1208906924.27748.148.camel@localhost> Message-ID: <480EAF99.7040801@serpentine.com> Duncan Coutts wrote: > In Cabal we have two versions: > > usualConvenientVersion :: FilePath -> [String] -> IO String > > moreGeneralVersion :: FilePath -> [String] -> IO (String, ExitCode) I have written those two functions perhaps twenty times, with the semantics that Duncan describes, in maybe four different languages. It would indeed be very nice to have them in the standard toolbox :-) References: <480E6478.5040209@gmail.com> <1208905284.27748.126.camel@localhost> <480E763A.6080201@gmail.com> Message-ID: <1208942866.27748.162.camel@localhost> On Tue, 2008-04-22 at 16:35 -0700, Simon Marlow wrote: > Duncan Coutts wrote: > > > Do you suppose we can rename the system/rawSystem given that we're > > already moving them from one module to another? > > > > Just off the top of my head, how about "runShellCommand" & "runProgram", > > better suggestions welcome. > > Well, ideally we'd do a complete renaming sweep, e.g. runProcess should > be spawnProcess (or just removed entirely), then we could use runProcess > for what is currently called rawSystem. But I've got enough flak for > changing APIs in the past so I wimped out this time :-) Ah but this isn't a change, it's a new api, so we have complete freedom. We're adding a new replacement for system/rawSystem and deprecating the old module. 
Duncan From droundy at darcs.net Wed Apr 23 10:50:24 2008 From: droundy at darcs.net (David Roundy) Date: Wed Apr 23 10:45:13 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <20080423025404.GV17560@sliver.repetae.net> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> <20080423025404.GV17560@sliver.repetae.net> Message-ID: <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> On Tue, Apr 22, 2008 at 7:54 PM, John Meacham wrote: > On Tue, Apr 22, 2008 at 07:45:38PM -0700, Don Stewart wrote: > > Finding a good type that encourages the kind of "correctness" approach > > to handling errors that we like in Haskell would be good though -- > > if we can improve safety, cheaply, let's do it! > > It seems more verbose and ambiguous than safe, because now you have to > look up the documentation to figure out what the difference between > (Left (ExitSuccess,s)) and (Right s) is, taking up precious, precious > mindspace to remembering it and introducing another place a bug can be > introduced. Code clarity does a lot more for correctness (and > debugability) than dubious measures to improve some idea of safety. Personally, I'd rather have a version that just throws an exception when the exit code is non-zero. As Duncan mentioned, this is usually what you want to do. Given that the IO monad already has pretty nice (and flexible) error handling, and that this is only a convenience function, which is easily implemented in terms of createProcess, it seems like we should make it actually be convenient. 
Using Either for error handling means that we can't use this for "simple" cases where the right thing is to fail when the function fails. Using a tuple as the output means that for "simple" cases, folks will almost always do the wrong thing, which is to ignore errors. David From marlowsd at gmail.com Wed Apr 23 14:29:42 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Wed Apr 23 14:24:40 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> <20080423025404.GV17560@sliver.repetae.net> <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> Message-ID: <480F8016.7040807@gmail.com> David Roundy wrote: > On Tue, Apr 22, 2008 at 7:54 PM, John Meacham wrote: >> On Tue, Apr 22, 2008 at 07:45:38PM -0700, Don Stewart wrote: >> > Finding a good type that encourages the kind of "correctness" approach >> > to handling errors that we like in Haskell would be good though -- >> > if we can improve safety, cheaply, let's do it! >> >> It seems more verbose and ambiguous than safe, because now you have to >> look up the documentation to figure out what the difference between >> (Left (ExitSuccess,s)) and (Right s) is, taking up precious, precious >> mindspace to remembering it and introducing another place a bug can be >> introduced. Code clarity does a lot more for correctness (and >> debugability) than dubious measures to improve some idea of safety. > > Personally, I'd rather have a version that just throws an exception > when the exit code is non-zero. As Duncan mentioned, this is usually > what you want to do. 
Given that the IO monad already has pretty nice > (and flexible) error handling, and that this is only a convenience > function, which is easily implemented in terms of createProcess, it > seems like we should make it actually be convenient. Using Either for > error handling means that we can't use this for "simple" cases where > the right thing is to fail when the function fails. Using a tuple as > the output means that for "simple" cases, folks will almost always do > the wrong thing, which is to ignore errors.

Ok, here's the new proposal.

  readProcess
    :: FilePath   -- ^ command to run
    -> [String]   -- ^ any arguments
    -> String     -- ^ standard input
    -> IO String  -- ^ stdout + stderr

  readProcessMayFail
    :: FilePath              -- ^ command to run
    -> [String]              -- ^ any arguments
    -> String                -- ^ standard input
    -> IO (ExitCode,String)  -- ^ exitcode, and stdout + stderr

It turns out to be dead easy to bind stderr and stdout to the same pipe. After a couple of minor tweaks the following now works:

  createProcess (proc cmd args){ std_out = CreatePipe, std_err = UseHandle stdout }

So now we have:

  Prelude System.Process> readProcessMayFail "ls" ["/foo"] ""
  (ExitFailure 2,"ls: /foo: No such file or directory\n")
  Prelude System.Process> readProcess "ls" ["/foo"] ""
  *** Exception: readProcess: ls: failed

Look ok?

Incidentally, for those that know of such things, should readProcess do the same signal management that system currently does? That is, ignore SIGINT and SIGQUIT in the parent and restore them to the default in the child?
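The shared-pipe idea in the proposal above can be sketched against the `process` package as it was later released. This is only an illustration under assumptions, not the code from the proposal: `readMerged` is a hypothetical name, standard input is left out for brevity, and it relies on `createPipe` from `System.Process` (process 1.2.1 or newer) on a POSIX-like system.

```haskell
import Control.Exception (evaluate)
import System.Exit (ExitCode (..))
import System.IO (hGetContents)
import System.Process

-- | Hypothetical helper (not part of the proposal): run a command with
-- stdout and stderr sharing one pipe, and return the exit code together
-- with the interleaved output.
readMerged :: FilePath -> [String] -> IO (ExitCode, String)
readMerged cmd args = do
  (readEnd, writeEnd) <- createPipe
  -- Both child streams get the write end. createProcess closes handles
  -- passed via UseHandle in the parent, so only the child keeps the
  -- write end open and the read end sees EOF once the child exits.
  (_, _, _, ph) <- createProcess (proc cmd args)
    { std_out = UseHandle writeEnd
    , std_err = UseHandle writeEnd
    }
  out <- hGetContents readEnd
  _ <- evaluate (length out)  -- force the lazy read before waiting
  code <- waitForProcess ph
  return (code, out)
```

Because the two streams share one pipe, the caller cannot tell afterwards which parts of the output came from stderr; that is the cost of this design.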
Cheers, Simon From bos at serpentine.com Wed Apr 23 14:55:36 2008 From: bos at serpentine.com (Bryan O'Sullivan) Date: Wed Apr 23 14:50:24 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <480F8016.7040807@gmail.com> References: <480E6478.5040209@gmail.com> <404396ef0804221529t57363ae0qe5f2fd99855c5d96@mail.gmail.com> <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> <20080423025404.GV17560@sliver.repetae.net> <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> <480F8016.7040807@gmail.com> Message-ID: <480F8628.2040306@serpentine.com> Simon Marlow wrote: > Incedentally, for those that know of such things, should readProcess do > the same signal management that system currently does? That is, ignore > SIGINT and SIGQUIT in the parent and restore them to the default in the > child? Why does system do that in the first place? Are we not calling the underlying platform's system(3)? Since these functions are supposed to be similar to popen(3), they shouldn't touch signals. The POSIX.2 rationale explicitly states that popen implementations that mess with the parent's signals while waiting for the child are non-conforming. 
References: <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> <20080423025404.GV17560@sliver.repetae.net> <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> <480F8016.7040807@gmail.com> Message-ID: <20080423185847.GC8763@darcs.net> On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote: > So now we have: > > Prelude System.Process> readProcessMayFail "ls" ["/foo"] "" > (ExitFailure 2,"ls: /foo: No such file or directory\n") > Prelude System.Process> readProcess "ls" ["/foo"] "" > *** Exception: readProcess: ls: failed > > Look ok? Looks fine as an API. As an implementation, I'd prefer for the exception thrown to include stderr (and wouldn't mind if the output didn't include stderr). It'd be much nicer if we had: Prelude System.Process> readProcess "ls" ["/foo"] "" *** Exception: readProcess: ls /foo: No such file or directory This would mean that correct programs could use readProcess without sacrificing nice feedback when something unusual happens. Of course, we can't guarantee that stderr will give any hint as to what went wrong, but that's not our bug. We could also potentially include both stdout and stderr, or just the last few lines of the stdout/stderr combination. But it'd be nice to be able to use readProcess rather than being forced to write our own in order to give better error messages to our users. 
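A rough sketch of the behaviour requested above, written against the `process` package as later released, where `readProcessWithExitCode` returns stdout and stderr separately as a triple. The names `readProcessOrThrow`, `failureMessage` and `lastLines`, and the choice of keeping the last five lines of stderr, are assumptions made for illustration only.

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Keep only the last n lines of some captured output.
lastLines :: Int -> String -> String
lastLines n = unlines . reverse . take n . reverse . lines

-- | Build the failure message: command, arguments, exit code, and the
-- tail of stderr. (Hypothetical format, not the library's.)
failureMessage :: FilePath -> [String] -> Int -> String -> String
failureMessage cmd args code err =
  "readProcess: " ++ unwords (cmd : args)
    ++ " (exit " ++ show code ++ "): " ++ lastLines 5 err

-- | Return stdout on success; on a non-zero exit code, throw an
-- IOError whose message carries the tail of stderr.
readProcessOrThrow :: FilePath -> [String] -> String -> IO String
readProcessOrThrow cmd args input = do
  (code, out, err) <- readProcessWithExitCode cmd args input
  case code of
    ExitSuccess   -> return out
    ExitFailure n -> ioError (userError (failureMessage cmd args n err))
```

Truncating to a few lines is one way to keep the message bounded when the child writes a lot to stderr; whether the tail is the most useful part is a judgment call.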
-- David Roundy Department of Physics Oregon State University From marlowsd at gmail.com Wed Apr 23 16:27:24 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Wed Apr 23 16:22:13 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <20080423185847.GC8763@darcs.net> References: <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> <20080423025404.GV17560@sliver.repetae.net> <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> <480F8016.7040807@gmail.com> <20080423185847.GC8763@darcs.net> Message-ID: <480F9BAC.1010003@gmail.com> David Roundy wrote: > On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote: >> So now we have: >> >> Prelude System.Process> readProcessMayFail "ls" ["/foo"] "" >> (ExitFailure 2,"ls: /foo: No such file or directory\n") >> Prelude System.Process> readProcess "ls" ["/foo"] "" >> *** Exception: readProcess: ls: failed >> >> Look ok? > > Looks fine as an API. As an implementation, I'd prefer for the exception > thrown to include stderr (and wouldn't mind if the output didn't include > stderr). It'd be much nicer if we had: > > Prelude System.Process> readProcess "ls" ["/foo"] "" > *** Exception: readProcess: ls /foo: No such file or directory > > This would mean that correct programs could use readProcess without > sacrificing nice feedback when something unusual happens. Of course, we > can't guarantee that stderr will give any hint as to what went wrong, but > that's not our bug. We could also potentially include both stdout and > stderr, or just the last few lines of the stdout/stderr combination. Yes, there are a couple of problems here: 1. stdout and stderr are tied together, so we don't know which parts of the output are stderr. 2. 
the output might be multi-line, and it's not clear how much or which parts to include. The easy answer is just "include it all", but then the error messages could get arbitrarily long and potentially include a lot of superfluous information. However, I can certainly include the arguments and the exit code in the exception, which I'm not currently doing. Cheers, Simon From droundy at darcs.net Wed Apr 23 16:36:51 2008 From: droundy at darcs.net (David Roundy) Date: Wed Apr 23 16:31:41 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <480F9BAC.1010003@gmail.com> References: <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> <20080423025404.GV17560@sliver.repetae.net> <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> <480F8016.7040807@gmail.com> <20080423185847.GC8763@darcs.net> <480F9BAC.1010003@gmail.com> Message-ID: <20080423203651.GK8763@darcs.net> On Wed, Apr 23, 2008 at 01:27:24PM -0700, Simon Marlow wrote: > David Roundy wrote: > >On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote: > >>So now we have: > >> > >>Prelude System.Process> readProcessMayFail "ls" ["/foo"] "" > >>(ExitFailure 2,"ls: /foo: No such file or directory\n") > >>Prelude System.Process> readProcess "ls" ["/foo"] "" > >>*** Exception: readProcess: ls: failed > >> > >>Look ok? > > > >Looks fine as an API. As an implementation, I'd prefer for the exception > >thrown to include stderr (and wouldn't mind if the output didn't include > >stderr). It'd be much nicer if we had: > > > >Prelude System.Process> readProcess "ls" ["/foo"] "" > >*** Exception: readProcess: ls /foo: No such file or directory > > > >This would mean that correct programs could use readProcess without > >sacrificing nice feedback when something unusual happens. 
Of course, we > >can't guarantee that stderr will give any hint as to what went wrong, but > >that's not our bug. We could also potentially include both stdout and > >stderr, or just the last few lines of the stdout/stderr combination. > > Yes, there are a couple of problems here: > > 1. stdout and stderr are tied together, so we don't know which parts > of the output are stderr. > > 2. the output might be multi-line, and it's not clear how much or > which parts to include. > > The easy answer is just "include it all", but then the error messages > could get arbitrarily long and potentially include a lot of superfluous > information. > > However, I can certainly include the arguments and the exit code in the > exception, which I'm not currently doing. Why not then leave the stderr out of the output, and just print it to stderr? It's the standard location to send error output, and I'd hate to lose it. -- David Roundy Department of Physics Oregon State University From john at repetae.net Wed Apr 23 17:25:59 2008 From: john at repetae.net (John Meacham) Date: Wed Apr 23 17:20:46 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <480F8016.7040807@gmail.com> References: <480E683E.10703@serpentine.com> <1208904582.27748.117.camel@localhost> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> <20080423025404.GV17560@sliver.repetae.net> <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> <480F8016.7040807@gmail.com> Message-ID: <20080423212559.GB17560@sliver.repetae.net> On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote: > Ok, here's the new proposal. 
> > readProcess > :: FilePath -- ^ command to run > -> [String] -- ^ any arguments > -> String -- ^ standard input > -> IO String -- ^ stdout + stderr > > readProcessMayFail > :: FilePath -- ^ command to run > -> [String] -- ^ any arguments > -> String -- ^ standard input > -> IO (ExitCode,String) -- ^ exitcode, and stdout + stderr MayFail seems to be attached to the wrong one here. 'readProcess' is the one that might fail, the second actual call always succeeds but returns an error code. I think readProcessWithExitCode is better. John -- John Meacham - ?repetae.net?john? From marlowsd at gmail.com Thu Apr 24 12:11:23 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Thu Apr 24 12:06:09 2008 Subject: Proposal: overhaul System.Process In-Reply-To: <20080423212559.GB17560@sliver.repetae.net> References: <480E683E.10703@serpentine.com> <20080422225217.GJ27147@scytale.galois.com> <1208905723.27748.131.camel@localhost> <20080422230956.GK27147@scytale.galois.com> <20080423024136.GU17560@sliver.repetae.net> <20080423024538.GA18247@scytale.galois.com> <20080423025404.GV17560@sliver.repetae.net> <117f2cc80804230750t64f751c0w452df87ee513ecef@mail.gmail.com> <480F8016.7040807@gmail.com> <20080423212559.GB17560@sliver.repetae.net> Message-ID: On 23/04/2008, John Meacham wrote: > > On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote: > > > Ok, here's the new proposal. > > > > readProcess > > :: FilePath -- ^ command to run > > -> [String] -- ^ any arguments > > -> String -- ^ standard input > > -> IO String -- ^ stdout + stderr > > > > readProcessMayFail > > :: FilePath -- ^ command to run > > -> [String] -- ^ any arguments > > -> String -- ^ standard input > > -> IO (ExitCode,String) -- ^ exitcode, and stdout + stderr > > > MayFail seems to be attached to the wrong one here. 'readProcess' is the > one that might fail, the second actual call always succeeds but returns an > error code. I think readProcessWithExitCode is better. 
Yes, well, the idea was that you would use readProcessMayFail when you are anticipating that the process might fail. Still, I like your suggestion of readProcessWithExitCode better, so I'll go with that.

Cheers, Simon

From marlowsd at gmail.com Thu Apr 24 13:22:45 2008 From: marlowsd at gmail.com (Simon Marlow) Date: Thu Apr 24 13:18:08 2008 Subject: Data.Map improvement patch In-Reply-To: <480D22F6.3050301@iee.org> References: <20080421073842.GA24983@atrey.karlin.mff.cuni.cz> <480D22F6.3050301@iee.org> Message-ID: <4810C1E5.3000009@gmail.com>

Adrian Hey wrote:
> Milan Straka wrote:
>
>> Any comments are welcome,
>
> Maybe just trash the lot and use AVL trees instead?
>
> :-)

Sure... but I think you forgot to attach the patch to that message ;-)

Cheers, Simon

From duncan.coutts at worc.ox.ac.uk Fri Apr 25 05:39:11 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Fri Apr 25 05:33:48 2008 Subject: darcs patch: Re-export (>>>), (<<<) from Control.Arrow for compatab... Message-ID: <20080425093344.ACC67324133@www.haskell.org>

It was reported in #haskell that xmonad-contrib does not compile with ghc HEAD. We tracked it down to (>>>) no longer being exported from Control.Arrow. It is now a member of the Category superclass and defined in Control.Category but it is no longer exported from Control.Arrow.

It seems that any code that does not define its own arrow instances would still be able to work unchanged if (>>>) and (<<<) were just to be re-exported from Control.Arrow. So here's a completely untested patch to do that.

Duncan

Fri Apr 25 10:11:13 BST 2008 Duncan Coutts
  * Re-export (>>>), (<<<) from Control.Arrow for compatability

    They're now defined in Control.Category but all existing code expects them
    to be exported from Control.Arrow. For example, this should unbreak
    xmonad-contrib.
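For illustration, this is the kind of client code that depends on the re-export: it uses the composition operators on plain functions and imports them only from Control.Arrow, without defining any arrow instances. The function names are made up for the example; it is a sketch of the use case, not code from xmonad-contrib.

```haskell
import Control.Arrow ((<<<), (>>>))

-- | Left-to-right composition: add one, then render as text.
incrementThenShow :: Int -> String
incrementThenShow = (+ 1) >>> show

-- | The same pipeline written right-to-left.
incrementThenShow' :: Int -> String
incrementThenShow' = show <<< (+ 1)
```

With the operators defined in Control.Category and re-exported from Control.Arrow, a module like this compiles unchanged.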
-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/x-darcs-patch Size: 14777 bytes Desc: A darcs patch for your repository! Url : http://www.haskell.org/pipermail/libraries/attachments/20080425/8836d427/attachment.bin From kolmodin at gentoo.org Sat Apr 26 01:47:35 2008 From: kolmodin at gentoo.org (Lennart Kolmodin) Date: Sat Apr 26 01:42:16 2008 Subject: darcs patch: Change LICENSE from BSD4 to BSD3 Message-ID: <20080426054735.3607782E7@atum.ita.chalmers.se> Sat Apr 26 07:58:06 CEST 2008 Lennart Kolmodin * Change LICENSE from BSD4 to BSD3 Picking BSD4 over BSD3 is usually by confusion. See http://hackage.haskell.org/trac/hackage/ticket/205. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/x-darcs-patch Size: 3136 bytes Desc: A darcs patch for your repository! Url : http://www.haskell.org/pipermail/libraries/attachments/20080426/6346db16/attachment-0001.bin From david.maciver at gmail.com Sun Apr 27 13:32:57 2008 From: david.maciver at gmail.com (David MacIver) Date: Sun Apr 27 13:27:34 2008 Subject: Issue with package "pretty" Message-ID: #haskell have just helped me figure out an entertaining issue with the pretty library on hackage. I couldn't figure out the right place to file a bug report and the maintainer is listed as libraries@haskell.org so I'm emailing the list in the hopes that someone will know what to do about it. :-) The problem is that if you install pretty, Cabal will suddenly cease working until you recompile it. It depends on a version of pretty which gets bundled with GHC is purportedly of the same version as the one on hackage, but the two aren't binary compatible. So when it tries to use it, everything goes boom. 
From igloo at earth.li Sun Apr 27 13:47:35 2008 From: igloo at earth.li (Ian Lynagh) Date: Sun Apr 27 13:42:12 2008 Subject: Issue with package "pretty" In-Reply-To: References: Message-ID: <20080427174735.GA31679@matrix.chaos.earth.li> On Sun, Apr 27, 2008 at 06:32:57PM +0100, David MacIver wrote: > > The problem is that if you install pretty, Cabal will suddenly cease > working until you recompile it. It depends on a version of pretty > which gets bundled with GHC is purportedly of the same version as the > one on hackage, but the two aren't binary compatible. So when it tries > to use it, everything goes boom. If you compile with different flags then the results will not necessarily be binary compatible. I suspect that's what happened here. Thanks Ian From duncan.coutts at worc.ox.ac.uk Sun Apr 27 15:42:01 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Sun Apr 27 15:36:24 2008 Subject: Issue with package "pretty" In-Reply-To: <20080427174735.GA31679@matrix.chaos.earth.li> References: <20080427174735.GA31679@matrix.chaos.earth.li> Message-ID: <1209325321.30059.38.camel@localhost> On Sun, 2008-04-27 at 18:47 +0100, Ian Lynagh wrote: > On Sun, Apr 27, 2008 at 06:32:57PM +0100, David MacIver wrote: > > > > The problem is that if you install pretty, Cabal will suddenly cease > > working until you recompile it. It depends on a version of pretty > > which gets bundled with GHC is purportedly of the same version as the > > one on hackage, but the two aren't binary compatible. So when it tries > > to use it, everything goes boom. > > If you compile with different flags then the results will not > necessarily be binary compatible. I suspect that's what happened here. Another way to look at the problem is that the user package basically allows interposing global packages and then other global packages will get linked against the user package not the global one they were built with. 
The solution is not to identify installed packages by their name and version but by a hash that identifies the package ABI. Another hack would be to maintain a more complex notion of package overlays and have dependencies within a package db be satisfied from the same package db or the one 'below' but never the one 'above'. Duncan From bulat.ziganshin at gmail.com Sun Apr 27 15:49:01 2008 From: bulat.ziganshin at gmail.com (Bulat Ziganshin) Date: Sun Apr 27 15:48:20 2008 Subject: Issue with package "pretty" In-Reply-To: <1209325321.30059.38.camel@localhost> References: <20080427174735.GA31679@matrix.chaos.earth.li> <1209325321.30059.38.camel@localhost> Message-ID: <1064329268.20080427234901@gmail.com> Hello Duncan, Sunday, April 27, 2008, 11:42:01 PM, you wrote: >> > The problem is that if you install pretty, Cabal will suddenly cease >> > working until you recompile it. It depends on a version of pretty >> > which gets bundled with GHC is purportedly of the same version as the >> > one on hackage, but the two aren't binary compatible. So when it tries >> > to use it, everything goes boom. > The solution is not to identify installed packages by their name and > version but by a hash that identifies the package ABI. 
probably not ABI (it should be the same), but random number generated when compilation (build) occurs

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

From duncan.coutts at worc.ox.ac.uk Sun Apr 27 16:25:51 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Sun Apr 27 16:20:11 2008 Subject: Issue with package "pretty" In-Reply-To: <1064329268.20080427234901@gmail.com> References: <20080427174735.GA31679@matrix.chaos.earth.li> <1209325321.30059.38.camel@localhost> <1064329268.20080427234901@gmail.com> Message-ID: <1209327951.30059.43.camel@localhost>

On Sun, 2008-04-27 at 23:49 +0400, Bulat Ziganshin wrote:
> Hello Duncan,
>
> Sunday, April 27, 2008, 11:42:01 PM, you wrote:
>
> >> > The problem is that if you install pretty, Cabal will suddenly cease
> >> > working until you recompile it. It depends on a version of pretty
> >> > which gets bundled with GHC is purportedly of the same version as the
> >> > one on hackage, but the two aren't binary compatible. So when it tries
> >> > to use it, everything goes boom.
>
> > The solution is not to identify installed packages by their name and
> > version but by a hash that identifies the package ABI.
>
> probably not ABI (it should be the same), but random number generated
> when compilation (build) occurs

The package name and version do not identify an ABI. You change the ABI by using -O0 vs -O1.

Also, I think modeling build processes as randomness isn't the right approach. Builds are really pure functions but with a rather large number of inputs. The Nix approach is to identify them all and calculate a hash of them all and use that to identify each built package.
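As a toy model of that idea: if a build is treated as a pure function of its inputs, an installed-package identifier can be derived by hashing those inputs. Everything in this sketch is an assumption made for illustration: the `BuildInputs` record and its fields, the `installedId` name, and the FNV-1a-style hash (applied to characters rather than bytes). It is not how Nix, Cabal, or GHC compute identifiers.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl', intercalate, sort)
import Data.Word (Word64)
import Numeric (showHex)

-- | Hypothetical record of the inputs that determine a built package.
data BuildInputs = BuildInputs
  { pkgName    :: String
  , pkgVersion :: String
  , compilerId :: String
  , ghcFlags   :: [String]  -- order is kept, since flag order can matter
  , depIds     :: [String]  -- identifiers of the dependencies built against
  }

-- | FNV-1a-style fold over characters; arithmetic wraps at 64 bits.
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- | Name and version plus a hash of every other input, so two builds of
-- the same version with different inputs get different identifiers.
installedId :: BuildInputs -> String
installedId b =
  pkgName b ++ "-" ++ pkgVersion b ++ "-" ++ showHex (fnv1a key) ""
  where
    key = intercalate "\0"
            (compilerId b : ghcFlags b ++ ["--deps--"] ++ sort (depIds b))
```

Under this model, two builds of the same package version with `-O0` and `-O1` receive different identifiers, so one cannot silently stand in for the other.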
Duncan From david.maciver at gmail.com Sun Apr 27 16:29:32 2008 From: david.maciver at gmail.com (David MacIver) Date: Sun Apr 27 16:24:08 2008 Subject: Issue with package "pretty" In-Reply-To: <1209327951.30059.43.camel@localhost> References: <20080427174735.GA31679@matrix.chaos.earth.li> <1209325321.30059.38.camel@localhost> <1064329268.20080427234901@gmail.com> <1209327951.30059.43.camel@localhost> Message-ID: On Sun, Apr 27, 2008 at 9:25 PM, Duncan Coutts wrote: > > On Sun, 2008-04-27 at 23:49 +0400, Bulat Ziganshin wrote: > > Hello Duncan, > > > > Sunday, April 27, 2008, 11:42:01 PM, you wrote: > > > > >> > The problem is that if you install pretty, Cabal will suddenly cease > > >> > working until you recompile it. It depends on a version of pretty > > >> > which gets bundled with GHC is purportedly of the same version as the > > >> > one on hackage, but the two aren't binary compatible. So when it tries > > >> > to use it, everything goes boom. > > > > > The solution is not to identify installed packages by their name and > > > version but by a hash that identifies the package ABI. > > > > probably not ABI (it should be the same), but random number generated > > when compilation (build) occurs > > The package name and version do not identify an ABI. You change the ABI > by using -O0 vs -O1. > > Also, I think modeling build processes as randomness isn't the right > approach. Build are really pure functions but with a rather large number > of inputs. The Nix approach is to identify them all and calculate a hash > of them all and use that to identify each built package. The main thing I'd like to borrow from the Nix approach is the ability to back out of the package change. I'm really more bothered about the fact that it left me with a nonfunctioning build system (until someone pointed out runghc would compile everything from source) than anything else. 
:-) From duncan.coutts at worc.ox.ac.uk Sun Apr 27 16:52:21 2008 From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts) Date: Sun Apr 27 16:46:36 2008 Subject: Issue with package "pretty" In-R