From isaacdupree at charter.net Fri Aug 15 09:27:16 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Fri Aug 15 09:26:21 2008
Subject: Mutually-recursive/cyclic module imports
Message-ID: <48A58434.40607@charter.net>

Haskell-98 specifies that module import cycles work
automatically with cross-module type inference.

It has some weird interactions with defaulting and the
monomorphism restriction. In Haskell-prime we're planning
on removing artificial monomorphism, but defaulting will
still be necessary (and can still be set differently per
module).

Only JHC fully implements the recursive module imports of
Haskell-98. GHC and NYhc each have their own proprietary
"boot-files" with slightly odd semantics to allow this to work
(albeit the syntax is simple enough). Hugs doesn't support it
at all.

I propose we simplify things and lay down some rules, without
having to invent explicit module-interface signatures. Then I
wouldn't complain(:-)) that GHC doesn't have reasonable support
for cyclic modules [1][2]. (Compiler writers will have to give
feedback how plausible this is :-) -- I think GHC and NYhc
"should" be able to adapt their boot-interface-file mechanisms
to the scheme I'm proposing. (This is really more of a sketch
than a complete proposal at this stage.)

In particular, I propose an amount of annotation in a module that
*shall* make it compile. Compilers are free to accept code for
other reasons (e.g. .hs-boot files, or some official module
interfaces).

These first proposals are clean-ups that reflect how ridiculous
people think the current standard's module interface semantics are
compared to most languages. Also they make cross-module type
inference unnecessary, eliminating the defaulting problem.

namespace level:
Haskell98 says that what a module exports is determined by the
smallest fix-point of what is possible. I can't see a practical
use for this behavior, which is easily confusing.
I think that exports that depend on the result of a fix-point
should be rejected. It can be useful in module A to import a few
types/functions explicitly from a module B that then goes on to
export the whole of module A though.

type level:
Inside any given SCC (loop) of modules, any function imported from
another member of the SCC normally shall have an explicit type
signature in the module that exports it. (This doesn't seem a
great burden, since type signatures for top-level functions/values
are considered good practice anyway. Can anyone think of a
use-case where cross-module type inference would be particularly
useful?)

Exception: imports may be given the {-# SOURCE #-} pragma. This
fulfills two purposes:

(1) It is a hint to a compiler that compiles modules separately
that the current module should be compiled before the module being
imported with {-# SOURCE #-}. Obviously, this can make
optimization worse, since it's likely that SOURCE-imported
functions won't be strictness-analyzed or inlined or anything; but
that's the .hs-boot situation already. (And in principle even a
compiler that likes separate compilation could break individual
functions down into dependency order to compile them, adding
another tradeoff point...)

(2) If SOURCE pragmas "break the loop", then only functions that
are actually imported with SOURCE must be given type signatures,
even if module B then goes on to import module A wholesale:
example:
  module A where {import {-#SOURCE#-} B (bf); ...}
  module B (module A, module B) where {import A; bf :: ...; ...}

Since defining data types in logical places is an important use of
cyclic imports, I propose not to require any extra annotation for
them; the compiler will have to chase them down and understand
them in loops (how else to do it?).
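
As a rough illustration of the "SCC (loop) of modules" notion used
above, here is a small sketch that finds import cycles with
`Data.Graph.stronglyConnComp` from the containers library. The module
names and the `imports` table are invented for the example, and nothing
here is meant to describe how GHC or any other compiler actually
detects cycles.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)

-- Hypothetical import table: (module name, modules it imports).
imports :: [(String, [String])]
imports =
  [ ("A", ["B"])       -- A imports B ...
  , ("B", ["A", "C"])  -- ... and B imports A back, so {A, B} is a loop
  , ("C", [])
  ]

-- Groups of modules that import each other cyclically.
-- Modules that are not part of any loop are dropped.
importCycles :: [(String, [String])] -> [[String]]
importCycles table =
  [ ms
  | CyclicSCC ms <- stronglyConnComp [ (m, m, deps) | (m, deps) <- table ]
  ]
```

Under the proposal, only functions imported between members of such a
group (here A and B) would need explicit type signatures; C is outside
the loop and is unaffected.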
However, there are some particular things to keep in mind
regarding potential recompilation: (with a bit of a GHC bias)

Changing any orphan instances in an SCC will force the whole thing
to recompile (but what pluckiness, putting orphan instances
*there*!)

If a data type or newtype is imported without its constructors,
then the RHS changing doesn't really force a recompile.

I imagine this could work in GHC by, for each SOURCE import,
storing the MD5 of the imported interface. Then when checking if
you seriously have to recompile module A, you don't have to if none
of those MD5s have changed and none of the non-SOURCE-imported
modules' interface MD5s have either.

In module cycles that aren't explicitly broken by SOURCEs, GHC (or
any compiler) should just insert an implicit SOURCE for *all*
cyclic imports (and possibly emit a warning) (unless the compiler
wants to guess which SOURCES are better for optimization?).
Presumably compilers that can do separate as well as non-separate
compilation could take an optimization flag that tells them to
compile cycles together as one piece rather than obeying the
SOURCES for recompilation efficiency.

So what does the compiler have to look at in a SOURCE-imported
module?

In the case of the proposed SOURCE imports without hs-boot files,
GHC would move from calculating one interface(md5) per module (or
two interfaces in the case of .hs-boots), to one-per-import. I
think this is, in principle, an advantage, although it does require
more re-scanning when files are changed (only
lexer/parser/renamer/module-chaser work). For example, I've found
myself adding to .hs-boot files for the purpose of one module that
SOURCE-imports the .hs-boot, which forces the recompile of another
module that happens to depend on the .hs-boot too.
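
The per-import hash check described above might look roughly like the
following. This is only a sketch under stated assumptions: `Hash`,
`Recorded` and `needsRecompile` are hypothetical names, the hashes are
plain strings standing in for MD5 digests, and changes to the module's
own source text (which would of course also force a recompile) are left
out.

```haskell
import qualified Data.Map as Map

type ModuleName = String
type Hash       = String  -- stand-in for an MD5 digest of an interface

-- Hashes recorded when the module was last compiled: one entry per
-- import, for SOURCE and non-SOURCE imports alike.
type Recorded = Map.Map ModuleName Hash

-- A module needs recompiling if any recorded import hash differs from
-- the current one, or if an import no longer has a current hash.
needsRecompile :: Recorded -> Map.Map ModuleName Hash -> Bool
needsRecompile recorded current =
  or [ Map.lookup m current /= Just h | (m, h) <- Map.toList recorded ]
```

Because each entry is keyed by import, a change to a part of B's
interface that A never asked for need not change the hash A recorded,
which is the advantage of one-hash-per-import claimed above.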
To replicate the current GHC .hs-boot behavior (in which the
hash-recalculation is shared among SOURCE-importers), one could
replace a X.hs-boot file with an X_boot.hs file that contains:
  module X_boot (module X) where
  import {-# SOURCE #-} X (list of things exported by the old .hs-boot file)
, and in other modules, replace
  import {-# SOURCE #-} X (....)
with
  import X_boot (....)

Taking .hs-boot docs as a guide [2], the compiler must look in
SOURCE-imported modules for:

- if an import list is given explicitly, `B (....)` not `B hiding
(....)` or `B`, the export list only needs to be *checked* to make
sure it exports the requested things, not remembered. Exception:
data or class imported with `Name(..)` must remember exactly which
constructors/members were exported. It's recommended to specify
exactly what you're importing.

- function type signatures

- imports of functions, types, etc. If it's imported from outside
the SCC, it doesn't need a type signature/whatever. If it's
defined somewhere within the SCC, it generally does need a type
signature.

- fixity declarations, which only have to be imported in
conjunction with the corresponding functions/constructors/whatever

- data type / newtype declarations. When no constructor is
imported, only the *kind* of the data type needs to be recorded,
which might have to involve inference on the RHS (possibly
involving more import chasing) if there aren't explicit kind
annotations for *every* type parameter.

- type synonym declarations. The whole thing has to be imported,
including RHS.

- classes. Including superclasses, class-method signatures, and
default methods? Is there some way that GHC manages to allow not
declaring all of these in .hs-boots?

- instances, whether generated by 'deriving', 'deriving instance',
or ordinary 'instance'; everything before the "where" clause of
'instance's is relevant. But an instance is only relevant if it's
orphan, or if it goes with a data or class that's also being
imported.

- the compiler-specific RULES pragmas probably follow similar
mandates as above for instances and for the functions referenced in
the RULE.

[1] my official "complaint":
http://hackage.haskell.org/trac/ghc/ticket/1409
[2] the GHC .hs-boot docs:
http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation.html#mutual-recursion

From igloo at earth.li Fri Aug 15 09:50:22 2008
From: igloo at earth.li (Ian Lynagh)
Date: Fri Aug 15 09:49:33 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <48A58434.40607@charter.net>
References: <48A58434.40607@charter.net>
Message-ID: <20080815135022.GA8761@matrix.chaos.earth.li>

On Fri, Aug 15, 2008 at 09:27:16AM -0400, Isaac Dupree wrote:
> Haskell-98 specifies that module import cycles work
> automatically with cross-module type inference.
>
> It has some weird interactions with defaulting and the
> monomorphism restriction. In Haskell-prime we're planning
> on removing artificial monomorphism, but defaulting will
> still be necessary (and can still be set differently per
> module).

I'm not sure if defaulting actually makes this worse, but regardless, I
think we should seriously consider removing defaulting anyway:

http://hackage.haskell.org/trac/haskell-prime/wiki/Defaulting#Proposal4-removedefaulting

Thanks
Ian

From isaacdupree at charter.net Fri Aug 15 11:35:37 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Fri Aug 15 11:34:52 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <20080815135022.GA8761@matrix.chaos.earth.li>
References: <48A58434.40607@charter.net> <20080815135022.GA8761@matrix.chaos.earth.li>
Message-ID: <48A5A249.4080906@charter.net>

Ian Lynagh wrote:
> I'm not sure if defaulting actually makes this worse, but regardless, I
> think we should seriously consider removing defaulting anyway:
>
> http://hackage.haskell.org/trac/haskell-prime/wiki/Defaulting#Proposal4-removedefaulting

Oh, actually, I agree with that proposal to remove defaulting.
Maybe we should try implementing that and see how much things
break. I imagine most uses can be solved by, if nothing else,
adding local functions with more-constrained types, a bit similar
to the (^) change.

I noticed that depending on the resolution of
http://hackage.haskell.org/trac/haskell-prime/wiki/KindInference ,
we might have a different sort of defaulting that examines exactly
a whole module (which could also make it harder for my
cyclic-module proposal to avoid recompilation? not sure)

If we remove defaulting and the monomorphism restriction *and*
don't add any other per-module semantics, then we get the module
system out of the way of the semantics, which would make me very
happy!

There are a few GHC extensions that are still unfortunately
per-module -- e.g. OverlappingInstances perhaps ought to be a
notation or pragma on a class, rather than affecting all classes
that happen to be defined in the module. (Pragmas aren't supposed
to have an effect if they're not recognized; but sometimes people
put OverlappingInstances on a class not because they're planning to
make any such instances, but to allow users to define such
instances; in which case the class and stock instances really can
compile even in compilers that don't support overlapping instances)

-Isaac

From isaacdupree at charter.net Fri Aug 15 13:34:14 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Fri Aug 15 13:33:33 2008
Subject: empty case, empty definitions
Message-ID: <48A5BE16.5050608@charter.net>

There are two separate parts I propose, the second one I'm less
sure of, but they're somewhat related.

1. Allow empty case, i.e. "case some_variable of { }" (GHC ticket [1]).
This adds consistency, it always causes a pattern-match error, and
it is a sensible way to look at all the cases of types with no
constructors (recall EmptyDataDecls will probably be in Haskell'
[4]) -- especially for automatic tools (or programmers familiar
with dependent types; GADTs have some of these effects :-)).
Presumably, any time that some_variable could be non-bottom, GHC
will warn about the incomplete patterns :-).

2. When a type signature for a function is given, allow to not
define any patterns (GHC ticket [2]). The result of calling the
function is a pattern match failure (presumably the source-location
given for the match failure will be the location of the
type-signature). This can also be useful for calling functions
before implementing them, helping the type-checker help me do
incremental work (again, obviously produces a warning if the
function could possibly be non-bottom).

However I can think of a few things this (proposal 2) could
interfere with:

2.i. Implementing a class method, will you get the default if that
method has a default? Actually it turns out to be forbidden...
  class C n where
    foo :: n -> n
    foo = id
  instance C Int where
    foo :: Int -> Int
    --even if we define foo here too, it's an error:
    --misplaced type signature (perhaps thanks to improved
    --error messages, thanks simonpj! [5]).
    --Anyway, I think type signatures ought to be allowed here.
I propose to allow type-signatures in instances, which must be
equivalent to the signature in the class declaration after the
class's signature is specialized to the particular instance
type(s). If such a type-signature is found, allow the function to
be defined as normal, which includes, if there are no patterns, an
error if proposal 2 isn't adopted, and a pattern-match failure if
proposal 2 is adopted.
(also it turns out that pattern bindings aren't allowed in
instances, such as {instance C Int where (foo) = negate}, but I
can't say I have a compelling use-case for that!:-))

2.ii.
It could interfere with another feature request of mine (though
I'm not sure I want it anymore) (GHC ticket [3]) : I'd like it to
be allowed to give a (possibly more restrictive?) type signature at
the top level of a module, to a function imported unqualified.
Obviously in this case I don't want the function to be treated as
pattern-match failure; but I think we can tell the difference
because the name is in-scope in this case. Luckily there is no
negative interaction with my related proposal to simply allow
multiple equivalent type-signatures anywhere one of them is allowed
in a declaration-list.

So actually, in summary I can't really see anything wrong with
proposal 2, especially if my proposal under 2.i. is adopted.

[1] http://hackage.haskell.org/trac/ghc/ticket/2431
[2] http://hackage.haskell.org/trac/ghc/ticket/393
[3] http://hackage.haskell.org/trac/ghc/ticket/1404
[4] http://hackage.haskell.org/trac/haskell-prime/wiki/EmptyDataDecls
[5] http://hackage.haskell.org/trac/ghc/ticket/1310

From isaacdupree at charter.net Fri Aug 15 14:00:14 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Fri Aug 15 13:59:17 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <48A58434.40607@charter.net>
References: <48A58434.40607@charter.net>
Message-ID: <48A5C42E.6070305@charter.net>

Isaac Dupree wrote:
> In the case of the proposed SOURCE imports without hs-boot files, GHC
> would ...

Ah, another difference from the .hs-boot system: in my proposal,
when a file is imported with SOURCE and dependency chasing (e.g. of
data-types) is done through its imports, it won't make a difference
whether those imports have SOURCE pragmas; the compiler is in
SOURCE-mode already, and will look at .hi files if there are any
up-to-date ones available (e.g. the imported module isn't in the
SCC / import loop), and otherwise will look at the source code (if
it wanted, it could make some sort of .hi-boot out of it, I
suppose).
As opposed to the .hs-boot mechanism where .hs-boot files must
choose carefully (and perhaps differently to the corresponding .hs
file) whether their imports use SOURCE (they must if it's necessary
to prevent loops, but must not if that module doesn't have a
.hs-boot file that contains what's needed! But sometimes it
doesn't make a difference, except for recompilation!)

-Isaac

From ndmitchell at gmail.com Fri Aug 15 14:41:21 2008
From: ndmitchell at gmail.com (Neil Mitchell)
Date: Fri Aug 15 14:40:31 2008
Subject: empty case, empty definitions
In-Reply-To: <48A5BE16.5050608@charter.net>
References: <48A5BE16.5050608@charter.net>
Message-ID: <404396ef0808151141r7e642b65v4c1ea671ddc22e0a@mail.gmail.com>

Hi

> 1. Allow empty case, i.e. "case some_variable of { }" (GHC ticket [1]).
> This adds consistency, it always causes a pattern-match error, and it is a
> sensible way to look at all the cases of types with no constructors (recall
> EmptyDataDecls will probably be in Haskell' [4]) -- especially for automatic
> tools (or programmers familiar with dependent types; GADTs have some of
> these effects :-)). Presumably, any time that some_variable could be
> non-bottom, GHC will warn about the incomplete patterns :-).

Sounds good. Great for consistency and auto-generation of code.

> 2. When a type signature for a function is given, allow to not define any
> patterns (GHC ticket [2]). The result of calling the function is a pattern
> match failure (presumably the source-location given for the match failure
> will be the location of the type-signature). This can also be useful for
> calling functions before implementing them, helping the type-checker help me
> do incremental work (again, obviously produces a warning if the function
> could possibly be non-bottom).

Sounds bad.
Consider:

  gray :: Color
  grey = newColor "#ccc"

This fairly common style of bug now becomes perfectly valid Haskell,
and if you always refer to "grey", you may never even have a clue that
the bug is present.

Thanks

Neil

From lennart at augustsson.net Fri Aug 15 15:42:41 2008
From: lennart at augustsson.net (Lennart Augustsson)
Date: Fri Aug 15 15:41:50 2008
Subject: empty case, empty definitions
In-Reply-To: <48A5BE16.5050608@charter.net>
References: <48A5BE16.5050608@charter.net>
Message-ID:

I'm with Neil on this. Suggestion 1 is great, whereas suggestion 2
just makes it easier to make mistakes, and that's not what we want.

On Fri, Aug 15, 2008 at 6:34 PM, Isaac Dupree wrote:
> [...]

From isaacdupree at charter.net Fri Aug 15 18:16:38 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Fri Aug 15 18:16:02 2008
Subject: empty case, empty definitions
In-Reply-To: <404396ef0808151141r7e642b65v4c1ea671ddc22e0a@mail.gmail.com>
References: <48A5BE16.5050608@charter.net> <404396ef0808151141r7e642b65v4c1ea671ddc22e0a@mail.gmail.com>
Message-ID: <48A60046.8010806@charter.net>

Neil Mitchell wrote:
> Sounds bad. Consider:
>
> gray :: Color
> grey = newColor "#ccc"

My rationale for typos not being a problem (both your example, and
the one Malcolm Wallace posted to the "empty case" ticket) is that
GHC will give you a warning anyway (and that warning should be on
by default). Should we be worrying about the situation being worse
for other compilers that don't have good warning-systems (e.g. I
don't think Hugs has warnings at all)?

-Isaac

From cmb21 at kent.ac.uk Sat Aug 16 07:38:12 2008
From: cmb21 at kent.ac.uk (C.M.Brown)
Date: Sat Aug 16 07:37:21 2008
Subject: empty case, empty definitions
In-Reply-To: <404396ef0808151141r7e642b65v4c1ea671ddc22e0a@mail.gmail.com>
References: <48A5BE16.5050608@charter.net> <404396ef0808151141r7e642b65v4c1ea671ddc22e0a@mail.gmail.com>
Message-ID:

Hi,

> Sounds bad. Consider:
>
> gray :: Color
> grey = newColor "#ccc"
>
> This fairly common style of bug now becomes perfectly valid Haskell,
> and if you always refer to "grey", you may never even have a clue that
> the bug is present.

I think the compiler should certainly give a warning that no equations
are defined for a definition. It would be impossible to check for user
typos! :)

It does raise the question, though: why do we want to define data types
without any constructors? If we do opt for empty data declarations,
then both general pattern matching and case expressions need to be able
to cope with it for consistency.

Regards,
Chris

From duncan.coutts at worc.ox.ac.uk Sat Aug 16 13:11:29 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Sat Aug 16 13:11:10 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <48A58434.40607@charter.net>
References: <48A58434.40607@charter.net>
Message-ID: <1218906689.13639.104.camel@localhost>

On Fri, 2008-08-15 at 09:27 -0400, Isaac Dupree wrote:
> Haskell-98 specifies that module import cycles work
> automatically with cross-module type inference.

[...]

I'd very much like you to consider in any proposal like this how easy
it is to implement module dependency chasing. If the dependency chaser
has to know too much about Haskell it makes it very difficult for tools
like Cabal or hmake and we could be stuck with only ghc --make or
ghc -M.

Our plan with Cabal is to do dependency chasing which would enable
incremental and parallel rebuilds.

I'm not saying it's a problem with your proposal, I'd just like it to be
taken into account. For example do dependency chasers need to grok just
import lines and {-# SOURCE #-} pragmas or do they need to calculate
fixpoints.

Duncan

From isaacdupree at charter.net Sat Aug 16 13:51:10 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Sat Aug 16 13:50:15 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <1218906689.13639.104.camel@localhost>
References: <48A58434.40607@charter.net> <1218906689.13639.104.camel@localhost>
Message-ID: <48A7138E.50806@charter.net>

Duncan Coutts wrote:
> [...]
>
> I'm not saying it's a problem with your proposal, I'd just like it to be
> taken into account. For example do dependency chasers need to grok just
> import lines and {-# SOURCE #-} pragmas or do they need to calculate
> fixpoints.

Good point. What does the dependency chaser need to figure out?
- exactly what dependency order files must be compiled
(e.g., ghc -c) ?
- what files (e.g., .hi) are needed to be findable by the
e.g. (ghc -c) ?
- recompilation avoidance?

-Isaac

From duncan.coutts at worc.ox.ac.uk Sat Aug 16 20:28:09 2008
From: duncan.coutts at worc.ox.ac.uk (Duncan Coutts)
Date: Sun Aug 17 08:35:34 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <48A7138E.50806@charter.net>
References: <48A58434.40607@charter.net> <1218906689.13639.104.camel@localhost> <48A7138E.50806@charter.net>
Message-ID: <1218932889.13639.111.camel@localhost>

On Sat, 2008-08-16 at 13:51 -0400, Isaac Dupree wrote:
> Duncan Coutts wrote:
> > [...]
> >
> > I'm not saying it's a problem with your proposal, I'd just like it to be
> > taken into account. For example do dependency chasers need to grok just
> > import lines and {-# SOURCE #-} pragmas or do they need to calculate
> > fixpoints.
>
> Good point. What does the dependency chaser need to figure out?
> - exactly what dependency order files must be compiled
> (e.g., ghc -c) ?
> - what files (e.g., .hi) are needed to be findable by the
> e.g. (ghc -c) ?
> - recompilation avoidance?

It needs to work out which files the compiler will read when it
compiles that module.

So currently, I think we just have to read a single .hs file and
discover what modules it imports. We then can map those to .hi or
.hs-boot files in one of various search dirs or packages. We also need
to look at {-# SOURCE #-} import pragmas since that means we look for a
different file to ordinary imports.

Calculating dependency order and recompilation avoidance are things the
dep program has to do itself anyway.
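
To make "grok just import lines and SOURCE pragmas" concrete, here is a
deliberately naive sketch of such a scanner. The names `Import` and
`scanImports` are hypothetical, and this is not how Cabal, hmake or GHC
do it: it works line by line, ignores comments, layout and CPP, and only
recognises the pragma when it is written with spaces as
`{-# SOURCE #-}`.

```haskell
data Import = Import
  { importedModule :: String
  , viaSource      :: Bool   -- True for: import {-# SOURCE #-} M
  } deriving (Eq, Show)

-- Collect the imports of one module from its source text.
-- Only lines whose first word is "import" are considered.
scanImports :: String -> [Import]
scanImports src =
  [ imp | l <- lines src, Just imp <- [parseLine (words l)] ]
  where
    parseLine ("import" : rest) = fromRest False rest
    parseLine _                 = Nothing

    -- Skip the pragma and "qualified", then take the module name,
    -- cutting off an import list glued to it, as in "A(x)".
    fromRest _ ("{-#" : "SOURCE" : "#-}" : rest) = fromRest True rest
    fromRest s ("qualified" : rest)              = fromRest s rest
    fromRest s (name : _) = Just (Import (takeWhile (/= '(') name) s)
    fromRest _ []         = Nothing
```

A chaser built this way needs no type information and no fixpoint
calculation; it only has to map each `Import` to a file to look for,
choosing a different file when `viaSource` is set.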
The basic task is just working out what things compiling a .hs file
depends on. Obviously it's somewhat dependent on the Haskell
implementation.

Duncan

From ramin.honary at gmail.com Sun Aug 17 10:45:21 2008
From: ramin.honary at gmail.com (Ramin)
Date: Sun Aug 17 10:44:18 2008
Subject: New language feature: array-types
Message-ID: <48A83981.6050605@gmail.com>

I am new to both the Haskell language, and to this group, but I have
recently become obsessed with the mathematical precision of Haskell,
and I want to help improve it, if members of the group could bear with
me.

The one thing I dislike about the Haskell language is its treatment of
arrays. I don't understand how things work internally to the system,
and perhaps array-manipulating code can be efficiently optimized, but I
would prefer to have a language feature for explicitly creating and
modifying arrays in a way that does not require the entire array be
copied on every update.

My idea is this: a fixed-width mutable array can be declared as a type,
much like a list, but can be evaluated more like a bitwise-integer
operation.

  -- in an array of A's set the 5th item in the array with an
  -- "initial-A" value
  changeArrayFunc :: a^10 -> a^10
  changeArrayFunc ar = (ar:5 <- initA)
  -- returns an array which is the same as the old array, except the
  -- value at index 5 is different.

  -- select the 5th item of an array
  getArrayItemFunc :: a^10 -> a
  getArrayItemFunc ar = (ar:5)

Here, I use the caret (^) operator to indicate an "array-10" type. I
used caret because it is typically used for exponents -- as in, if your
data type has N possible values, an array with 10 units would have N^10
possible values. Then, I use the colon (:) operator to do indexing.
I've seen the various proposals for indexing operators and honestly I
don't care which operator is chosen; I just use the colon operator
because that is how lists are treated in pattern matching.
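
For comparison, a length can already be carried in a type without new
syntax. The sketch below is only an illustration under stated
assumptions: it relies on present-day GHC extensions (DataKinds, GADTs,
KindSignatures), the names `Nat`, `Vec`, `vhead`, `vmap` and `vtoList`
are invented for the example, and the structure is a linked list rather
than a mutable array, so it says nothing about avoiding copies on
update.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Natural numbers, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- A list whose length is part of its type.
data Vec (n :: Nat) a where
  Nil  :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

-- Accepts only non-empty vectors; `vhead Nil` is rejected by the
-- type checker instead of failing at run time.
vhead :: Vec ('S n) a -> a
vhead (Cons x _) = x

-- Length-preserving map: input and output share the same index n.
vmap :: (a -> b) -> Vec n a -> Vec n b
vmap _ Nil         = Nil
vmap f (Cons x xs) = Cons (f x) (vmap f xs)

-- Forget the length again.
vtoList :: Vec n a -> [a]
vtoList Nil         = []
vtoList (Cons x xs) = x : vtoList xs
```

Passing a vector of one length where another is expected is a type
error here, which is the compile-time check that the `a^10` notation is
after.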

The important thing is that an array-type exists, that way all
bounds-checking can be done at compile-time by the typing system.
Obviously, this will increase the complexity of the typing system.
Using arrays of different lengths would result in a compile-time error:

  data Something = Array Int^10

  change5Array :: Int^5 -> Int^5
  change5Array ar = ((ar:4) <- 0)

  -- Something has an array of type Int^10, but calls "change5Array"
  -- which expects an Int^5
  badFunc :: Something -> Int
  badFunc (Array x) = (change5Array x)

  -- COMPILE-TIME ERROR
  -- Arrays.hs:8:16:
  --     Couldn't match expected type `Int^5' against inferred type `Int^10'
  --     In the first argument of `change5Array', namely `x'
  --     In the expression: (change5Array x)
  --     In the definition of `badFunc':
  --         badFunc (Array x) = (change5Array x)

...or something like that.

An efficient implementation of array access would make Haskell very
useful for a much wider variety of computationally intensive
applications. Haskell could be used to efficiently model memory access,
provided that the interpreter knew not to "copy" arrays upon update,
but simply to update a value within an array. If arrays were a language
feature of Haskell, then this optimization could be guaranteed.

If anyone takes the time to consider this idea, or to tell me why this
isn't necessary, I would be most grateful.

-- Ramin Honary

From lennart at augustsson.net Sun Aug 17 16:19:29 2008
From: lennart at augustsson.net (Lennart Augustsson)
Date: Sun Aug 17 16:18:34 2008
Subject: New language feature: array-types
In-Reply-To: <48A83981.6050605@gmail.com>
References: <48A83981.6050605@gmail.com>
Message-ID:

You can code array types with static bounds with the existing Haskell
type system.

On Sun, Aug 17, 2008 at 3:45 PM, Ramin wrote:
> [...]

From isaacdupree at charter.net Sun Aug 17 19:23:41 2008
From: isaacdupree at charter.net (Isaac Dupree)
Date: Sun Aug 17 19:22:51 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <48A7138E.50806@charter.net>
References: <48A58434.40607@charter.net> <1218906689.13639.104.camel@localhost> <48A7138E.50806@charter.net>
Message-ID: <48A8B2FD.4020405@charter.net>

Isaac Dupree wrote:
> Duncan Coutts wrote:
>> [...]
>>
>> I'm not saying it's a problem with your proposal, I'd just like it to be
>> taken into account. For example do dependency chasers need to grok just
>> import lines and {-# SOURCE #-} pragmas or do they need to calculate
>> fixpoints.

Actually, good point, Duncan, that got me thinking about what we need in
order to obviously not lose much/any of the .hs-boot efficiency.
(warning: another long post ahead, although the latter half of it is just
an example from GHC's source)

[and I re-read my post and wasn't sure about a few things, but maybe
better to get feedback first -- please tell me if I'm being too verbose
somewhere, too]

Let's look at the total imports of a .hs and its .hs-boot, as they
currently are for GHC. Either can be non-SOURCE imports (let's call them
NOSOURCE), SOURCE imports, or not importing that.

.hs:NOSOURCE, .hs-boot:NOSOURCE : okay
.hs:NOSOURCE, .hs-boot:SOURCE : okay
.hs:NOSOURCE, .hs-boot:not-imported : okay
.hs:SOURCE, .hs-boot:NOSOURCE : bad, if the .hs needs SOURCE, then
  probably so does the .hs-boot
.hs:SOURCE, .hs-boot:SOURCE : okay
.hs:SOURCE, .hs-boot:not-imported : okay
- the .hs-boot importing a module that the .hs doesn't is invalid, or at
  least useless [actually, see later example -- there may be reasons for
  this, but in that case, it doesn't hurt to also import the module in
  the .hs (assuming there's no syntactic/maintenance burden), and it
  provides better automatic error-checking to do so]

Given the limited amount of information a .hs-boot file (or
SOURCE-imported file, in my scheme) needs for being a boot-file, there is
no advantage to import the modules it depends on as NOSOURCE. The
compiler just has to be clever enough to ignore imports of functions that
it can't find out the type of. Also, currently using SOURCE requires the
imported module to have a .hs-boot. But it should work fine to look for a
.hi and use that in the absence of .hi-boot, because it has strictly a
superset of the information (so that my statement that "SOURCE is
superior to NOSOURCE when it works" can be truer, for the sake of
demonstration). [oops! I was wrong, it may need to NOSOURCE-import on
occasion to find out a function's type - more on that in a later post?]

Now, since the .hs-boot SOURCE vs NOSOURCE has been collapsed, I think we
can move mostly-all .hs-boot info into the .hs file. If the .hs-boot file
had imported something, the corresponding import in the .hs is imported
with {-#SOURCE_FOLLOW#-} (in addition to {-#SOURCE#-} or {-#NOSOURCE#-});
otherwise it's imported with {-#SOURCE_NOFOLLOW#-} (ditto). For
demonstration, I'll assume that all imports are annotated this way, with
two bits of information. Presumably all imports that aren't part of an
import loop are NOSOURCE (which includes all cross-package imports).

Now let's look at the dependency chaser.

NOSOURCE imports must not form a loop. They form dependency chains as
normal.

SOURCE imports depend on either a .hi or a .hi-boot for the imported
module.

When an X.hi-boot is demanded: only SOURCE_FOLLOW imports are
dependency-chased from X.hs, through any .hs modules that don't already
have a .hi or .hi-boot. In the case where .hs-boots worked, this *can*
avoid cycles. If this SOURCE_FOLLOW dependency DAG doesn't have any
cycles, then it should be as simple as calling (the fictional)
`ghc -source X.hs` to produce X.hi. If there are cycles, and it is
sometimes necessary*, GHC needs to be slightly smarter and be able to
produce all the .hi-boot files at once from any graph SCCs (loops) that
prevent it from being a DAG (e.g., `ghc -source X.hs Y.hs` to produce
X.hi-boot and Y.hi-boot). Note that it doesn't need to be particularly
smart here -- e.g., no type inference is done.

*necessary loops:

example 1, the data/declarations literally loop:

module X1 where { import Y1(Y); data X a = End a | Both Y Y; }
module Y1 where { import X1(X); data Y = Only (X (Maybe Y)); }

(or kind annotations could be required for these loops in general, e.g.
data X (a :: *) = ...)
[hmm, in this case actually all we need is the data left-hand-side, so we
could do this in two stages.
But that wouldn't work out so well if their RHSs contained
{-#UNPACK#-}!SomeNewtypeForInt where SomeNewtypeForInt was from the other
module. But that's an optimization that it might be okay not to do, as
long as it was consistently not done both for .hi-boot and .hi/.o; and it
could perhaps be doable]

example 2, there are just too many back-and-forths:

module X2 where { import Y2(Yb); data Xa = Xa; data Xc = Xc Yb; }
module Y2 where { import X2(Xa,Xc); data Yb = Yb Xa; data Yd = Yd Xc; }

This second one "could" also be accomplished if multiple different
.hs-boots were allowed per .hs, although it doesn't seem worth the
annotation!! such as using SOURCE_FOLLOW[0] or [1], [2]... I'm not even
going to try to write that! [oh wait, SOURCE[0->1] = SOURCE,
SOURCE[1->1] = SOURCE_FOLLOW, SOURCE[1->null] = SOURCE_NOFOLLOW, maybe
something can be done like that, more complicated in one way but perhaps
a bit sounder in another]

Now, SOURCE_NOFOLLOW is a bit of a hack, for a couple reasons:
- instances (especially orphans, and especially overlapping instances)
  may not always be imported when they should be.
- There may be some information that could be SOURCE-imported from the
  module if all its imports were SOURCE_FOLLOW, but not enough
  information was imported that way due solely to SOURCE_NOFOLLOW. That's
  probably okay though; after all, the presence/absence of explicit type
  signatures should have the same effect.

Any information that the shallow .hi-boot-making search can't figure out
just doesn't go into the .hi-boot (possibly leading to errors later...
perhaps the .hi-boot could store info saying which information existed
but it couldn't figure out, to enhance error messages if that info is
ever demanded.)

Obviously, Template Haskell can only be run if NOSOURCE-imported from
another module.

The stupid dependency chaser (the most complicated thing it does besides
parsing import statements is computing graph SCCs) will, of course, still
find a few more changed dependencies than really need to be recompiled;
as always, this is where GHC's fancier recompilation checking will come
into effect.

Some compilers might benefit from (require?) explicit import or export
lists in some places... also I wonder if perhaps items in export lists
should be markable as whether they're exported to SOURCE-importers
(although it doesn't seem necessary)

Obviously, annotating every import with both [NO]SOURCE and
SOURCE_[NO]FOLLOW is unreasonable! So let's look at inferring them.

Any import that's not explicitly annotated with [NO]SOURCE can default to
NOSOURCE if it's not part of an import-loop, or SOURCE if it is. NOSOURCE
is allowed as a pragma here as well as SOURCE. That is, the dependency
chaser assumes NOSOURCE, and if it finds a loop of imports that aren't
explicitly SOURCE imports, it converts all that aren't *explicitly*
NOSOURCE into SOURCE imports (if they're all explicitly NOSOURCE, it's an
error).

Since SOURCE_NOFOLLOW can easily break things, it really shouldn't be the
default. (And there's never any need of it for imports of modules that
aren't part of the current module cycle). However, we really don't want
to have to specify it on all imports within the loop -- .hs-boots manage
to only specify for modules that *are* needed. I suggest that
SOURCE_[NO]FOLLOW be allowed as a top-level pragma that says all
(following?) imports are annotated with that if they're not explicitly
annotated SOURCE_[NO]FOLLOW.
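
A minimal sketch of the [NO]SOURCE inference rule described above, under
some assumptions: all type and function names here (Annotation, Resolved,
loops, inferImports) are hypothetical and not part of GHC, modules are
plain strings, and the SOURCE_[NO]FOLLOW bit is ignored. It only uses
Data.Graph.stronglyConnComp from the containers package for the SCC
computation.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)

-- Hypothetical names throughout; none of this is GHC's actual API.

-- | How an import was written by the programmer.
data Annotation = ExplicitSource | ExplicitNoSource | Unannotated
  deriving (Eq, Show)

-- | What the dependency chaser finally treats the import as.
data Resolved = Source | NoSource
  deriving (Eq, Show)

type ModuleName = String

-- | Each module paired with the modules it imports, each import tagged.
type Imports tag = [(ModuleName, [(ModuleName, tag)])]

-- | Groups of modules that form import loops, following only those
-- imports whose tag satisfies the predicate.
loops :: (tag -> Bool) -> Imports tag -> [[ModuleName]]
loops keep mods = [ms | CyclicSCC ms <- stronglyConnComp nodes]
  where
    nodes = [(m, m, [n | (n, t) <- is, keep t]) | (m, is) <- mods]

-- | Assume NOSOURCE; inside a loop of not-explicitly-SOURCE imports,
-- turn every unannotated import into SOURCE. If NOSOURCE imports still
-- form a loop afterwards (they were all explicit), report an error.
inferImports :: Imports Annotation -> Either String (Imports Resolved)
inferImports mods
  | null leftover = Right resolved
  | otherwise =
      Left ("NOSOURCE import loop among: " ++ unwords (concat leftover))
  where
    candidates = loops (/= ExplicitSource) mods
    sameLoop m n = any (\g -> m `elem` g && n `elem` g) candidates
    resolve m (n, tag) = (n, r)
      where
        r = case tag of
          ExplicitSource -> Source
          ExplicitNoSource -> NoSource
          Unannotated
            | sameLoop m n -> Source
            | otherwise -> NoSource
    resolved = [(m, map (resolve m) is) | (m, is) <- mods]
    leftover = loops (== NoSource) resolved
```

For instance, if A and B import each other without annotations and C
imports A, this sketch resolves both A/B imports to SOURCE and leaves C's
import as NOSOURCE. Checking the leftover NOSOURCE graph for cycles is
one possible reading of "if they're all explicitly NOSOURCE, it's an
error"; a real chaser might report this differently.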

For example, let's take some random file from GHC's source that has a
[l]hs-boot file:

compiler/deSugar/DsExpr

current lhs-boot:

\begin{code}
module DsExpr where
import HsSyn ( HsExpr, LHsExpr, HsLocalBinds )
import Var ( Id )
import DsMonad ( DsM )
import CoreSyn ( CoreExpr )

dsExpr :: HsExpr Id -> DsM CoreExpr
dsLExpr :: LHsExpr Id -> DsM CoreExpr
dsLocalBinds :: HsLocalBinds Id -> CoreExpr -> DsM CoreExpr
\end{code}

current lhs: lots of imports, it will become obvious

proposed new lhs, just like old lhs but with a few pragmas inserted:

\begin{code}
-- ...
module DsExpr ( dsExpr, dsLExpr, dsLocalBinds, dsValBinds, dsLit ) where

{-# SOURCE_NOFOLLOW #-}

#include "HsVersions.h"

import Match
import MatchLit
import DsBinds
import DsGRHSs
import DsListComp
import DsUtils
import DsArrows
import {-# SOURCE_FOLLOW #-} DsMonad
import Name

#ifdef GHCI
import PrelNames
-- Template Haskell stuff iff bootstrapped
import DsMeta
#endif

import {-# SOURCE_FOLLOW #-} HsSyn
import TcHsSyn

-- NB: The desugarer, which straddles the source and Core worlds, sometimes
-- needs to see source types
import TcType
import Type
import {-# SOURCE_FOLLOW #-} CoreSyn
import CoreUtils

import DynFlags
import CostCentre
-- hmm, actually Var was not imported by the lhs,
-- only Id (which imports Var) ! It looks okay to
-- just annotate the Id import here:
import {-# SOURCE_FOLLOW #-} Id
-- Are there times where this would ever
-- be a terrible problem? Well, we could have
-- added a line
--import {-# SOURCE_FOLLOW #-} Var ( Id )
-- instead, which would not hurt much.
-- (if Var.Id were a different type than Id.Id,
-- compiling this DsExpr module would give a
-- simple ambiguity error, no risk of
-- hs vs. hs-boot inconsistency)
import PrelInfo
import DataCon
import TysWiredIn
import BasicTypes
import PrelNames
import SrcLoc
import Util
import Bag
import Outputable
import FastString
\end{code}

...

-Isaac

From lennart at augustsson.net Mon Aug 18 03:29:25 2008
From: lennart at augustsson.net (Lennart Augustsson)
Date: Mon Aug 18 03:28:27 2008
Subject: New language feature: array-types
In-Reply-To: <2a43aa3d0808180016l38888bebvf3e6faa6e1c49f8b@mail.gmail.com>
References: <48A83981.6050605@gmail.com> <2a43aa3d0808180016l38888bebvf3e6faa6e1c49f8b@mail.gmail.com>
Message-ID:

Have a look at
http://okmij.org/ftp/Haskell/eliminating-array-bound-check.lhs

On Mon, Aug 18, 2008 at 8:16 AM, Ramin Honary wrote:
> Really? Where the bounds checking can be done at compile-time? (Excluding
> cases where the array object is accessed by a constant value set at
> run-time...)
> I have seen an array type, but not a "bounded array" type where the size of
> the array is given in the type definition, (as I explained with the example
> in my last e-mail.)
> Then is there any work being done in Haskell prime to improve the efficiency
> of updating arrays. From the wiki pages I have read, it is impossible to
> make any array that updates faster than O(n).
> (Also, should I send replies to the haskell wiki?)

From lennart at augustsson.net Mon Aug 18 03:31:46 2008
From: lennart at augustsson.net (Lennart Augustsson)
Date: Mon Aug 18 03:30:47 2008
Subject: New language feature: array-types
In-Reply-To: <2a43aa3d0808180016l38888bebvf3e6faa6e1c49f8b@mail.gmail.com>
References: <48A83981.6050605@gmail.com> <2a43aa3d0808180016l38888bebvf3e6faa6e1c49f8b@mail.gmail.com>
Message-ID:

As for array updating, there are many ways to improve the O(n) update.
You can use a tree representation and get O(log n) for all operations.
You can use the array single threaded in the ST monad and get all the
usual array operation complexities.
Etc. etc.

On Mon, Aug 18, 2008 at 8:16 AM, Ramin Honary wrote:
> Really? Where the bounds checking can be done at compile-time? (Excluding
> cases where the array object is accessed by a constant value set at
> run-time...)
> I have seen an array type, but not a "bounded array" type where the size of
> the array is given in the type definition, (as I explained with the example
> in my last e-mail.)
> Then is there any work being done in Haskell prime to improve the efficiency
> of updating arrays. From the wiki pages I have read, it is impossible to
> make any array that updates faster than O(n).
> (Also, should I send replies to the haskell wiki?)

From dons at galois.com Mon Aug 18 04:19:57 2008
From: dons at galois.com (Don Stewart)
Date: Mon Aug 18 04:19:05 2008
Subject: New language feature: array-types
In-Reply-To:
References: <48A83981.6050605@gmail.com> <2a43aa3d0808180016l38888bebvf3e6faa6e1c49f8b@mail.gmail.com>
Message-ID: <20080818081957.GA29584@scytale.galois.com>

lennart:
> As for array updating, there are many ways to improve the O(n) update.
> You can use a tree representation and get O(log n) for all operations.
> You can use the array single threaded in the ST monad and get all the
> usual array operation complexities.

Or use a history/transaction list to average out the copy cost, or use
fusion to minimise the updates required.

Making pure arrays efficient is a lot of fun, but it's a library issue,
not a language one, necessarily.

-- Don

From ramin.honary at gmail.com Mon Aug 18 09:59:40 2008
From: ramin.honary at gmail.com (Ramin)
Date: Mon Aug 18 09:58:33 2008
Subject: New language feature: array-types
In-Reply-To: <20080818081957.GA29584@scytale.galois.com>
References: <48A83981.6050605@gmail.com> <2a43aa3d0808180016l38888bebvf3e6faa6e1c49f8b@mail.gmail.com> <20080818081957.GA29584@scytale.galois.com>
Message-ID: <48A9804C.60800@gmail.com>

Well, in C/C++, and most any other imperative languages (as you probably
know) is O(1) for both reading and updating arrays. Until Haskell can do
this, I don't think Haskell is a viable option for operating system
design, computer graphics, or embedded applications.
That's a shame because Haskell can do pretty much anything else, and much
better/safer than imperative languages -- at least until we get CPU's
specially designed to run Haskell code at the machine-level.

I was hoping that Haskell-prime would address this. I could be wrong but
it really comes down to whether or not the Haskell code can be optimized
to use arrays in the same way that a C program would. And this
optimization could be explicitly declared by the programmer if the
language allowed for it, right?

Don Stewart wrote:
> lennart:
>
>> As for array updating, there are many ways to improve the O(n) update.
>> You can use a tree representation and get O(log n) for all operations.
>> You can use the array single threaded in the ST monad and get all the
>> usual array operation complexities.
>>
>
> Or use a history/transaction list to average out the copy cost, or use
> fusion to minimise the updates required.
>
> Making pure arrays efficient is a lot of fun, but it's a library issue,
> not a language one, necessarily.
>
> -- Don
>

From dons at galois.com Mon Aug 18 11:41:32 2008
From: dons at galois.com (Don Stewart)
Date: Mon Aug 18 11:40:32 2008
Subject: New language feature: array-types
In-Reply-To: <48A9804C.60800@gmail.com>
References: <48A83981.6050605@gmail.com> <2a43aa3d0808180016l38888bebvf3e6faa6e1c49f8b@mail.gmail.com> <20080818081957.GA29584@scytale.galois.com> <48A9804C.60800@gmail.com>
Message-ID: <20080818154132.GA30634@scytale.galois.com>

ramin.honary:
> Well, in C/C++, and most any other imperative languages (as you probably
> know) is O(1) for both reading and updating arrays. Until Haskell can do
> this,

The standard array types provide O(1) reading and updating, and have done
so for the last 15 years.
See Data.Array.MArray and Data.Array.ST

http://haskell.org/ghc/docs/latest/html/libraries/array/Data-Array-MArray.html
http://haskell.org/ghc/docs/latest/html/libraries/array/Data-Array-ST.html

-- Don

From cdsmith at gmail.com Wed Aug 20 17:16:50 2008
From: cdsmith at gmail.com (Chris Smith)
Date: Wed Aug 20 17:48:56 2008
Subject: New language feature: array-types
References: <48A83981.6050605@gmail.com> <2a43aa3d0808180016l38888bebvf3e6faa6e1c49f8b@mail.gmail.com> <20080818081957.GA29584@scytale.galois.com> <48A9804C.60800@gmail.com>
Message-ID:

Ramin wrote:
> Well, in C/C++, and most any other imperative languages (as you probably
> know) is O(1) for both reading and updating arrays. Until Haskell can do
> this, I don't think Haskell is a viable option for operating system
> design, computer graphics, or embedded applications.

There are two issues here, which I think were unnecessarily tangled
together in your original post. First: you proposed arrays whose size is
known and accesses checked at compile-time. Second: you proposed making
arrays mutable so as to recover the expected time bounds for operations.

The first is possible, but considerably more complex than your original
post made it sound.

The second, though, is already there in Haskell as it stands. Just use
STArray, for example. The big difference is that with STArray, the side
effects of your code, and the fact that the original copy of the array is
destroyed, are acknowledged by the type system. You tried to give
changeArrayFunc a type of a^10 -> a^10: a type that implies it computes a
new value, but leaves the original array alone. Unless it does some
analysis, that could be arbitrarily complex in general, to prove I never
use the original array again, the compiler can *not* generate destructive
code in this case. The ST monad *is* the general answer to this problem,
so you should just use STArray, and you get what you want.
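
A minimal sketch of the STArray approach, using only the standard array
package (runSTArray, thaw and writeArray from Data.Array.ST). The names
applyUpdates and example are made up for illustration. The pure input
array is copied once by thaw; every write after that is a destructive
O(1) update that cannot be observed from outside runSTArray.

```haskell
import Data.Array (Array, elems, listArray)
import Data.Array.ST (runSTArray, thaw, writeArray)

-- | Apply a batch of (index, value) updates to a copy of the array.
-- The type is still pure: the mutation stays inside runSTArray.
applyUpdates :: [(Int, e)] -> Array Int e -> Array Int e
applyUpdates updates arr = runSTArray $ do
  marr <- thaw arr                                -- one O(n) copy
  mapM_ (\(i, x) -> writeArray marr i x) updates  -- O(1) each, in place
  return marr

-- | Roughly the "change item 5 of a 10-element array" case from the
-- original proposal, with the array size checked at run time only.
example :: [Int]
example = elems (applyUpdates [(5, 0)] (listArray (0, 9) [1 .. 10]))
```

The original array passed to applyUpdates is left untouched, which is
why one copy is needed up front; code that builds the array inside ST
from the start (for example with newArray) avoids even that copy.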

-- Chris Smith

From cdsmith at gmail.com Thu Aug 21 09:59:20 2008
From: cdsmith at gmail.com (Chris Smith)
Date: Thu Aug 21 09:58:19 2008
Subject: Mutually-recursive/cyclic module imports
References: <48A58434.40607@charter.net> <20080815135022.GA8761@matrix.chaos.earth.li>
Message-ID:

Ian Lynagh wrote:
> http://hackage.haskell.org/trac/haskell-prime/wiki/Defaulting#Proposal4-removedefaulting
>

Here's a late response to the comments on that wiki page.

It seems, to me, an extremely bad idea to remove defaulting *and* make
that proposed change to (^) at the same time. Code can currently depend
on defaulting, and if it does, then integers most likely default to the
Integer type, not Int. If you just remove defaulting, then that code
fails to compile, which is fine. If you remove defaulting and change (^)
at the same time, then that code compiles but means something different,
which is definitely not fine.

It may initially seem like there's no problem, since no one would
possibly want to use a number anywhere near 2 billion as an exponent for
(^). The problem here is that if one is allowing types to be inferred
(which is certainly true, if we're worried about defaulting), then that
use of the horribly unsafe Int type can propagate through the code. Do I
use this number as an exponent, and then also add it to x somewhere else?
Then x is an Int as well. Then maybe I calculate (x * y) somewhere else?
Okay, now y is an Int. And perhaps y is added to z? So, z is an Int. But
maybe z overflows... and now a nasty bug, a numeric overflow in z, was
introduced without changing my code, without a warning, by the change to
the type of (^) which is used four functions away.
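
A small sketch of that propagation, with some assumptions: powInt is a
made-up stand-in for a (^) whose exponent is fixed to Int, and chain is a
made-up function deliberately left without a type signature so that
inference decides everything. The wrap-around shown is what GHC does for
Int; the Haskell report leaves overflow behaviour to the implementation.

```haskell
-- | Hypothetical stand-in for the proposed (^): exponent fixed to Int.
powInt :: Num a => a -> Int -> a
powInt = (^)

-- No signature on purpose. n is an exponent, so n :: Int; n is added to
-- x, so x :: Int; x is multiplied by y, so y :: Int; y is added to z,
-- so z :: Int. The inferred type is
--   chain :: Int -> Int -> Int -> Int -> (Integer, Int, Int, Int)
chain n x y z = ((2 :: Integer) `powInt` n, n + x, x * y, y + z)

-- | The same sum as the last component of chain, but at Integer, which
-- is what defaulting would have picked: no overflow is possible.
safeSum :: Integer
safeSum = 2 + toInteger (maxBound :: Int)
```

With GHC, the last component of `chain 3 1 2 maxBound` comes out
negative because 2 + maxBound wraps around, whereas safeSum stays
positive. Nothing in chain mentions Int; the type arrives purely through
the exponent position of powInt.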

-- Chris

From john at repetae.net Tue Aug 26 19:31:33 2008
From: john at repetae.net (John Meacham)
Date: Tue Aug 26 19:30:06 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <48A58434.40607@charter.net>
References: <48A58434.40607@charter.net>
Message-ID: <20080826233133.GF15616@sliver.repetae.net>

A very good account of what it actually "means" to have recursive modules
is presented in this paper:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.8816

in jhc, it is implemented in the module FrontEnd.Exports.

I think the main rule that should be followed is that the name
resolutions generated with the hs-boot files should _never_ conflict with
the name resolution that would happen were the compiler to support full
mutually recursive modules as described in the above paper. That way
tools that need them can use the hs-boot files and tools that don't can
ignore them and still be guaranteed to get the same results. This doesn't
necessarily mean that implementations need to support full recursive
modules, just that where they do, they don't conflict with a full
implementation.

In terms of dependency chasing, why not have the tools do it? ghc has the
'-M' option, though in its current form it isn't very convenient to use
(I always have to postprocess its output); it shouldn't be too tough to
beef it up a little.

Though, I would love it if haddock performed full recursive inter-module
name resolution. If anyone wants to use jhc's code to achieve these
goals, I will happily relicense any parts wanted under the MIT/2 clause
bsd or ghc license upon asking.

John

--
John Meacham - ⑆repetae.net⑆john⑈

From john at repetae.net Tue Aug 26 19:33:48 2008
From: john at repetae.net (John Meacham)
Date: Tue Aug 26 19:32:21 2008
Subject: Mutually-recursive/cyclic module imports
In-Reply-To: <20080826233133.GF15616@sliver.repetae.net>
References: <48A58434.40607@charter.net> <20080826233133.GF15616@sliver.repetae.net>
Message-ID: <20080826233348.GG15616@sliver.repetae.net>

On Tue, Aug 26, 2008 at 04:31:33PM -0700, John Meacham wrote:
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.8816

Doh! wrong paper.

http://portal.acm.org/citation.cfm?id=581690.581692

anyone have a free link?

John

--
John Meacham - ⑆repetae.net⑆john⑈