From twanvl at gmail.com Thu Mar 8 15:00:51 2007
From: twanvl at gmail.com (Twan van Laarhoven)
Date: Thu Mar 8 14:53:46 2007
Subject: Deriving Functor
Message-ID: <45F06B73.3060802@gmail.com>

Hello,

I would like to propose to add a way to automatically derive instances
of Functor. From looking at existing code, it seems that almost all
Functor instances I see are derivable using the algorithm presented
here, resulting in less boilerplate code. This proposal is compatible
with Haskell98 (and therefore also with Haskell').

Let's start with an example. The following declaration:

> data Tree a = Leaf | Node (Tree a) a (Tree a)
>     deriving Functor

would generate the following Functor instance:

> instance Functor Tree where
>     fmap f (Leaf      ) = Leaf
>     fmap f (Node l a r) = Node (fmap f l) (f a) (fmap f r)

To be able to derive Functor in a general way, more classes are needed
to support functors over other parameters:

> class Functor2 f where fmap2 :: (a -> b) -> f a x -> f b x
> class Functor3 f where fmap3 :: (a -> b) -> f a x y -> f b x y
> -- etc.

Provided instances would be:

> instance Functor  ((,) a) -- currently in Control.Monad.Instances
> instance Functor2 (,)
> instance Functor  ((,,) a b)
> instance Functor2 ((,,) a)
> instance Functor3 (,,)
> instance Functor  ((,,,) a b c)
> instance Functor2 ((,,,) a b)
> instance Functor3 ((,,,) a)
> instance Functor4 (,,,)
> -- etc.

Also, contravariant functors can come up:

> class CoFunctor  f where cofmap  :: (a -> b) -> f b -> f a
> class CoFunctor2 f where cofmap2 :: (a -> b) -> f b x -> f a x
> -- etc.

Now, to derive Functor for a data type

> data D a = C1 u v w | C2 x y z | ...

The instance would be:

> instance Functor D where
>     fmap f d = case d of
>         C1 q r s -> C1 (fmap_<u> f q) (fmap_<v> f r) (fmap_<w> f s)
>         C2 t u v -> C2 (fmap_<x> f t) (fmap_<y> f u) (fmap_<z> f v)
>         ...

With the appropriate context.
Here fmap_<b> is the deriving scheme to derive a functor over type b,
parameterized by the type variable a:

> fmap_<a>        f = f
> fmap_<b>        f = id -- b does not contain a
> fmap_<T b>      f = fmap (fmap_<b> f)
> fmap_<T b c>    f = fmap2 (fmap_<b> f) . fmap (fmap_<c> f)
> --etc.
> fmap_<x -> y>   f = \u -> fmap_<y> f . u . cofmap_<x> f
>
> cofmap_<b>      f = id -- b does not contain a
> cofmap_<T b>    f = cofmap (fmap_<b> f)
> cofmap_<T b c>  f = cofmap2 (fmap_<b> f) . cofmap (fmap_<c> f)
> --etc.
> cofmap_<x -> y> f = \u -> cofmap_<y> f . u . fmap_<x> f

Before type checking to determine the required instances, the
transformations

    fmapN   id --> id
    cofmapN id --> id

must be applied. Otherwise unnecessary instances will be required, see
the State example below.

Here are some examples of the deriving scheme. The derived instances are
exactly as you would expect:

> data Tree a = Leaf | Node (Tree a) a (Tree a)

The instance is derived as:

> fmap f d = case d of
>     Leaf       -> Leaf
>     Node a b c -> Node (fmap_<Tree a> f a) (fmap_<a> f b)
>                        (fmap_<Tree a> f c)
>                 = Node (fmap (fmap_<a> f) a) (f b)
>                        (fmap (fmap_<a> f) c)
>                 = Node (fmap f a) (f b) (fmap f c)

It also works for things like monad transformers:

> newtype StateT s m a = StateT (s -> m (a, s))
> fmap f (StateT a) = StateT b
>   where b = fmap_<s -> m (a, s)> f a
>           = fmap_<m (a, s)> f . a . cofmap_<s> f
>           = fmap_<m (a, s)> f . a . id
>           = fmap (fmap_<(a, s)> f) . a
>           = fmap (fmap2 (fmap_<a> f) . fmap (fmap_<s> f)) . a
>           = fmap (fmap2 f . fmap id) . a
>           = fmap (fmap2 f) . a
>           = \s -> fmap (\(a,s) -> (f a, s)) (a s)

Even for Cont:

> newtype Cont r a = ContT ((a -> r) -> r)
> fmap f (ContT a) = ContT b
>   where b = fmap_<(a -> r) -> r> f a
>           = fmap_<r> f . a . cofmap_<a -> r> f
>           = id . a . cofmap_<a -> r> f
>           = a . (\u -> cofmap_<r> f . u . fmap_<a> f)
>           = a . (\u -> id . u . f)
>           = a . (. f)

There are some (minor) problems with this approach.
First of all, the treatment of (->) is rather ad hoc. Consider:

> newtype Arrow a b = Arrow (a -> b) deriving (Functor, CoFunctor2)
> data A a = A (T a -> ()) deriving Functor
> data B a = B (Arrow (T a) ()) deriving Functor

In the first case the derived instance is:

> instance CoFunctor T => Functor A where
>     fmap f (A u) = A (u . cofmap f)

While for the second type the following is derived:

> instance (Functor T, Functor2 Arrow) => Functor B where
>     fmap f (B u) = B (fmap2 (fmap f) u)

Consider also:

> newtype Problem a = Problem (T (U a)) deriving Functor

Now there are two possible functor instances, depending on the instances
for T and U:

> instance Functor Problem where fmap f = fmap (fmap f)
> instance Functor Problem where fmap f = cofmap (cofmap f)

Currently the algorithm chooses the former; it will only use CoFunctor
if (->) is present, and it tries to get rid of it as soon as possible.

This also comes up when trying to derive an instance for this variation
of Cont, because it uses a type constructor in a contravariant position:

> data C r a = C ((a -> r, a -> r) -> r) deriving Functor

The derivation goes as follows:

> fmap f (C a) = C b
>   where b = fmap_<(a -> r, a -> r) -> r> f a
>           = fmap_<r> f . a . cofmap_<(a -> r, a -> r)> f
>           = id . a . cofmap_<(a -> r, a -> r)> f
>           = a . cofmap2 (fmap_<a -> r> f) . cofmap (fmap_<a -> r> f)
>           = a . cofmap2 (fmap_<a -> r> f)
>               . cofmap (\u -> fmap_<r> f . u . cofmap_<a> f)
>           = error, unable to realize: cofmap_<a>

The desired instance would be:

>       ... = a . cofmap_<(a -> r, a -> r)> f
>           = a . fmap2 (cofmap_<a -> r> f)
>               . fmap (cofmap_<a -> r> f)
>           = a . fmap2 (cofmap_<a -> r> f)
>               . fmap (\u -> cofmap_<r> f . u . fmap_<a> f)
>           = a . fmap2 (cofmap_<a -> r> f) . fmap (.f)
>           = a . fmap2 (.f) . fmap (.f)
>           = \(x,y) -> a (x . f, y . f)

However, I highly doubt this problem will come up in practice. A
'solution' would be to replace:

> cofmap_<T b c> f = cofmap2 (fmap_<b> f) . cofmap (fmap_<c> f)

with

> cofmap_<T b c> f = fmap2 (cofmap_<b> f) . fmap (cofmap_<c> f)

Thereby removing all uses of CoFunctor. Maybe that would be a better
definition?
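As a point of comparison, the instances that the scheme is meant to arrive at can be written out by hand. The following is only a sketch: the record field names (runStateT, runCont, runC) and the constructor name Cont are choices made for this illustration and are not part of the proposal, and the contravariant steps are inlined so that no CoFunctor class is needed. The last instance is the "desired instance" for the problematic type C.

```haskell
-- Hand-written counterparts of the instances derived in the message.

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Eq, Show)

instance Functor Tree where
  fmap _ Leaf         = Leaf
  fmap f (Node l a r) = Node (fmap f l) (f a) (fmap f r)

-- Field name runStateT is an assumption made for this sketch.
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

instance Functor m => Functor (StateT s m) where
  -- Map over the first tuple component inside m; the state is untouched.
  fmap f (StateT g) = StateT (fmap (\(a, s) -> (f a, s)) . g)

-- Constructor and field names are assumptions made for this sketch.
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  -- Pre-compose the continuation with f: a . (. f) in the message.
  fmap f (Cont g) = Cont (g . (. f))

-- The variation with a tuple in contravariant position.
newtype C r a = C { runC :: (a -> r, a -> r) -> r }

instance Functor (C r) where
  -- Pre-compose both continuations with f.
  fmap f (C g) = C (\(x, y) -> g (x . f, y . f))
```

Loading this into GHCi and evaluating, say, fmap (+1) (Node Leaf 1 Leaf) exercises the Tree instance; the other three can be checked through their run functions.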
Finally, if Data.Foldable and Data.Traversable are added to the
standard, they could be derived in a similar way.

Twan van Laarhoven

From ross at soi.city.ac.uk Sun Mar 11 08:10:39 2007
From: ross at soi.city.ac.uk (Ross Paterson)
Date: Sun Mar 11 08:03:34 2007
Subject: Deriving Functor
In-Reply-To: <45F06B73.3060802@gmail.com>
References: <45F06B73.3060802@gmail.com>
Message-ID: <20070311121039.GA3948@soi.city.ac.uk>

On Thu, Mar 08, 2007 at 09:00:51PM +0100, Twan van Laarhoven wrote:
> I would like to propose to add a way to automatically derive instances
> of Functor. From looking at existing code, it seems that almost all
> Functor instances I see are derivable using the algorithm presented
> here, resulting in less boilerplate code. This proposal is compatible
> with Haskell98 (and therefore also with Haskell').

I don't know if you've seen Ralf Hinze's "Polytypic values possess
polykinded types", but the map example there is relevant. (Also to the
semantics of newtype deriving that you posted a while ago.)

    http://www.informatik.uni-bonn.de/~ralf/publications/SCP.ps.gz

Even if you handle contravariant arguments (like the first argument of
->), there will still be things like

    newtype Endo a = Endo (a -> a)

From igloo at earth.li Tue Mar 13 16:41:27 2007
From: igloo at earth.li (Ian Lynagh)
Date: Tue Mar 13 16:41:31 2007
Subject: [GHC] #1215: GHC fails to respect the maximal munch rule while lexing "qualified reservedids"
In-Reply-To: <080.e87201f5fa747725ec539495f335891e@localhost>
References: <071.373fc89ca9b3a79eb71cb320c2a83ec6@localhost> <080.e87201f5fa747725ec539495f335891e@localhost>
Message-ID: <20070313204127.GB9210@matrix.chaos.earth.li>

Context if you haven't been following:
http://hackage.haskell.org/trac/ghc/ticket/1215

On Tue, Mar 13, 2007 at 03:12:33PM -0000, GHC wrote:
>
> Interesting.
It turns out I misinterpreted the Haskell lexical syntax: > GHC lexes `M.default` as `M` `.` `default`, because `M.default` is not a > valid qvarid but I neglected to take into account the maximal munch rule. > > We have an open ticket for Haskell' about this: > http://hackage.haskell.org/cgi-bin/haskell-prime/trac.cgi/wiki/QualifiedIdentifiers > which was until just now > inaccurate (I've now fixed it). I propose to fix GHC in 6.8 to match the > Haskell' proposal. If I understand correctly then the proposal would make e.g. foo = Bar.where a syntactically valid program, but one which would be guaranteed to fail to compile with a not-in-scope error? Wouldn't it be cleaner for it to be a lexical error? Unfortunately I'm not sure how to say this in the grammar; the best I can come up with is: program -> {lexeme | whitespace | error } error -> [ modid . ] reservedid Thanks Ian From simonmarhaskell at gmail.com Wed Mar 14 09:19:01 2007 From: simonmarhaskell at gmail.com (Simon Marlow) Date: Wed Mar 14 09:19:05 2007 Subject: [GHC] #1215: GHC fails to respect the maximal munch rule while lexing "qualified reservedids" In-Reply-To: <20070313204127.GB9210@matrix.chaos.earth.li> References: <071.373fc89ca9b3a79eb71cb320c2a83ec6@localhost> <080.e87201f5fa747725ec539495f335891e@localhost> <20070313204127.GB9210@matrix.chaos.earth.li> Message-ID: <45F7F645.80206@gmail.com> Ian Lynagh wrote: > Context if you haven't been following: > http://hackage.haskell.org/trac/ghc/ticket/1215 > > On Tue, Mar 13, 2007 at 03:12:33PM -0000, GHC wrote: >> Interesting. It turns out I misinterpreted the Haskell lexical syntax: >> GHC lexes `M.default` as `M` `.` `default`, because `M.default` is not a >> valid qvarid but I neglected to take into account the maximal munch rule. >> >> We have an open ticket for Haskell' about this: >> http://hackage.haskell.org/cgi-bin/haskell-prime/trac.cgi/wiki/QualifiedIdentifiers >> which was until just now >> inaccurate (I've now fixed it). 
I propose to fix GHC in 6.8 to match the >> Haskell' proposal. > > If I understand correctly then the proposal would make e.g. > > foo = Bar.where > > a syntactically valid program, but one which would be guaranteed to fail > to compile with a not-in-scope error? > > Wouldn't it be cleaner for it to be a lexical error? Unfortunately I'm > not sure how to say this in the grammar; the best I can come up with is: > > program -> {lexeme | whitespace | error } > error -> [ modid . ] reservedid Or make lexeme overlap with error, and do this: program -> { lexeme_ | whitespace } to make it clear that a valid program doesn't contain any error lexemes. But then people might wonder why the error production doesn't contain all the lexical errors. I don't really have a strong opinion here, but I lean towards not doing this, on the grounds that it's not strictly necessary and I'm a bit of a minimalist. The compiler is already free to report the error as a lexical error if it likes. Cheers, Simon From igloo at earth.li Fri Mar 16 11:50:38 2007 From: igloo at earth.li (Ian Lynagh) Date: Fri Mar 16 11:50:28 2007 Subject: strict bits of datatypes Message-ID: <20070316155038.GA32344@matrix.chaos.earth.li> Hi all, A while ago there was a discussion on haskell-cafe about the semantics of strict bits in datatypes that never reached a conclusion; I've checked with Malcolm and there is still disagreement about the right answer. The original thread is around here: http://www.haskell.org/pipermail/haskell-cafe/2006-October/018804.html but I will try to give all the relevant information in this message. The question is, given: data Fin a = FinCons a !(Fin a) | FinNil w = let q = FinCons 3 q in case q of FinCons i _ -> i is w 3 or _|_? ---------- The _|_ argument ---------- (Supporters include me, ghc and hugs) q = FinCons 3 q === (by Haskell 98 report 4.2.1/Strictness Flags/Translation q = (FinCons $ 3) $! q === (by definition of $, $!) 
q = q `seq` FinCons 3 q === (solution is least fixed point of the equation) q = _|_ Thus w = case _|_ of FinCons i _ -> i so w = _|_. ---------- The 3 argument ---------- (Supporters include Malcolm Wallace, nhc98 and yhc) Here I will just quote what Malcolm said in his original message: The definition of seq is seq _|_ b = _|_ seq a b = b, if a/= _|_ In the circular expression let q = FinCons 3 q in q it is clear that the second component of the FinCons constructor is not _|_ (it has at least a FinCons constructor), and therefore it does not matter what its full unfolding is. and in his recent e-mail to me: Yes, I still think this is a reasonable interpretation of the Report. I would phrase it as "After evaluating the constructor expression to WHNF, any strict fields contained in it are also be guaranteed to be in WHNF." This also makes q a fixpoint of q = q `seq` FinCons 3 q, but not the least fixed point. ---------- So I think it would be good if we can all agree on what the meaning should be, and then to clarify the wording in the report so that future readers understand it correctly too. Thanks Ian From apfelmus at quantentunnel.de Fri Mar 16 12:40:17 2007 From: apfelmus at quantentunnel.de (apfelmus@quantentunnel.de) Date: Fri Mar 16 12:41:10 2007 Subject: strict bits of datatypes In-Reply-To: <20070316155038.GA32344@matrix.chaos.earth.li> References: <20070316155038.GA32344@matrix.chaos.earth.li> Message-ID: Ian Lynagh wrote: > Here I will just quote what Malcolm said in his original message: > > The definition of seq is > seq _|_ b = _|_ > seq a b = b, if a/= _|_ > > In the circular expression > let q = FinCons 3 q in q > it is clear that the second component of the FinCons constructor is not > _|_ (it has at least a FinCons constructor), and therefore it does not > matter what its full unfolding is. Well, in a sense, it's exactly the defining property of strict constructors that they are not automatically different from _|_. 
The translation > q = FinCons 3 q > === (by Haskell 98 report 4.2.1/Strictness Flags/Translation > q = (FinCons $ 3) $! q is rather subtle: the first FinCons is a strict constructor whereas the second is "the real constructor". In other words, the translation loops as we could (should?) apply FinCons => \x y -> FinCons x $! y => \x y -> (\x' y' -> FinCons x' $! y') x $! y => ... ad infinitum. > and in his recent e-mail to me: > > Yes, I still think this is a reasonable interpretation of the Report. I > would phrase it as "After evaluating the constructor expression to WHNF, > any strict fields contained in it are also be guaranteed to be in WHNF." Referring to WHNF would break the report's preference of not committing to a particular evaluation strategy. That's already a good reason to stick with FinCons 3 _|_ = _|_. Besides, having let q = FinCons 3 q in q not being _|_ crucially depends on memoization. Even with the characterization by WHNF, let q x = FinCons 3 (q x) in q () is _|_. Regards, apfelmus From jon.fairbairn at cl.cam.ac.uk Fri Mar 16 13:00:15 2007 From: jon.fairbairn at cl.cam.ac.uk (=?utf-8?b?SsOzbiBGYWlyYmFpcm4=?=) Date: Fri Mar 16 13:00:38 2007 Subject: strict bits of datatypes References: <20070316155038.GA32344@matrix.chaos.earth.li> Message-ID: apfelmus@quantentunnel.de writes: > Besides, having > > let q = FinCons 3 q in q > > not being _|_ crucially depends on memoization. Does it? Mentally I translate that as let q = Y (\q -> FinCons 3 q) in q => Y (\q-> FinCons 3 q) => (\q -> FinCons 3 q) (Y (\q-> FinCons 3 q)) => FinCons 3 (Y (\q -> FinCons 3 q)) which, assuming a plausible lambda expression for FinCons, is a soluble term. > Even with the characterization by WHNF, > > let q x = FinCons 3 (q x) in q () > > is _|_. 
Again q = Y(\q x -> FinCons 3 (q x)) so q () => Y(\q x -> FinCons 3 (q x)) () => (\q x -> FinCons 3 (q x))(Y(\q x -> FinCons 3 (q x))) () => (\x -> FinCons 3 (Y(\q y-> FinCons 3 (q y)) x)) () => FinCons 3 (Y(\q x -> FinCons 3 (q x)) () ) which is in WHNF (and soluble too) -- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk From ross at soi.city.ac.uk Fri Mar 16 14:10:20 2007 From: ross at soi.city.ac.uk (Ross Paterson) Date: Fri Mar 16 14:10:10 2007 Subject: strict bits of datatypes In-Reply-To: References: <20070316155038.GA32344@matrix.chaos.earth.li> Message-ID: <20070316181020.GA6582@soi.city.ac.uk> On Fri, Mar 16, 2007 at 05:40:17PM +0100, apfelmus@quantentunnel.de wrote: > The translation > > > q = FinCons 3 q > > === (by Haskell 98 report 4.2.1/Strictness Flags/Translation > > q = (FinCons $ 3) $! q > > is rather subtle: the first FinCons is a strict constructor whereas the > second is "the real constructor". In other words, the translation loops > as we could (should?) apply > > FinCons > => \x y -> FinCons x $! y > => \x y -> (\x' y' -> FinCons x' $! y') x $! y > => ... > > ad infinitum. Yes, perhaps that ought to be fixed. But even so, this clearly implies that FinCons 3 _|_ = _|_ and thus that q is _|_ and nhc98/yhc have a bug. From iavor.diatchki at gmail.com Fri Mar 16 14:16:39 2007 From: iavor.diatchki at gmail.com (Iavor Diatchki) Date: Fri Mar 16 14:16:28 2007 Subject: strict bits of datatypes In-Reply-To: <20070316181020.GA6582@soi.city.ac.uk> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070316181020.GA6582@soi.city.ac.uk> Message-ID: <5ab17e790703161116y31e0d13cmf4cb3a8a87afb94d@mail.gmail.com> Hello, I also think that the first version is the correct one (i.e., the result is _|_). -Iavor On 3/16/07, Ross Paterson wrote: > On Fri, Mar 16, 2007 at 05:40:17PM +0100, apfelmus@quantentunnel.de wrote: > > The translation > > > > > q = FinCons 3 q > > > === (by Haskell 98 report 4.2.1/Strictness Flags/Translation > > > q = (FinCons $ 3) $!
q > > > > is rather subtle: the first FinCons is a strict constructor whereas the > > second is "the real constructor". In other words, the translation loops > > as we could (should?) apply > > > > FinCons > > => \x y -> FinCons x $! y > > => \x y -> (\x' y' -> FinCons x' $! y') x $! y > > => ... > > > > ad infinitum. > > Yes, perhaps that ought to be fixed. But even so, this clearly implies that > > FinCons 3 _|_ = _|_ > > and thus that q is _|_ and nhc98/yhc have a bug. From john at repetae.net Fri Mar 16 14:28:11 2007 From: john at repetae.net (John Meacham) Date: Fri Mar 16 14:28:00 2007 Subject: strict bits of datatypes In-Reply-To: References: <20070316155038.GA32344@matrix.chaos.earth.li> Message-ID: <20070316182811.GK4626@momenergy.repetae.net> On Fri, Mar 16, 2007 at 05:00:15PM +0000, Jón Fairbairn wrote: > Does it? Mentally I translate that as > > let q = Y (\q -> FinCons 3 q) in q > but it would actually translate to > let q = Y (\q -> q `seq` FinCons 3 q) in q for strict fields, whenever a constructor appears, it is translated to one which seq's its strict fields before creating the constructor. so, FinCons 3 q desugars to q `seq` FinCons 3 q wherever it appears, strict fields have no effect on deconstructing data types. John -- John Meacham - ?repetae.net?john? From apfelmus at quantentunnel.de Fri Mar 16 16:31:25 2007 From: apfelmus at quantentunnel.de (apfelmus@quantentunnel.de) Date: Fri Mar 16 16:35:57 2007 Subject: strict bits of datatypes In-Reply-To: References: <20070316155038.GA32344@matrix.chaos.earth.li> Message-ID: Jón Fairbairn wrote: > apfelmus@quantentunnel.de writes: > >> Besides, having >> >> let q = FinCons 3 q in q >> >> not being _|_ crucially depends on memoization. > > Does it?
Sorry for having introduced an extra paragraph, I meant that q =/= _|_ under the new WHNF-rule would depend on memoization. At the memory location of q, hereby marked with *q, evaluation would yield *q: q => *q: FinCons 3 q Now, this can be considered "ok" according to the rule because the data at the location is in WHNF and the second argument of FinCons is in WHNF as well because we just evaluated q to WHNF. By introducing an extra parameter, the memoization is gone and evaluation will yield q () => FinCons 3 (q ()) The point is that the second argument to FinCons is not in WHNF, so we have to evaluate that further in order to generate only values that conform to the new WHNF-rule. Of course, this evaluation will diverge now. With the above, I want to show that the proposed new WHNF-rule gives non-_|_ values in very special cases only. I don't think that these are worth it. Regards, apfelmus PS: Your derivations are fine in the case of a non-strict FinCons. But the point is to make it strict. From apfelmus at quantentunnel.de Fri Mar 16 16:36:10 2007 From: apfelmus at quantentunnel.de (apfelmus@quantentunnel.de) Date: Fri Mar 16 16:39:57 2007 Subject: strict bits of datatypes In-Reply-To: <20070316181020.GA6582@soi.city.ac.uk> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070316181020.GA6582@soi.city.ac.uk> Message-ID: Ross Paterson wrote: > On Fri, Mar 16, 2007 at 05:40:17PM +0100, apfelmus wrote: >> the translation loops >> as we could (should?) apply >> >> FinCons >> => \x y -> FinCons x $! y >> => \x y -> (\x' y' -> FinCons x' $! y') x $! y >> => ... >> >> ad infinitum. > > Yes, perhaps that ought to be fixed. But even so, this clearly implies that > > FinCons 3 _|_ = _|_ > > and thus that q is _|_ and nhc98/yhc have a bug. Yes, I agree completely. I should have separated the observation that the rewrite rule for the translation of strict constructors loops from the business with q.
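The part of the argument that everyone in the thread accepts, FinCons 3 _|_ = _|_, can be observed directly. The sketch below uses undefined as a stand-in for _|_ and a helper named raises, which is a name invented for this illustration; the lazy type Lazy is also an addition, included only for contrast. It does not settle the cyclic definition of q, which is deliberately not evaluated here because under the _|_ reading it does not terminate normally.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- The type from the thread: the second field is strict.
data Fin a = FinCons a !(Fin a) | FinNil

-- Same shape with a lazy second field, for comparison only.
data Lazy a = LazyCons a (Lazy a) | LazyNil

-- | True if forcing the argument to WHNF raises an exception.
-- The helper name is an assumption made for this sketch.
raises :: a -> IO Bool
raises x = do
  r <- try (evaluate x)
  pure (case r of
          Left e  -> const True (e :: SomeException)
          Right _ -> False)

-- Forcing the strict constructor also forces its strict field.
strictFieldIsForced :: IO Bool
strictFieldIsForced = raises (FinCons (3 :: Int) undefined)

-- The lazy constructor is already in WHNF; its field is left alone.
lazyFieldIsForced :: IO Bool
lazyFieldIsForced = raises (LazyCons (3 :: Int) undefined)
```

Under GHC, strictFieldIsForced is expected to yield True and lazyFieldIsForced False, matching the translation of FinCons 3 q to q `seq` FinCons 3 q described in the thread.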
Regards, apfelmus From lennart at augustsson.net Fri Mar 16 21:08:57 2007 From: lennart at augustsson.net (Lennart Augustsson) Date: Fri Mar 16 21:09:12 2007 Subject: strict bits of datatypes In-Reply-To: <20070316155038.GA32344@matrix.chaos.earth.li> References: <20070316155038.GA32344@matrix.chaos.earth.li> Message-ID: Given the translation of strict constructors I can't see anything but _|_ as the answer. On Mar 16, 2007, at 15:50 , Ian Lynagh wrote: > > Hi all, > > A while ago there was a discussion on haskell-cafe about the semantics > of strict bits in datatypes that never reached a conclusion; I've > checked with Malcolm and there is still disagreement about the right > answer. The original thread is around here: > http://www.haskell.org/pipermail/haskell-cafe/2006-October/018804.html > but I will try to give all the relevant information in this message. > > The question is, given: > > data Fin a = FinCons a !(Fin a) | FinNil > > w = let q = FinCons 3 q > in case q of > FinCons i _ -> i > > is w 3 or _|_? > > ---------- The _|_ argument ---------- > > (Supporters include me, ghc and hugs) > > q = FinCons 3 q > === (by Haskell 98 report 4.2.1/Strictness Flags/Translation > q = (FinCons $ 3) $! q > === (by definition of $, $!) > q = q `seq` FinCons 3 q > === (solution is least fixed point of the equation) > q = _|_ > > Thus > > w = case _|_ of > FinCons i _ -> i > > so w = _|_. > > > ---------- The 3 argument ---------- > > (Supporters include Malcolm Wallace, nhc98 and yhc) > > Here I will just quote what Malcolm said in his original message: > > The definition of seq is > seq _|_ b = _|_ > seq a b = b, if a/= _|_ > > In the circular expression > let q = FinCons 3 q in q > it is clear that the second component of the FinCons > constructor is not > _|_ (it has at least a FinCons constructor), and therefore it > does not > matter what its full unfolding is.
> > and in his recent e-mail to me: > > Yes, I still think this is a reasonable interpretation of the > Report. I > would phrase it as "After evaluating the constructor expression > to WHNF, > any strict fields contained in it are also be guaranteed to be > in WHNF." > > This also makes q a fixpoint of q = q `seq` FinCons 3 q, but not the > least fixed point. > > ---------- > > So I think it would be good if we can all agree on what the meaning > should be, and then to clarify the wording in the report so that > future > readers understand it correctly too. > > > Thanks > Ian From jon.fairbairn at cl.cam.ac.uk Sun Mar 18 07:03:21 2007 From: jon.fairbairn at cl.cam.ac.uk (=?utf-8?b?SsOzbiBGYWlyYmFpcm4=?=) Date: Sun Mar 18 07:03:26 2007 Subject: strict bits of datatypes References: <20070316155038.GA32344@matrix.chaos.earth.li> Message-ID: apfelmus@quantentunnel.de writes: > Jón Fairbairn wrote: > > apfelmus@quantentunnel.de writes: > > > >> Besides, having > >> > >> let q = FinCons 3 q in q > >> > >> not being _|_ crucially depends on memoization. > > > > Does it? > PS: Your derivations are fine in the case of a non-strict FinCons. But > the point is to make in strict. Yes, I was trying to be subtle but was too sleepy and lost the plot. -- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk From simonpj at microsoft.com Mon Mar 19 04:38:59 2007 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Mon Mar 19 04:38:54 2007 Subject: strict bits of datatypes In-Reply-To: <20070316182811.GK4626@momenergy.repetae.net> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070316182811.GK4626@momenergy.repetae.net> Message-ID: | FinCons 3 q | desugars to | | q `seq` FinCons 3 q wherever it appears, | | strict fields have no effect on deconstructing data types. That's GHC's behaviour too.
I think it's the right one too! (It's certainly easy to explain.) Simon From ahey at iee.org Mon Mar 19 11:08:30 2007 From: ahey at iee.org (Adrian Hey) Date: Mon Mar 19 11:08:11 2007 Subject: strict bits of datatypes In-Reply-To: References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070316182811.GK4626@momenergy.repetae.net> Message-ID: <45FEA76E.4080109@iee.org> Simon Peyton-Jones wrote: > | strict fields have no effect on deconstructing data types. > > That's GHC's behaviour too. I think it's the right one too! (It's certainly easy to explain.) This reminds me of something I discovered about using strict fields in AVL trees (with ghc). Using strict fields results in slower code than doing the `seq` desugaring by hand. If I have.. data AVL e = E | N !(AVL e) e !(AVL e) .. etc then presumably this.. case avl of N l e r -> N (f l) e r desugars to something like .. case avl of N l e r -> let l' = f l in l' `seq` r `seq` N l' e r but IMO it should desugar to.. case avl of N l e r -> let l' = f l in l' `seq` N l' e r which is what I ended up writing by hand all over the place (dropping the strictness annotation in the data type). That is, variables that have been obtained by matching strict fields (r in the case) should not be re-seqed if they are re-used in another strict context. Now this explanation for the slow down I observed is just speculation on my part (I don't actually know what ghc or any other compiler does). But on modern memory architectures, forcing the code to inspect heap records that it shouldn't have to inspect will be a bad thing. So semantically I agree with "strict fields have no effect on deconstructing data types", but they should have an effect on the code that an optimising compiler generates IMO. 
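The two styles being compared can be written side by side. This is only an illustrative sketch: the function names mapLeft, mapLeft', toList and toList' and the primed type AVL' are invented here, and the code makes no claim about which version compiles to faster code. It only shows that the strict-field version and the hand-written `seq` version compute the same tree.

```haskell
-- Style 1: strictness annotations on the fields; the constructor
-- application is responsible for forcing both subtrees.
data AVL e = E | N !(AVL e) e !(AVL e)

mapLeft :: (AVL e -> AVL e) -> AVL e -> AVL e
mapLeft _ E         = E
mapLeft f (N l e r) = N (f l) e r

-- Style 2: lazy fields, with the forcing written by hand. Only the
-- freshly computed left subtree is forced; r came out of an existing
-- node and is passed along untouched.
data AVL' e = E' | N' (AVL' e) e (AVL' e)

mapLeft' :: (AVL' e -> AVL' e) -> AVL' e -> AVL' e
mapLeft' _ E'         = E'
mapLeft' f (N' l e r) = let l' = f l in l' `seq` N' l' e r

-- In-order traversals, used only to compare results.
toList :: AVL e -> [e]
toList E         = []
toList (N l e r) = toList l ++ [e] ++ toList r

toList' :: AVL' e -> [e]
toList' E'         = []
toList' (N' l e r) = toList' l ++ [e] ++ toList' r
```

With style 2 the invariant "subtrees are always evaluated" is no longer enforced by the type, so every place that builds an N' has to force new subtrees itself.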
Regards -- Adrian Hey From simonpj at microsoft.com Mon Mar 19 11:22:29 2007 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Mon Mar 19 11:22:21 2007 Subject: strict bits of datatypes In-Reply-To: <45FEA76E.4080109@iee.org> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070316182811.GK4626@momenergy.repetae.net> <45FEA76E.4080109@iee.org> Message-ID: | This reminds me of something I discovered about using strict fields in | AVL trees (with ghc). Using strict fields results in slower code than | doing the `seq` desugaring by hand. That is bad. Can you send a test case that demonstrates this behaviour? | If I have.. | | data AVL e = E | | N !(AVL e) e !(AVL e) | .. etc | | then presumably this.. | | case avl of N l e r -> N (f l) e r | | desugars to something like .. | | case avl of N l e r -> let l' = f l | in l' `seq` r `seq` N l' e r | | but IMO it should desugar to.. | | case avl of N l e r -> let l' = f l | in l' `seq` N l' e r I agree. If it doesn't please let me know! Simon From igloo at earth.li Mon Mar 19 11:35:40 2007 From: igloo at earth.li (Ian Lynagh) Date: Mon Mar 19 11:35:20 2007 Subject: type aliases and Id Message-ID: <20070319153540.GA5028@matrix.chaos.earth.li> Hi all, Suppose I have a datatype: data Foo a = Foo { int :: a Int, char :: a Char } where I start off with (Foo Nothing Nothing) :: Foo Maybe, gradually accumulate values until I have (Foo (Just 5) (Just 'c')), and then I want to remove the Maybe type so I can lose all the now-redundant Just constructors. Well, suppose in actual fact I prefer the name "CanBe" to Maybe. Then for the first part I want type CanBe a = Maybe a foo :: Foo CanBe foo = ... but of course this fails because CanBe is a non-fully-applied type synonym in "foo :: Foo CanBe", and I can fix this by eta-reducing thus: type CanBe = Maybe foo :: Foo CanBe foo = ... Now for the second part I want type Id a = a foo' :: Foo Id foo' = ... but again Id is not fully applied. 
However, this time I cannot eta-reduce it! "type Id =" is a parse error, as is "type Id". I'd really like to be able to define an eta-reduced Id; I see two possibilities: * Allow "type Id =" (I prefer this to "type Id" as I think we are more likely to want to use the latter syntax for something else later on). * Implementations should eta-reduce all type synonyms as much as possible, e.g. type T a b c d = X a b Int c d is equivalent to type T a b = X a b Int and type Id a = a is equivalent to a type that cannot be expressed directly. Any opinions? Thanks Ian From ahey at iee.org Mon Mar 19 12:49:14 2007 From: ahey at iee.org (Adrian Hey) Date: Mon Mar 19 12:48:54 2007 Subject: strict bits of datatypes In-Reply-To: References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070316182811.GK4626@momenergy.repetae.net> <45FEA76E.4080109@iee.org> Message-ID: <45FEBF0A.3030105@iee.org> Simon Peyton-Jones wrote: > | This reminds me of something I discovered about using strict fields in > | AVL trees (with ghc). Using strict fields results in slower code than > | doing the `seq` desugaring by hand. > > That is bad. Can you send a test case that demonstrates this behaviour? OK, I'll try to put something reasonably simple together to test this again. Last time I tested it was a probably a couple of years ago so what I said might not be true of current ghc. The effect wasn't huge in any case (using strict constructors was about 5% slower). Regards -- Adrian Hey From stefan at cs.uu.nl Mon Mar 19 14:57:05 2007 From: stefan at cs.uu.nl (Stefan Holdermans) Date: Mon Mar 19 14:56:45 2007 Subject: type aliases and Id In-Reply-To: <20070319153540.GA5028@matrix.chaos.earth.li> References: <20070319153540.GA5028@matrix.chaos.earth.li> Message-ID: Ian, Mmm... > * Allow "type Id =" (I prefer this to "type Id" as I think we are more > likely to want to use the latter syntax for something else later > on). Looks kind of funny; I'm not too thrilled. 
> * Implementations should eta-reduce all type synonyms as much as > possible, e.g. > type T a b c d = X a b Int c d > is equivalent to > type T a b = X a b Int > and > type Id a = a > is equivalent to a type that cannot be expressed directly. I like this alternative a bit better, but I can also see how it introduces a lot of potential confusion, especially for novice Haskell programmers. You write something and the compiler goes along with something else... Maybe this will serve as a source of inspiration: http://portal.acm.org/citation.cfm?doid=581478.581496 [1]. Cheers, Stefan [1] Matthias Neubauer and Peter Thiemann. Type classes with more higher-order polymorphism. In Proceedings of the Seventh ACM SIGPLAN International Conference on Functional Programming (ICFP '02), Pittsburgh, Pennsylvania, USA, October 4-6, 2002, pages 179-190. ACM Press, 2002. From ravi at bluespec.com Mon Mar 19 15:26:20 2007 From: ravi at bluespec.com (Ravi Nanavati) Date: Mon Mar 19 15:25:58 2007 Subject: type aliases and Id In-Reply-To: <20070319153540.GA5028@matrix.chaos.earth.li> References: <20070319153540.GA5028@matrix.chaos.earth.li> Message-ID: <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> On 3/19/07, Ian Lynagh wrote: > > I'd really like to be able to define an eta-reduced Id; I see two > possibilities: > > * Allow "type Id =" (I prefer this to "type Id" as I think we are more > likely to want to use the latter syntax for something else later on). > > * Implementations should eta-reduce all type synonyms as much as > possible, e.g. > type T a b c d = X a b Int c d > is equivalent to > type T a b = X a b Int > and > type Id a = a > is equivalent to a type that cannot be expressed directly. > > > Any opinions? A third possibility is to have "Id" be a special primitive type constructor of kind * -> * that implementations handle internally. If you wanted to give it a different name you could use an eta-reduced type synonym for that, of course.
That's the approach I took when I needed an identity type function in the Bluespec compiler, and that worked out reasonably well. Part of the reason that worked out, though, is that we already had a normalization point during typechecking where certain special type constructors (related to numeric types) were cleaned out, so adding Id just extended that a little. I don't know whether adding such a constructor would be an equally simple change for Haskell implementations. And there's the separate argument that requiring eta-reduction of all type synonyms might be an interesting new feature in its own right (since I think you can say other new things beyond type Id a = a). - Ravi From lennart at augustsson.net Mon Mar 19 15:55:30 2007 From: lennart at augustsson.net (Lennart Augustsson) Date: Mon Mar 19 15:55:25 2007 Subject: type aliases and Id In-Reply-To: <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> References: <20070319153540.GA5028@matrix.chaos.earth.li> <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> Message-ID: Ravi, Ganesh and I were discussing today what would happen if one adds Id as a primitive type constructor. How much did you have to change the type checker? Presumably if you need to unify 'm a' with 'a' you now have to set m=Id. Do you know if you can run into higher order unification problems? My gut feeling is that with just Id, you probably don't, but I would not bet on it. Having Id would be cool. If we make an instance 'Monad Id' it's now possible to get rid of map and always use mapM instead. Similarly with other monadic functions. Did you do that in the Bluespec compiler?
-- Lennart On Mar 19, 2007, at 19:26 , Ravi Nanavati wrote: > > > On 3/19/07, Ian Lynagh wrote: I'd really like to > be able to define an eta-reduced Id; I see two > possibilities: > > * Allow "type Id =" (I prefer this to "type Id" as I think we are more > likely to want to use the latter syntax for something else later > on). > > * Implementations should eta-reduce all type synonyms as much as > possible, e.g. > type T a b c d = X a b Int c d > is equivalent to > type T a b = X a b Int > and > type Id a = a > is equivalent to a type that cannot be expressed directly. > > > Any opinions? > > A third possibility is to have "Id" be a special primitive type > constructor of kind * -> * that implementations handle internally. > If you wanted to give it different name you could use an eta- > reduced type synonym for that, of course. > > That's the approach I took when I needed an identity type function > in the Bluespec compiler, and that worked out reasonably well. Part > of the reason that worked out, though, is that we already had a > normalization point during typechecking where certain special type > constructors (related to numeric types) were cleaned out, so adding > Id just extended that a little. > > I don't know whether adding such a constructor would be an equally > simple change for Haskell implementations. And there's the separate > argument that requiring eta-reduction of all type synonyms might be > an interesting new feature in its own right (since I think you can > say other new things beyond type Id a = a). 
> > - Ravi > _______________________________________________ > Haskell-prime mailing list > Haskell-prime@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-prime From iavor.diatchki at gmail.com Mon Mar 19 17:39:56 2007 From: iavor.diatchki at gmail.com (Iavor Diatchki) Date: Mon Mar 19 17:39:36 2007 Subject: type aliases and Id In-Reply-To: References: <20070319153540.GA5028@matrix.chaos.earth.li> <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> Message-ID: <5ab17e790703191439j31d3e439ia94c6f5d942e416a@mail.gmail.com> Hello, On 3/19/07, Lennart Augustsson wrote: > Ravi, > > Ganesh and I were discussing today what would happen if one adds Id > as a primitive type constructor. How much did you have to change the > type checker? Presumably if you need to unify 'm a' with 'a' you now > have to set m=Id. Do you know if you can run into higher order > unification problems? My gut feeling is that with just Id, you > probably don't, but I would not bet on it. It seems to me that even with just ''Id'' the problem is tricky. Suppose, for example, that we need to solve ''f x = g y''. In the present system we can reduce this to ''f = g'' and ''x = y'''. However, if we had ''Id'', then we would have to delay this equation until we know more about the variables that are involved (e.g., the correct solution might be ''f = Id'' and ''x = g y''). -Iavor From john at repetae.net Mon Mar 19 20:25:04 2007 From: john at repetae.net (John Meacham) Date: Mon Mar 19 20:24:43 2007 Subject: type aliases and Id In-Reply-To: References: <20070319153540.GA5028@matrix.chaos.earth.li> <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> Message-ID: <20070320002504.GA13500@momenergy.repetae.net> On Mon, Mar 19, 2007 at 07:55:30PM +0000, Lennart Augustsson wrote: > Ganesh and I were discussing today what would happen if one adds Id > as a primitive type constructor. How much did you have to change the > type checker? 
Presumably if you need to unify 'm a' with 'a' you now > have to set m=Id. Do you know if you can run into higher order > unification problems? My gut feeling is that with just Id, you > probably don't, but I would not bet on it.

I have actually very much wanted this on a couple of occasions, the main one being the 'annotation problem': you have an abstract syntax tree that you might want to annotate with different types of data. The current solution is something like

    data Ef f = EAp (f E) (f E) | ELam Var (f E)
    newtype Identity x = Identity x
    -- annotate with nothing
    type E = Ef Identity
    -- annotate with free variables
    type EFV = Ef ((,) (Set.Set Var))

etc... Unfortunately, it makes the base case, when there are no annotations, very verbose. If we had (Id :: * -> *) then we could make

    type E = Ef Id

and just pretend the annotations aren't there.

> Having Id would be cool. If we make an instance 'Monad Id' it's now > possible to get rid of map and always use mapM instead. Similarly > with other monadic functions. > Did you do that in the Bluespec compiler?

I don't see how declaring instances for such a type synonym would be possible. Type synonyms are fully expanded before any type checking (they are basically just type-level macros). An unapplied 'Id' would not be able to expand to anything, so you would not be able to create an instance for it. This is a little odd in that the instance is properly kinded, yet still invalid, but I don't see that as a big issue, as there are other reasons instances can be invalid besides being improperly kinded....

John -- John Meacham - ⑆repetae.net⑆john⑈
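To make the 'annotation problem' concrete, here is a minimal sketch, not John's actual code. It assumes a variant of his `Ef` in which each child is wrapped as `f (Ef f)`, adds a hypothetical `EVar` constructor so that the tree has leaves, takes `Var` to be `String`, and uses `Identity` from `Data.Functor.Identity` in place of the hand-written newtype. The `Identity` wrappers that clutter `example` and the patterns in `annotate` are exactly the verbosity that a primitive `Id` would remove.

```haskell
import qualified Data.Set as Set
import Data.Functor.Identity (Identity (..))

-- Assumption: variables are plain strings.
type Var = String

-- Expression tree whose children are wrapped in an annotation functor f.
-- EVar is a hypothetical leaf constructor added for this sketch.
data Ef f
  = EVar Var
  | EAp (f (Ef f)) (f (Ef f))
  | ELam Var (f (Ef f))

-- Annotate with nothing.
type E = Ef Identity

-- Annotate every child with its set of free variables.
type EFV = Ef ((,) (Set.Set Var))

-- Compute the free variables of an expression and rebuild it with each
-- child paired with its own free-variable set.
annotate :: E -> (Set.Set Var, EFV)
annotate (EVar v) = (Set.singleton v, EVar v)
annotate (EAp (Identity a) (Identity b)) =
  let a'@(fa, _) = annotate a
      b'@(fb, _) = annotate b
  in (fa `Set.union` fb, EAp a' b')
annotate (ELam v (Identity body)) =
  let body'@(fb, _) = annotate body
  in (Set.delete v fb, ELam v body')

-- \x -> x y, written with the Identity wrappers the base case requires.
example :: E
example = ELam "x" (Identity (EAp (Identity (EVar "x")) (Identity (EVar "y"))))
```

With an `Id` such that `Id t` is the same type as `t`, `example` could presumably be written as `ELam "x" (EAp (EVar "x") (EVar "y"))` with no wrappers at all.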
From simonpj at microsoft.com Tue Mar 20 08:00:13 2007 From: simonpj at microsoft.com (Simon Peyton-Jones) Date: Tue Mar 20 08:00:05 2007 Subject: type aliases and Id In-Reply-To: References: <20070319153540.GA5028@matrix.chaos.earth.li> <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> Message-ID: | Ganesh and I were discussing today what would happen if one adds Id | as a primitive type constructor. How much did you have to change the | type checker? Presumably if you need to unify 'm a' with 'a' you now | have to set m=Id. Do you know if you can run into higher order | unification problems? My gut feeling is that with just Id, you | probably don't, but I would not bet on it. | | Having Id would be cool. If we make an instance 'Monad Id' it's now | possible to get rid of map and always use mapM instead. Similarly | with other monadic functions. I remember that I have, more than once, devoted an hour or two to the question "could one add Id as a distinguished type constructor to Haskell". Sadly, each time I concluded "no". I'm prepared to be proved wrong. But here's the difficulty. Suppose we want to unify (m a) with (Tree Int) At the moment there's no problem: m=Tree, a=Int. But with Id another solution is m=Id, a=Tree Int And there are more m=Id, a=Id (Tree Int) We don't know which one to use until we see all the *other* uses of 'm' and 'a'. I have no clue how to solve this problem. Maybe someone else does. I agree that Id alone would be Jolly Useful, even without full type-level lambdas. 
Simon From Malcolm.Wallace at cs.york.ac.uk Tue Mar 20 09:53:47 2007 From: Malcolm.Wallace at cs.york.ac.uk (Malcolm Wallace) Date: Tue Mar 20 09:58:17 2007 Subject: strict bits of datatypes In-Reply-To: <20070316155038.GA32344@matrix.chaos.earth.li> References: <20070316155038.GA32344@matrix.chaos.earth.li> Message-ID: <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> Ian Lynagh wrote: > data Fin a = FinCons a !(Fin a) | FinNil > w = let q = FinCons 3 q > in case q of > FinCons i _ -> i > > is w 3 or _|_? Knowing that opinions seem to be heavily stacked against my interpretation, nevertheless I'd like to try one more attempt at persuasion. The Haskell Report's definition of `seq` does _not_ imply an order of evaluation. Rather, it is a strictness annotation. (Whether this is the right thing to do is another cause of dissent, but let's accept the Report as is for now.) So `seq` merely gives a hint to the compiler that the value of its first argument must be established to be non-bottom, by the time that its second argument is examined by the calling context. The compiler is free to implement that guarantee in any way it pleases. So just as, in the expression x `seq` x one can immediately see that, if the second x is demanded, then the first one is also demanded, thus the `seq` can be elided - it is semantically identical to simply x Now, in the definition x = x `seq` foo one can also make the argument that, if the value of x (on the lhs of the defn) is demanded, then of course the x on the rhs of the defn is also demanded. There is no need for the `seq` here either. Semantically, the definition is equivalent to x = foo I am arguing that, as a general rule, eliding the `seq` in such a case is an entirely valid and correct transformation. The objection to this point of view is that if you have a definition x = x `seq` foo then, operationally, you have a loop, because to evaluate x, one must first evaluate x before evaluating foo. 
But as I said at the beginning, `seq` does _not_ imply order of evaluation, so the objection is not well-founded. Regards, Malcolm From robdockins at fastmail.fm Tue Mar 20 10:50:41 2007 From: robdockins at fastmail.fm (Robert Dockins) Date: Tue Mar 20 10:42:00 2007 Subject: strict bits of datatypes In-Reply-To: <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> Message-ID: <804B3207-C460-4CFF-BBFC-EE75B1368C34@fastmail.fm> On Mar 20, 2007, at 9:53 AM, Malcolm Wallace wrote: > Ian Lynagh wrote: > >> data Fin a = FinCons a !(Fin a) | FinNil >> w = let q = FinCons 3 q >> in case q of >> FinCons i _ -> i >> >> is w 3 or _|_? > > Knowing that opinions seem to be heavily stacked against my > interpretation, nevertheless I'd like to try one more attempt at > persuasion. > > The Haskell Report's definition of `seq` does _not_ imply an order of > evaluation. Rather, it is a strictness annotation. (Whether this is > the right thing to do is another cause of dissent, but let's accept > the > Report as is for now.) So `seq` merely gives a hint to the compiler > that the value of its first argument must be established to be > non-bottom, by the time that its second argument is examined by the > calling context. The compiler is free to implement that guarantee in > any way it pleases. > > So just as, in the expression > x `seq` x > one can immediately see that, if the second x is demanded, then the > first one is also demanded, thus the `seq` can be elided - it is > semantically identical to simply > x > > Now, in the definition > x = x `seq` foo > one can also make the argument that, if the value of x (on the lhs of > the defn) is demanded, then of course the x on the rhs of the defn is > also demanded. There is no need for the `seq` here either. 
> Semantically, the definition is equivalent to > x = foo > I am arguing that, as a general rule, eliding the `seq` in such a case > is an entirely valid and correct transformation. > > The objection to this point of view is that if you have a definition > x = x `seq` foo > then, operationally, you have a loop, because to evaluate x, one must > first evaluate x before evaluating foo. But as I said at the > beginning, > `seq` does _not_ imply order of evaluation, so the objection is not > well-founded. I just want to say that the argument I find most convincing is the 'least fixpoint' argument, which does not at all require assumptions about order of evaluation. I see your arguments as something along the lines "the non-bottom answer is a fixpoint of the equations, and therefore it is the correct answer". And, it is in fact _a_ fixpoint; but it is not the _least_ fixpoint, which would be bottom. To recap, the equation in question is: q = seq q (RealFinCons 3 q) It is not hard to see that q := _|_ is a fixpoint. Also, we have that q := LUB( _|_, RealFinCons 3 _|_, RealFinCons 3 (RealFinCons 3 _|_), ... ) is a fixpoint and is <> _|_. But that doesn't matter since _|_ is the least element of the domain and must therefore be the least fixpoint. > Regards, > Malcolm Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. 
-- TMBG From igloo at earth.li Tue Mar 20 10:55:26 2007 From: igloo at earth.li (Ian Lynagh) Date: Tue Mar 20 10:55:03 2007 Subject: strict bits of datatypes In-Reply-To: <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> Message-ID: <20070320145526.GA23752@matrix.chaos.earth.li> On Tue, Mar 20, 2007 at 01:53:47PM +0000, Malcolm Wallace wrote: > > Now, in the definition > x = x `seq` foo > one can also make the argument that, if the value of x (on the lhs of > the defn) is demanded, then of course the x on the rhs of the defn is > also demanded. There is no need for the `seq` here either. > Semantically, the definition is equivalent to > x = foo > I am arguing that, as a general rule, eliding the `seq` in such a case > is an entirely valid and correct transformation. So does nhc98 print "Foo" for this program? main = putStrLn $ let x = x `seq` "Foo" in x (yhc tells me my program has deadlocked, but my recent attempt to compile nhc98 failed so I can't check it). I don't fully understand what your interpretation is; is it also true that y = x x = y `seq` foo is equivalent to y = x x = foo ? And is it true that y = if True then x else undefined x = y `seq` foo is equivalent to y = x x = foo ? > The objection to this point of view is that if you have a definition > x = x `seq` foo > then, operationally, you have a loop, because to evaluate x, one must > first evaluate x before evaluating foo. But as I said at the beginning, > `seq` does _not_ imply order of evaluation, so the objection is not > well-founded. I'm having trouble finding a non-operational description of the behaviour I think seq should have. (Nor, for that matter, can I think of a description that makes it clear that it has the semantics that you think it should have). Anyone? 
I think you could make a similar argument that let x = x in x :: () is () rather than _|_, and similarly let x = x in x :: Int is 3, or is there some key difference I'm missing? Thanks Ian From bjpop at csse.unimelb.edu.au Tue Mar 20 11:21:51 2007 From: bjpop at csse.unimelb.edu.au (Bernie Pope) Date: Tue Mar 20 11:21:52 2007 Subject: strict bits of datatypes In-Reply-To: <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> Message-ID: <005d01c76b03$83876170$8a962450$@unimelb.edu.au> Malcolm wrote: > The Haskell Report's definition of `seq` does _not_ imply an order of > evaluation. Rather, it is a strictness annotation. That is an important point. > Now, in the definition > x = x `seq` foo > one can also make the argument that, if the value of x (on the lhs of > the defn) is demanded, then of course the x on the rhs of the defn is > also demanded. There is no need for the `seq` here either. > Semantically, the definition is equivalent to > x = foo > I am arguing that, as a general rule, eliding the `seq` in such a case > is an entirely valid and correct transformation. I think it is possible that both camps are right on this issue, as far as Haskell 98 stands. We can translate the definition of x into: x = fix (\y -> seq y foo) Isn't it the case that, denotationally, _|_ and foo are valid interpretations of the rhs? If we want to choose between them then we need something extra, such as an operational semantics, or a rule saying that we prefer the least solution. Perhaps I am just re-stating what Ian wrote in the beginning :) Cheers, Bernie. 
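The two candidate readings of `x = fix (\y -> seq y foo)` can be compared on a small explicit model. The sketch below is an illustration only, not GHC's behaviour: it assumes a flat lifted domain in which `Bottom` stands for _|_, and the names `Lifted`, `seqL`, `rhs` and `kleene` are hypothetical. It shows that both `Bottom` and the defined value are fixpoints of the right-hand side, while the approximation chain starting from `Bottom` never leaves `Bottom`.

```haskell
-- A flat lifted domain: Bottom models _|_, Value models a defined result.
data Lifted a = Bottom | Value a
  deriving (Eq, Show)

-- Model of seq on this domain: strict in its first argument.
seqL :: Lifted a -> Lifted b -> Lifted b
seqL Bottom _ = Bottom
seqL (Value _) r = r

-- The right-hand side of  x = x `seq` foo,  as a function of x.
rhs :: Lifted String -> Lifted String
rhs x = x `seqL` Value "foo"

-- n-th element of the chain  Bottom, rhs Bottom, rhs (rhs Bottom), ...
-- whose limit is the least fixpoint of rhs.
kleene :: Int -> Lifted String
kleene n = iterate rhs Bottom !! n

-- A value is a fixpoint of rhs when rhs maps it to itself.
isFixpoint :: Lifted String -> Bool
isFixpoint v = rhs v == v
```

Here `isFixpoint Bottom` and `isFixpoint (Value "foo")` both hold, so something extra is indeed needed to pick one; a least-solution rule picks `Bottom`, since every `kleene n` is `Bottom`.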
From claus.reinke at talk21.com Tue Mar 20 11:37:16 2007 From: claus.reinke at talk21.com (Claus Reinke) Date: Tue Mar 20 11:36:57 2007 Subject: strict bits of datatypes References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> Message-ID: <013901c76b05$a45e4b00$9b328351@cr3lt> x `seq` x === x and so x = x `seq` x === x = x but, in general, x = x `seq` foo =/= x = foo consider x = x `seq` ((1:) x) x = x `seq` ((1:) (x `seq` ((1:) x))) x = x `seq` ((1:) (x `seq` ((1:) (x `seq` ((1:) x))))) .. ignoring evaluation order, partially unrolling/substituting x also unrolls/substitutes applications of seq (in proper call-by-need, we couldn't even do the substitution until after the evaluation, so we'd never get this far). no part of the right-hand side is defined unless x is - that is what seq is about, isn't it? i do like the argument about different fixpoints - does that correspond to inductive vs coinductive definitions which are so often mixed in haskell? when working with cyclic structures, we often are quite happy without a base case. so instead of the inductive f 0 = True f n = f (n-1) -- provided that (f (n-1)) is defined, (f n) is defined we have things like nats = 1:map (+1) nats -- nats is partially defined if we just assume that nats is partially defined and it is nice to have both available, if not all that well separated. whether we're talking co-recursion or recursion seems to depend entirely on whether the recursive references are in a non-strict or strict context (so that the definitions are productive or not). so it seems to me that adding seq to a (co-)recursion is precisely about resolving this ambiguity and forcing induction nats = (1:) $! map (+1) nats -- nats is defined if we can prove that nats is defined, which we can't anymore and saying that we can get back to co-induction by eliding the seq may be correct, but irrelevant. 
and the same reasoning ought to apply to strict fields in data constructors. sorry for waving about precisely defined terms in such a naive manner. i hope it helps, and isn't too far off the mark!-) claus From carette at mcmaster.ca Tue Mar 20 11:41:58 2007 From: carette at mcmaster.ca (Jacques Carette) Date: Tue Mar 20 11:44:07 2007 Subject: type aliases and Id In-Reply-To: References: <20070319153540.GA5028@matrix.chaos.earth.li> <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> Message-ID: <460000C6.20001@mcmaster.ca> There is a general solution, but it essentially involves polymorphism a-la-Omega (or as in Coq's Calculus of Inductive Constructions). The most general description of (Tree Int) is as the Stream S = [Int, Tree, Id, Id, ...] You are now attempting to pull-off exactly 2 "terms" from that Stream. The solutions are: 0: Int, Tree 1: Tree Int, Id 2: Id (Tree Int), Id 3: Id (Id (Tree Int)), Id Let T@i denote n-ary type-level application, where T is a list of types, and i>=0. Consider the pair ( S!!(i+1), (take i S)@i) This is the /closed-form/ for the n'th solution (m, a) for unifying (m a) with (Tree Int). A better way to _represent_ this closed form is as (S, i) where S = [Int,Tree, Id, Id, ...] from which further constraints can decide what is the 'proper' value of i to take. This even shows how to do defaulting: in the absence of further information, take the smallest i possible. [I phrase it this way because there are times where constraints will force a certain minimum on i, but no maximum]. In other words, the above should be backwards compatible with current behaviour, since the 'default' solution would be m=Tree, a=Int. Jacques Simon Peyton-Jones wrote: > I remember that I have, more than once, devoted an hour or two to the question "could one add Id as a distinguished type constructor to Haskell". Sadly, each time I concluded "no". > > I'm prepared to be proved wrong. But here's the difficulty.
Suppose we want to unify > (m a) with (Tree Int) > > At the moment there's no problem: m=Tree, a=Int. But with Id another solution is > m=Id, a=Tree Int > > And there are more > m=Id, a=Id (Tree Int) > > We don't know which one to use until we see all the *other* uses of 'm' and 'a'. > > I have no clue how to solve this problem. Maybe someone else does. I agree that Id alone would be Jolly Useful, even without full type-level lambdas. > > Simon From ravi at bluespec.com Tue Mar 20 14:35:15 2007 From: ravi at bluespec.com (Ravi Nanavati) Date: Tue Mar 20 14:34:51 2007 Subject: type aliases and Id In-Reply-To: References: <20070319153540.GA5028@matrix.chaos.earth.li> <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> Message-ID: <7b977d860703201135h4633ed4tc59b9d27d075afb8@mail.gmail.com> On 3/19/07, Lennart Augustsson wrote: > > Ravi, > > Ganesh and I were discussing today what would happen if one adds Id > as a primitive type constructor. How much did you have to change the > type checker? All I did was add an expansion rule for my Id constructor to the place where the SizeOf pseudo-constructor was expanded. This basically means that if the typechecker sees Id applied to something, it is simplified out (as part of type-normalization / synonym expansion). > Presumably if you need to unify 'm a' with 'a' you now > have to set m=Id. Do you know if you can run into higher order > unification problems? My gut feeling is that with just Id, you > probably don't, but I would not bet on it. I didn't change the unification rules, so 'm a' would NOT unify with 'a' with m=Id. That was mainly because I didn't need that unification rule to handle the case where I used the Id constructor. > Having Id would be cool.
If we make an instance 'Monad Id' it's now > possible to get rid of map and always use mapM instead. Similarly > with other monadic functions. > Did you do that in the Bluespec compiler?

I did experiment with Monad Id while trying to sort something else out, and it seemed to work (the instance typechecked as well as some code that depended on that instance), but I didn't push it very hard.

Beyond that, in response to your note, I tried to see what would happen if I added a unification rule for 'm a' with 'a', with m=Id. There turned out to be three important details (to keep existing Bluespec code working):

- make sure to choose to unify 'm a' with a separate type variable 'b' directly, in preference to setting m=Id and unifying a with b. Upon reflection, that seemed reasonable because it didn't prevent setting m=Id later, in response to additional information.
- make sure not to replace m with Id, if m is a lexically-bound type variable
- make sure to expand type synonyms before attempting this sort of unification

After that I could typecheck simple things like the following (using an inferred identity monad):

    x :: Integer
    x = return 5

    y :: List Integer
    y = mapM (const 5) (cons 1 (cons 2 nil))

However, I did run into the problems Iavor and Simon mentioned. For example, the following program did not typecheck:

    z :: Bool -> Maybe (Integer)
    z p1 = let f a0 b0 = let a = return a0
                             b = return b0
                         in if p1 then a else b
           in f 5 (Just 6)

However, the same code *would* typecheck if I added the following type annotations:

    z :: Bool -> Maybe (Integer)
    z p1 = let f a0 b0 = let a = return (a0 :: Integer)
                             b = return (b0 :: Maybe Integer)
                         in if p1 then a else b
           in f 5 (Just 6)

So, I'd say the results were intriguing but not terribly satisfying. My guess is many common uses would work out, but there would be hard-to-explain corner cases where the typechecker would need guidance or a desired level of polymorphism couldn't be achieved.
- Ravi From nad at cs.chalmers.se Tue Mar 20 14:59:14 2007 From: nad at cs.chalmers.se (Nils Anders Danielsson) Date: Tue Mar 20 14:58:52 2007 Subject: strict bits of datatypes In-Reply-To: <005d01c76b03$83876170$8a962450$@unimelb.edu.au> (Bernie Pope's message of "Tue, 20 Mar 2007 15:21:51 -0000") References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> <005d01c76b03$83876170$8a962450$@unimelb.edu.au> Message-ID: On Tue, 20 Mar 2007, "Bernie Pope" wrote: > Malcolm wrote: >> x = x `seq` foo > We can translate the definition of x into: > > x = fix (\y -> seq y foo) > > Isn't it the case that, denotationally, _|_ and foo are valid > interpretations of the rhs? > > If we want to choose between them then we need something extra, such as an > operational semantics, or a rule saying that we prefer the least solution. We already have such a rule. According to the H98 report the let expression let x = x `seq` foo in ... can be translated into let x = fix (\~y -> y `seq` foo) in ... where "fix is the least fixpoint operator". Now, since seq ⊥ x = ⊥ we get that ⊥ is the least fixpoint of (\~y -> y `seq` foo), so the above _is_ equivalent to let x = ⊥ in ... (If x is also, by some other reading of the report, equivalent to some non-bottom expression, then the report should be fixed.)
-- /NAD From lennart at augustsson.net Tue Mar 20 16:25:12 2007 From: lennart at augustsson.net (Lennart Augustsson) Date: Tue Mar 20 16:25:02 2007 Subject: strict bits of datatypes In-Reply-To: <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> Message-ID: So we have an equation x = seq x foo In Haskell recursive equations are solved and you get the smallest fix point. The smallest solution to this equation is x = _|_, there are other solutions, but why should we deviate from giving the smallest fix point for seq when we do it for everything else? -- Lennart On Mar 20, 2007, at 13:53 , Malcolm Wallace wrote: > Ian Lynagh wrote: > >> data Fin a = FinCons a !(Fin a) | FinNil >> w = let q = FinCons 3 q >> in case q of >> FinCons i _ -> i >> >> is w 3 or _|_? > > Knowing that opinions seem to be heavily stacked against my > interpretation, nevertheless I'd like to try one more attempt at > persuasion. > > The Haskell Report's definition of `seq` does _not_ imply an order of > evaluation. Rather, it is a strictness annotation. (Whether this is > the right thing to do is another cause of dissent, but let's accept > the > Report as is for now.) So `seq` merely gives a hint to the compiler > that the value of its first argument must be established to be > non-bottom, by the time that its second argument is examined by the > calling context. The compiler is free to implement that guarantee in > any way it pleases. > > So just as, in the expression > x `seq` x > one can immediately see that, if the second x is demanded, then the > first one is also demanded, thus the `seq` can be elided - it is > semantically identical to simply > x > > Now, in the definition > x = x `seq` foo > one can also make the argument that, if the value of x (on the lhs of > the defn) is demanded, then of course the x on the rhs of the defn is > also demanded. 
There is no need for the `seq` here either. > Semantically, the definition is equivalent to > x = foo > I am arguing that, as a general rule, eliding the `seq` in such a case > is an entirely valid and correct transformation. > > The objection to this point of view is that if you have a definition > x = x `seq` foo > then, operationally, you have a loop, because to evaluate x, one must > first evaluate x before evaluating foo. But as I said at the > beginning, > `seq` does _not_ imply order of evaluation, so the objection is not > well-founded. > > Regards, > Malcolm > _______________________________________________ > Haskell-prime mailing list > Haskell-prime@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-prime From ross at soi.city.ac.uk Tue Mar 20 16:39:07 2007 From: ross at soi.city.ac.uk (Ross Paterson) Date: Tue Mar 20 16:39:05 2007 Subject: strict bits of datatypes In-Reply-To: <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070320135347.0d77a14b.Malcolm.Wallace@cs.york.ac.uk> Message-ID: <20070320203907.GA6499@soi.city.ac.uk> On Tue, Mar 20, 2007 at 01:53:47PM +0000, Malcolm Wallace wrote: > Now, in the definition > x = x `seq` foo > one can also make the argument that, if the value of x (on the lhs of > the defn) is demanded, then of course the x on the rhs of the defn is > also demanded. There is no need for the `seq` here either. > Semantically, the definition is equivalent to > x = foo > I am arguing that, as a general rule, eliding the `seq` in such a case > is an entirely valid and correct transformation. You're talking about demand, WHNF, etc, but the Report doesn't; it gives a simple denotational semantics for seq and recursive definitions, according to which the first definition is equivalent to x = _|_. 
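For reference, the denotational reading Ross appeals to can be written out in a few lines. This is a sketch of the standard argument, assuming the Report's two defining equations for seq and the usual least-fixpoint construction; the name F for the right-hand side of the definition is introduced here only for illustration.

```latex
\begin{align*}
\mathit{seq}\ \bot\ b &= \bot \\
\mathit{seq}\ a\ b &= b \quad \text{if } a \neq \bot \\[4pt]
F(x) &= \mathit{seq}\ x\ \mathit{foo} \\
F(\bot) &= \mathit{seq}\ \bot\ \mathit{foo} = \bot \\
\operatorname{fix} F &= \bigsqcup_{n \ge 0} F^{n}(\bot) = \bot
\end{align*}
```

Since F maps bottom to bottom, every element of the approximation chain is bottom, so the definition x = x `seq` foo denotes bottom under this reading, whatever foo is.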
From lennart at augustsson.net Tue Mar 20 18:58:04 2007 From: lennart at augustsson.net (Lennart Augustsson) Date: Tue Mar 20 18:57:51 2007 Subject: type aliases and Id In-Reply-To: References: <20070319153540.GA5028@matrix.chaos.earth.li> <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com> Message-ID: I don't think you need to produce 'a=Id (Tree Int)' since that reduces to 'a=Tree Int'. In general, you don't have to produce Id applied to anything, which gives me some hope that it's possible to add Id and still have decidable (and complete) type deduction. Perhaps a good topic for a research paper? -- Lennart On Mar 20, 2007, at 12:00 , Simon Peyton-Jones wrote: > | Ganesh and I were discussing today what would happen if one adds Id > | as a primitive type constructor. How much did you have to change > the > | type checker? Presumably if you need to unify 'm a' with 'a' you > now > | have to set m=Id. Do you know if you can run into higher order > | unification problems? My gut feeling is that with just Id, you > | probably don't, but I would not bet on it. > | > | Having Id would be cool. If we make an instance 'Monad Id' it's now > | possible to get rid of map and always use mapM instead. Similarly > | with other monadic functions. > > I remember that I have, more than once, devoted an hour or two to > the question "could one add Id as a distinguished type constructor > to Haskell". Sadly, each time I concluded "no". > > I'm prepared to be proved wrong. But here's the difficulty. > Suppose we want to unify > (m a) with (Tree Int) > > At the moment there's no problem: m=Tree, a=Int. But with Id > another solution is > m=Id, a=Tree Int > > And there are more > m=Id, a=Id (Tree Int) > > We don't know which one to use until we see all the *other* uses of > 'm' and 'a'. > > I have no clue how to solve this problem. Maybe someone else > does. I agree that Id alone would be Jolly Useful, even without > full type-level lambdas. 
>
> Simon

From simonpj at microsoft.com Wed Mar 21 04:36:30 2007
From: simonpj at microsoft.com (Simon Peyton-Jones)
Date: Wed Mar 21 04:36:12 2007
Subject: type aliases and Id
In-Reply-To:
References: <20070319153540.GA5028@matrix.chaos.earth.li> <7b977d860703191226p63e9e3d4w85122a281848670f@mail.gmail.com>
Message-ID:

| I don't think you need to produce 'a=Id (Tree Int)' since that
| reduces to 'a=Tree Int'.
| In general, you don't have to produce Id applied to anything, which
| gives me some hope that it's possible to add Id and still have
| decidable (and complete) type deduction.

Yes, that's true. But I still don't know how to do inference. Consider

        f :: forall m a. m a -> a -> [a]
        t :: Tree Int

and consider the call

        f t t

Well, this is perfectly well typed thus (I'll add the type applications
to make it totally clear):

        f Id (Tree Int) t t

That is, instantiate m=Id, a=Tree Int, and voila. The trouble is, when
unifying (m a) = (Tree Int), it's very unclear what to do.

Hmm. I suppose you might defer such unifications, instead gathering
them as constraints, and solving them only when you quantify. That's
the standard way to deal with tricky unification problems. It's
certainly a nice challenge.

Simon

|
| Perhaps a good topic for a research paper?
|
|   -- Lennart
|
| On Mar 20, 2007, at 12:00 , Simon Peyton-Jones wrote:
|
| > | Ganesh and I were discussing today what would happen if one adds Id
| > | as a primitive type constructor. How much did you have to change the
| > | type checker? Presumably if you need to unify 'm a' with 'a' you now
| > | have to set m=Id. Do you know if you can run into higher order
| > | unification problems? My gut feeling is that with just Id, you
| > | probably don't, but I would not bet on it.
| > |
| > | Having Id would be cool. If we make an instance 'Monad Id' it's now
| > | possible to get rid of map and always use mapM instead. Similarly
| > | with other monadic functions.
| >
| > I remember that I have, more than once, devoted an hour or two to
| > the question "could one add Id as a distinguished type constructor
| > to Haskell". Sadly, each time I concluded "no".
| >
| > I'm prepared to be proved wrong. But here's the difficulty.
| > Suppose we want to unify
| >         (m a) with (Tree Int)
| >
| > At the moment there's no problem: m=Tree, a=Int. But with Id
| > another solution is
| >         m=Id, a=Tree Int
| >
| > And there are more
| >         m=Id, a=Id (Tree Int)
| >
| > We don't know which one to use until we see all the *other* uses of
| > 'm' and 'a'.
| >
| > I have no clue how to solve this problem. Maybe someone else
| > does. I agree that Id alone would be Jolly Useful, even without
| > full type-level lambdas.
| >
| > Simon

From john at repetae.net Wed Mar 21 16:55:28 2007
From: john at repetae.net (John Meacham)
Date: Wed Mar 21 16:55:01 2007
Subject: strict bits of datatypes
In-Reply-To:
References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070316182811.GK4626@momenergy.repetae.net> <45FEA76E.4080109@iee.org>
Message-ID: <20070321205528.GN4626@momenergy.repetae.net>

On Mon, Mar 19, 2007 at 03:22:29PM +0000, Simon Peyton-Jones wrote:
> | This reminds me of something I discovered about using strict fields in
> | AVL trees (with ghc). Using strict fields results in slower code than
> | doing the `seq` desugaring by hand.
>
> That is bad. Can you send a test case that demonstrates this behaviour?
>
> | If I have..
> |
> | data AVL e = E
> |            | N !(AVL e) e !(AVL e)
> | .. etc
> |
> | then presumably this..
> |
> | case avl of N l e r -> N (f l) e r
> |
> | desugars to something like ..
> |
> | case avl of N l e r -> let l' = f l
> |                        in l' `seq` r `seq` N l' e r
> |
> | but IMO it should desugar to..
> |
> | case avl of N l e r -> let l' = f l
> |                        in l' `seq` N l' e r
>
> I agree. If it doesn't please let me know!
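
The meaning of the strict fields in the quoted AVL declaration can be
checked with a short program. This is a sketch only: it shows the
semantics of strictness flags, not the performance difference under
discussion, and `LazyAVL`, `mkNode` and `diverges` are hypothetical
helpers introduced here, not code from the thread.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Strict fields, as in the quoted declaration (balance data omitted).
data AVL e = E | N !(AVL e) e !(AVL e)

-- Hypothetical lazy twin, used only to spell out the desugaring by hand.
data LazyAVL e = LE | LN (LazyAVL e) e (LazyAVL e)

-- What building a strict node means: force both strict fields, then build.
mkNode :: LazyAVL e -> e -> LazyAVL e -> LazyAVL e
mkNode l e r = l `seq` r `seq` LN l e r

-- True if evaluating the argument to WHNF raises an exception.
diverges :: a -> IO Bool
diverges x = do
  r <- try (evaluate (x `seq` ())) :: IO (Either SomeException ())
  return (either (const True) (const False) r)

main :: IO ()
main = do
  -- The strict constructor forces its left field when the node is built.
  diverges (N undefined 'x' E)       >>= print  -- True
  -- The hand-written seq version behaves the same way.
  diverges (mkNode undefined 'x' LE) >>= print  -- True
  -- A plain lazy constructor does not look at its fields.
  diverges (LN undefined 'x' LE)     >>= print  -- False
```

A field obtained by pattern matching on `N` was already forced when that
node was built, which is why the second `seq` on `r` in the quoted
desugaring is redundant.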
>
Although I have not looked into this much, my guess is it is an issue in
the simplifier. Normally, when something is examined with a case
statement, the simplification context sets its status to 'NoneOf []',
which means we know it is in WHNF, but we don't have any more info about
it. I would think that the solution would be to add the same annotation
in the simplifier to variables bound by pattern matching on strict data
types? Just a theory.

I am not sure how to debug this in ghc without digging into its code.

        John

--
John Meacham - ?repetae.net?john?

From oleg at pobox.com Wed Mar 21 21:21:44 2007
From: oleg at pobox.com (oleg@pobox.com)
Date: Wed Mar 21 21:22:45 2007
Subject: type aliases and Id
Message-ID: <20070322012144.AB0AAAD35@Adric.metnet.fnmoc.navy.mil>

Lennart Augustsson wrote:
> Ganesh and I were discussing today what would happen if one adds Id
> as a primitive type constructor. How much did you have to change the
> type checker? Presumably if you need to unify 'm a' with 'a' you now
> have to set m=Id.

I wonder if this proposal is a good idea.

Let us consider the following near-Haskell98 code

> class C a
> instance C (m a)
> instance C Int

(usually there will be constraints on m. We shall see a useful example
of this in a moment). This code typechecks under slight and common
relaxation of the rules on the form of instance head. That example
will probably typecheck in Haskell'.
Under the proposal that Id t === t, typechecking of this code will
require overlapping instances, which is quite unlikely to make it into
Haskell' and is a quite significant and controversial extension.

Speaking of overlapping instances, let's generalize the example to

> class C a
> instance C (m a)
> instance C a

It does typecheck in current Haskell. Under the Id proposal, this
example will NOT typecheck, ever. This is because the two instances
become exact duplicates. Indeed, every type that matches "a" will
match "m a" (with m = Id) and vice versa.
The two instances match the same class of types (that is, all of the
types).

Let us come back to our simple example and make it practical.

> class C a where incr :: a -> a
>
> instance (C a, Functor m) => C (m a) where incr = fmap incr
> instance C Int where incr = succ
>
> test = incr (Just [[[1::Int]]])

The example increments integers deeply embedded in some functorial
data structures. The operations of this kind are requested from time
to time on Haskell-Cafe. This code compiles and works (no overlapping
or undecidable instances are required). Under the Id t === t
proposal, this example will diverge (perhaps, it will diverge even in
the compiler). The reason is that the base case cannot be reached; the
type Int can always be considered as Id Int and so the first instance
will apply again.

From lennart at augustsson.net Thu Mar 22 04:00:43 2007
From: lennart at augustsson.net (Lennart Augustsson)
Date: Thu Mar 22 04:00:21 2007
Subject: type aliases and Id
In-Reply-To: <20070322012144.AB0AAAD35@Adric.metnet.fnmoc.navy.mil>
References: <20070322012144.AB0AAAD35@Adric.metnet.fnmoc.navy.mil>
Message-ID: <8BA2CBD1-6442-4C76-A6EF-2EF41A2E3F18@augustsson.net>

A very good point. But that just makes a design that could include Id
even more intriguing. :)

  -- Lennart

On Mar 22, 2007, at 01:21 , oleg@pobox.com wrote:

>
> Lennart Augustsson wrote:
>> Ganesh and I were discussing today what would happen if one adds Id
>> as a primitive type constructor. How much did you have to change the
>> type checker? Presumably if you need to unify 'm a' with 'a' you now
>> have to set m=Id.
>
> I wonder if this proposal is a good idea.
>
> Let us consider the following near-Haskell98 code
>
>> class C a
>> instance C (m a)
>> instance C Int
>
> (usually there will be constraints on m. We shall see a useful
> example of this in a moment). This code typechecks under slight and
> common relaxation of the rules on the form of instance head.
> That example will probably typecheck in Haskell'.
> Under the proposal that Id t === t, typechecking of this code will
> require overlapping instances, which is quite unlikely to make it
> into Haskell' and is a quite significant and controversial extension.
>
> Speaking of overlapping instances, let's generalize the example to
>
>> class C a
>> instance C (m a)
>> instance C a
>
> It does typecheck in current Haskell. Under the Id proposal, this
> example will NOT typecheck, ever. This is because the two instances
> become exact duplicates. Indeed, every type that matches "a" will
> match "m a" (with m = Id) and vice versa. The two instances match the
> same class of types (that is, all of the types).
>
> Let us come back to our simple example and make it practical.
>
>> class C a where incr :: a -> a
>>
>> instance (C a, Functor m) => C (m a) where incr = fmap incr
>> instance C Int where incr = succ
>>
>> test = incr (Just [[[1::Int]]])
>
> The example increments integers deeply embedded in some functorial
> data structures. The operations of this kind are requested from time
> to time on Haskell-Cafe. This code compiles and works (no overlapping
> or undecidable instances are required). Under the Id t === t
> proposal, this example will diverge (perhaps, it will diverge even in
> the compiler). The reason is that the base case cannot be reached; the
> type Int can always be considered as Id Int and so the first instance
> will apply again.
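
The incr example quoted above can be run as a complete program. This
is a sketch assuming GHC: the FlexibleInstances pragma is an assumption
about what GHC needs for the instance head `C (m a)` (the "relaxation
of the rules on the form of instance head" mentioned in the quote), and
the last line, which uses the ordinary `Identity` newtype from
Data.Functor.Identity, is an addition for contrast and not part of the
original example.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.Functor.Identity (Identity (..))

class C a where
  incr :: a -> a

-- Recursive case: map incr over any functorial layer.
instance (C a, Functor m) => C (m a) where
  incr = fmap incr

-- Base case: Int is not of the form (m a), so the instances do not overlap.
instance C Int where
  incr = succ

test :: Maybe [[[Int]]]
test = incr (Just [[[1 :: Int]]])

main :: IO ()
main = do
  print test                         -- Just [[[2]]]
  -- A newtype is a distinct type constructor, so the base case is still
  -- reached here; the divergence argued above needs Id t to *equal* t.
  print (incr (Identity (3 :: Int))) -- Identity 4
```

With an `Id` that is definitionally equal to its argument, `Int` itself
would match `m a`, which is the situation the quoted message warns
about.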
From simonpj at microsoft.com Thu Mar 22 04:07:59 2007
From: simonpj at microsoft.com (Simon Peyton-Jones)
Date: Thu Mar 22 04:07:32 2007
Subject: strict bits of datatypes
In-Reply-To: <20070321205528.GN4626@momenergy.repetae.net>
References: <20070316155038.GA32344@matrix.chaos.earth.li> <20070316182811.GK4626@momenergy.repetae.net> <45FEA76E.4080109@iee.org> <20070321205528.GN4626@momenergy.repetae.net>
Message-ID:

| Although I have not looked into this much, my guess is it is an issue in
| the simplifier. Normally, when something is examined with a case
| statement, the simplification context sets its status to 'NoneOf []',
| which means we know it is in WHNF, but we don't have any more info about
| it. I would think that the solution would be to add the same annotation
| in the simplifier to variables bound by pattern matching on strict data
| types?

Indeed! GHC already does this. That's why I am surprised that adding
seq improves things.

Adrian has helpfully sent a test case, but I have not examined it yet S