# Simonpj/Talk:ListComp

### From HaskellWiki

## Revision as of 18:53, 20 June 2007

# 1 Talk page for "Comprehensive Comprehensions"

This is a discussion page for the paper Comprehensive Comprehensions.

If you are kind enough to read this paper, you may like to jot down any thoughts it triggers off, and see what others have written. This talk-page lets you do just that.

You can identify your entries by preceding them with four tildes. Doing so adds your name, and the date. Thus:

- Simonpj 08:42, 19 April 2007 (UTC) Note from Simon

If you say who you are in this way, we'll be able to acknowledge your help in a revised version of the paper.

## 1.1 MichaelAdams 14:51, 19 June 2007 (UTC)

In theory these operators (order-by and group-by) should generalize to monads, and once you do that, several other design options open up. The two operators could even be unified into a single operator.

Starting with sort-by, I think the monadic version is fairly obvious. Take something like the following code.

    do a <- ma
       ...
       b <- mb
       c <- mc
       sort by (b, c) using foo
       d <- md
       ...
       return (a, b, c, d)

It would de-sugar to:

    ((do a <- ma
         ...
         b <- mb
         c <- mc
         return ((b, c), (a, b, c))
     ) `foo` fst) >>= \result -> do
       let (_, (a, _, _)) = result
           (_, (_, b, _)) = result
           (_, (_, _, c)) = result
       d <- md
       ...
       return (a, b, c, d)

Where we have:

    foo :: forall a. (a -> t) -> m a -> m a
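As a concrete illustration, here is a minimal sketch of that de-sugaring written out by hand for the list monad. It assumes `foo` is simply `sortOn` (which adds an `Ord t` constraint the signature above does not mention), applies it in prefix form following the argument order of the stated type, and uses made-up inputs; `sortedTuples` is a hypothetical name.

```haskell
import Data.List (sortOn)

-- Hypothetical stand-in for "foo", specialised to the list monad:
-- reorder the results by the key that the extraction function selects.
foo :: Ord t => (a -> t) -> [a] -> [a]
foo = sortOn

-- Hand-desugared form of:
--   do a <- [1, 2]; b <- "yx"; c <- [True, False]
--      sort by (b, c) using foo
--      d <- ["d"]; return (a, b, c, d)
sortedTuples :: [(Int, Char, Bool, String)]
sortedTuples =
  foo fst
      (do a <- [1, 2]
          b <- "yx"
          c <- [True, False]
          return ((b, c), (a, b, c)))
    >>= \(_, (a, b, c)) ->
      do d <- ["d"]
         return (a, b, c, d)
```

The statements after the sort run once per sorted row, so the final list comes out ordered by `(b, c)`.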

#### 1.1.1 Generalizing Order-by to Group-by

In fact after a few tweaks it turns out that order-by and group-by could operate under the exact same de-sugaring.

Suppose we let the type of foo be:

    foo :: (Functor n) => forall a. (a -> t) -> m a -> m (n a)

Notice that I said "m (n a)" and not "m (m a)". The group-by de-sugaring that I'm going to show works for any Functor "n", and if we imagine "n" to be some type like the following then order-by is just a special case of group-by.

    type Ident a = a

(It wouldn't actually be valid Haskell to use a type synonym in that way, but it conveys the idea.)
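One way to express the same idea in valid Haskell is the `Identity` newtype from `Data.Functor.Identity`, which has a `Functor` instance. The sketch below is only an illustration for the list monad; `orderAsGroup` is a hypothetical name, and the `Ord t` constraint is an assumption added so that sorting is possible.

```haskell
import Data.Functor.Identity (Identity (..))
import Data.List (sortOn)

-- Order-by as the special case n = Identity: each row is wrapped on its own,
-- so the rows are reordered but nothing is actually grouped.
orderAsGroup :: Ord t => (a -> t) -> [a] -> [Identity a]
orderAsGroup key = map Identity . sortOn key
```

Unwrapping with `runIdentity` recovers the plain sorted list, which is why order-by can be read as a degenerate group-by.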

The de-sugaring to support this would take something like the following (inspired by arrow syntax).

    do a <- ma
       b <- mb
       c <- mc
       foo args $-< (a, b) -- group by (a, b) using (foo args)
       d <- md
       return (a, b, c, d)

It would produce:

    (ma >>= \a ->
     mb >>= \b ->
     mc >>= \c ->
     return ((a, b), (a, b, c))) `foo` args >>= \result ->
      let a = fmap (\(_, (a, _, _)) -> a) result
          b = fmap (\(_, (_, b, _)) -> b) result
          c = fmap (\(_, (_, _, c)) -> c) result
      in md >>= \d ->
         return (a, b, c, d)

(Note that after the "foo", the types of "a", "b" and "c" have changed. Their types before "foo" get wrapped by "n" after "foo".)
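To make the change of types concrete, here is a small hand-written sketch with both `m` and `n` chosen to be lists. `groupOn` is a hypothetical stand-in for `foo` (sort by the key, then collect equal keys), and the input data is made up; it is not taken from the paper.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Hypothetical "foo" for m = [] and n = []: sort by the key, then collect
-- rows with equal keys into sublists.
groupOn :: Ord t => (a -> t) -> [a] -> [[a]]
groupOn key = groupBy ((==) `on` key) . sortOn key

-- Hand-desugared "group by a using groupOn" over (a, b) rows.
-- After the grouping step, a :: [Int] and b :: String (that is, [Char]).
grouped :: [([Int], String)]
grouped =
  groupOn fst
      (do a <- [2, 1]
          b <- "xy"
          return (a, (a, b)))
    >>= \result ->
      let a = fmap (\(_, (a', _)) -> a') result
          b = fmap (\(_, (_, b')) -> b') result
      in return (a, b)
```

Before the grouping step `a` is an `Int` and `b` a `Char`; afterwards each is a list, which is the wrapping by `n` described in the note above.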

#### 1.1.2 Even more generalization

Once we have done this, another possibility opens up. Notice that the first of each pair in "result" was never used except from within "foo". It doesn't do any work in the result. Now suppose we change the type of foo to:

foo :: (Functor n) => forall a. m (t, a) -> m (n (u, a))

Now "foo" can not only read existing bindings from "t", but it can also create new bindings with "u". (This hard wires the extraction function to always be "fst", but it simplifies the presentation a bit. The other form with a general extraction could still be used.)

The previous syntax could then be extended to something like:

    do a <- ma
       b <- mb
       c <- mc
       (d, e, f) <-$ foo args $-< (a, b)
       g <- mg
       return (a, b, c, d, e, f, g)

This would then de-sugar to:

    (ma >>= \a ->
     mb >>= \b ->
     mc >>= \c ->
     return ((a, b), (a, b, c))) `foo` args >>= \result ->
      let d = fmap (\((d, _, _), (_, _, _)) -> d) result
          e = fmap (\((_, e, _), (_, _, _)) -> e) result
          f = fmap (\((_, _, f), (_, _, _)) -> f) result
          a = fmap (\((_, _, _), (a, _, _)) -> a) result
          b = fmap (\((_, _, _), (_, b, _)) -> b) result
          c = fmap (\((_, _, _), (_, _, c)) -> c) result
      in mg >>= \g ->
         return (a, b, c, d, e, f, g)
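A small sketch may help show what "creating new bindings" could look like in practice. It again assumes `m` and `n` are both lists, drops the `args` parameter for brevity, and invents a `foo` (here called `withGroupSize`, a hypothetical name) whose new binding `u` is the size of the group each row ends up in.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Hypothetical "foo" with m = [] and n = []: group rows by their key t and
-- attach a new binding u, here the size of the group the row landed in.
withGroupSize :: Ord t => [(t, a)] -> [[(Int, a)]]
withGroupSize rows =
  [ [ (length grp, x) | (_, x) <- grp ]
  | grp <- groupBy ((==) `on` fst) (sortOn fst rows) ]

-- Hand-desugared: the continuation sees both the new binding (size) and
-- the old one (name), each wrapped in n.
sizes :: [([Int], [String])]
sizes =
  withGroupSize [(length w, w) | w <- ["ab", "c", "de", "fg"]] >>= \result ->
    let size = fmap fst result
        name = fmap snd result
    in return (size, name)
```

Here the words are keyed by length, so `size` is a binding that did not exist before the grouping step, while `name` is an old binding that has been wrapped.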

#### 1.1.3 Conclusion

The above might have gone too far down the generalization road and put too many bells and whistles on the thing, so it may be worth trimming it down. I also haven't given any thought to what applications would need this. I just wanted to consider how far these operations could be generalized.

Just a few little things:

- Page 2, "sortBy is part of the Haskell Prelude" - it's actually in the List module. (I just spotted you've got the same thing in SYB with Class, 7.1)
- The Down trick is very neat, perhaps that should be a part of the standard libraries.
- Page 5, MSFT is in 'quotes', but should be in "quotes".
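For readers who have not seen the Down trick mentioned in the list above, here is a minimal sketch. It assumes a `base` library that exports `Down` from `Data.Ord`, as current versions do; `descending` is a hypothetical helper name.

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

-- Descending sort: wrapping each key in Down reverses its Ord instance,
-- so an ordinary ascending sort yields descending order.
descending :: Ord a => [a] -> [a]
descending = sortBy (comparing Down)
```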

Your syntax requires four new keywords, at least one of which is already a standard function (group). Plus with the knowledge of the keywords the parse tree is entirely different. From your paper:

order by x >= y using takeWhile

At first reading I parsed >= as the root node, since in Haskell that would be the way it works. Your 'then' syntax in 6.1 seems preferable as it doesn't take any additional keywords.

--Neil Mitchell 16:04, 19 June 2007 (UTC)

The paper also mentions a function "the." I wasn't able to find this function through hoogle or ":t the" in ghci. Perhaps you could add a one-line description, the way "nub" is described.

- See section 3.4; it's a custom function, complete with implementation. --Neil Mitchell 23:28, 19 June 2007 (UTC)

- Thanks Neil, I missed it.
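For readers without the paper to hand, the following is a rough sketch of what a "the"-style helper does: it collapses a list of identical elements to that single element. This is not the paper's definition; `theMaybe` is a hypothetical total variant that returns `Nothing` where a partial version would fail.

```haskell
-- Sketch of a "the"-style helper: return the common element of a non-empty
-- list whose elements are all equal; this version returns Nothing otherwise.
theMaybe :: Eq a => [a] -> Maybe a
theMaybe (x : xs) | all (== x) xs = Just x
theMaybe _ = Nothing
```

A helper like this is useful after a group-by, where a grouped column is known to hold one repeated value.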

As a non-expert, the sense I get is that whenever I see "by" in a list comprehension, I should expect functions or expressions that operate on lists and not on individual elements of lists.

When I first started reading the paper, I was going to recommend another extension (for time series) to allow things like moving averages...imagine my surprise when I find exactly the example I was going to suggest :) Falcon 17:10, 19 June 2007 (UTC)

Maybe I didn't get some part of the paper, but is it really necessary to have 'order by' or 'group by' as syntax extensions? Isn't it possible to allow developers to use any function as long as it matches the types of 'order' or 'group'? Falcon 18:53, 20 June 2007 (UTC)