[Haskell-beginners] Re: folds again -- myCycle

Tue Mar 17 07:28:35 EDT 2009

Am Dienstag, 17. März 2009 02:24 schrieb 7stud:
> Daniel Fischer <daniel.is.fischer <at> web.de> writes:
> > Let us rewrite the definition a little (looking only at the case for
> > nonempty lists):
> >
> > Variant 1:
> > myCycle xs = foldr leftAdd [] [1 .. ]
> >    where
> >       leftAdd _ acc = xs ++ acc
> >
> > foldr leftAdd [] [1 .. ]
> >    ~> leftAdd 1 (foldr leftAdd [] [2 .. ])
> >    ~> xs ++ (foldr leftAdd [] [2 .. ])
> >    ~> xs ++ (leftAdd 2 (foldr leftAdd [] [3 .. ]))
> >    ~> xs ++ (xs ++ (foldr leftAdd [] [3 .. ]))
> >    ~> xs ++ (xs ++ (xs ++ (xs ++ ...)))
> >
> > Variant 2:
> > myCycle xs = foldr rightAdd [] [1 .. ]
> >    where
> >       rightAdd _ acc = acc ++ xs
> >
> > foldr rightAdd [] [1 .. ]
> >    ~> rightAdd 1 (foldr rightAdd [] [2 .. ])
> >    ~> (foldr rightAdd [] [2 .. ]) ++ xs
> >    ~> (rightAdd 2 (foldr rightAdd [] [3 .. ])) ++ xs
> >    ~> ((foldr rightAdd [] [3 .. ]) ++ xs) ++ xs
> >    ~> (((... ++ xs) ++ xs) ++ xs
>
> Thanks for the explanation.  After reading your post, I practiced writing
> a few foldr's out by hand.  It really helped me to understand what foldr
> was doing.  The part that initially confused me was the step in the third
>
> line:
> > foldr rightAdd [] [1 .. ]
> >    ~> rightAdd 1 (foldr rightAdd [] [2 .. ])
> >    ~> (foldr rightAdd [] [2 .. ]) ++ xs
>

Sorry. I was too lazy to write out the intermediate step. However, since that 
resulted in you doing it manually and so getting more familiar with foldr, it 
had a good effect :)

> In case any other beginner reads this thread and is confused by
> that, here's what's going on.  In RWH foldr is defined on p. 94
> like this:
>
> foldr step zero (x:xs) = step x (foldr step zero xs)
>
> Daniel Fischer's example is calling foldr like this:
>
> foldr rightAdd [] [1 .. ]
>
> matching the foldr definition to that foldr function call:
>
> foldr step zero (x:xs)
> foldr rightAdd [] [1 .. ]
>
> you can see that step is the function rightAdd, zero's value is [],
> x = 1, and xs = [2..].  Therefore, from the definition of foldr:
>
> foldr rightAdd [] [1..] = rightAdd 1 (foldr rightAdd [] [2..])
>
> The right hand side of that equation is a function call to
> rightAdd.  It specifies two arguments: the first argument
> is 1, and the second argument is the expression in the
>
> parentheses.  Now, look at the definition of rightAdd:
> > rightAdd _ acc = acc ++ xs
>
> rightAdd is a function that ignores its first argument,
> then concatenates xs to the right side of its second
> argument.  Applying that to this function call:
>
> rightAdd 1 (foldr rightAdd [] [2..])
>
> rightAdd ignores the first argument, the 1, and
> concatenates xs to the right side of its second
> argument, which in this case is the expression
> in parentheses, so you get:
>
> (foldr rightAdd [] [2..]) ++ xs
>
> Therefore:
>
> rightAdd 1 (foldr rightAdd [] [2..]) =
> (foldr rightAdd [] [2..]) ++ xs
>
> Then it's just a matter of expanding the expression
> in the parentheses a few more times until you understand
> what the pattern is.
>
> > Let us rewrite it a little more.
> > Obviously, it doesn't matter which list is passed into
> > foldr leftAdd []
> > , as long as it's infinite. So instead of passing [1 .. ], let us pass a
> > simpler infinite list, say
> > ones = 1:ones
> > (or ones = repeat 1).
> > Then the evaluation becomes
> >
> > foldr leftAdd [] ones
> >    ~> foldr leftAdd [] (1:ones)
> >    ~> leftAdd 1 (foldr leftAdd [] ones)
> >    ~> xs ++ (foldr leftAdd [] ones)
> >
> > foldr rightAdd [] ones
> >    ~> foldr rightAdd [] (1:ones)
> >    ~> rightAdd 1 (foldr rightAdd [] ones)
> >    ~> (foldr rightAdd [] ones) ++ xs
> >
> > So variant 1 amounts to
> > myCycle xs = let ys = xs ++ ys in ys
> > and variant 2 to
> > myCycle2 xs = let ys = ys ++ xs in ys
> >
> > Now it should be easy to see why the first works, but not the second.
>
> No.  I could tell from the example at the beginning of your post
> that there was an infinite expression on the front of the result list, but
> I can't see that in those equations. What am I failing to recognize there?

Ah, well. I again underestimated how much experience one needs to see it 
immediately, sorry once more.

The right hand side of myCycle2's definition (if we specialise xs to, say, 
[1,2,3]) is basically the same as a definition

ys = ys ++ [1,2,3]

at top level.
Now to find out what ys is, we look at the right hand side of its definition, 
ys ++ [1,2,3]
, good, so we have a concatenation of someList with [1,2,3], the first part of 
ys is then someList. To determine ys, we must therefore first determine 
someList. So, what is someList? someList = ys, so to determine the first part 
of ys, we must find the whole of ys first -- oops, infinite recursion.

>
> > And from the rewriting, we can read off another representation of
> > cycle  a  fold.
> > We have, for all lists ks, ms:
> > ks ++ ms = foldr (:) ms ks
> >
> > So we can write variant 1 as the snappy
> >
> > myCycle xs = let ys = foldr (:) ys xs in ys
>
> I can't understand that one.  I'll have to write it out:
>
> definition of foldr so I can refer to it:
> foldr step zero (x:xs) = step x (foldr step zero xs)
>
> myCycle2 [1, 2] =>
> ys = foldr (:) ys [1, 2]
> ---------------
> Here's the left hand side of the foldr definition:
> foldr step zero (x:xs)  =>match these parameters to the args in the call:
> foldr   (:)    ys  [1, 2] =
>
> Now writing out what's on the right hand side of the equals sign in the
> foldr definition using the args from the line above:
> = (:) 1 (foldr (:) ys 2:[])
> = 1:(foldr (:) ys 2:[])
> Expand the term in the parentheses:
> = 1:( (:) 2 (foldr (:) ys []) )
> = 1:( 2:(foldr (:) ys []) )
> = 1:2:(foldr (:) ys [])
> = 1:2:(ys)
> = 1:2:ys
>
> But ys is defined to be:
>
> foldr (:) ys [1, 2]

Actually, the implementation shouldn't need to go back to that definition at 
this stage, it should now have reduced the thunk to
ys = 1:2:ys
, hopefully even by constructing a true cyclic list, making the ys on the 
right hand side a pointer to the front of the list.
>
> So substituting into the last line:
>
> = 1:2:(foldr (:) ys [1, 2])
> = 1:2:( (:) 1 (foldr (:) ys 2:[])
> = 1:2:(1:(foldr (:) ys 2:[])
> = 1:2:1:( (:) 2 (foldr (:) ys [])
> = 1:2:1:( 2:(foldr (:) ys [])
> = 1:2:1:2:(foldr (:) ys [])
> = 1:2:1:2:(ys)
> = 1:2:1:2:ys
>
> But ys is defined to be:
>
> foldr (:) ys [1, 2]
>
> Therefore, that process goes on and on ad infinitum.  Pretty slick.
>
> I guess haskell thunks the whole infinite expression, therefore
> the let expression:
>
> let ys = ....
> in ys
>
> doesn't cause an infinite loop in the ys = ... line, which means
> the second line executes, which means ys is returned as the
> result of the function call myCycle2 [1,2]?
>
>

Yes, it's thunked. It works because we can now determine the start of ys 
without knowing anything about ys (provided that xs is nonempty). The result 
of myCycle2 [1,2] is the thunk that says how to evaluate it. We give a name 
to the result to be able to reference it in the calculation.