Personal tools

Non-empty list

From HaskellWiki

(Difference between revisions)
Jump to: navigation, search
(advocacy)
(Packages)
(8 intermediate revisions by 7 users not shown)
Line 1: Line 1:
It is possible to assert statically that lists are non-empty using proper [http://www.haskell.org/pipermail/haskell-cafe/2006-November/019593.html type design].
+
Errors such as taking <hask>head</hask> or <hask>tail</hask> of the empty list in Haskell are equivalent to the dereferencing of the zero pointer in C/C++ or <code>NullPointerException</code> in Java. These errors occur because the true domain of the function is smaller than the function's type suggests. For example, the type of <hask>head</hask> says that the function applies to any list. In reality, it can only be meaningfully applied to non-empty lists. One can eliminate such errors by giving functions <hask>head</hask> and <hask>tail</hask> more precise type, such as <hask>FullList a</hask>. Languages like [http://en.wikipedia.org/wiki/Cyclone_programming_language Cyclone] and [http://en.wikipedia.org/wiki/C%CF%89 Cw] do exactly
  +
that.
   
Maybe, you are also interested in [http://www.haskell.org/pipermail/haskell-cafe/2006-November/019644.html advocacy] of this style.
+
It must be emphasized that we can eliminate head-of-empty-list errors
  +
'''now''', without any modification to the Haskell type system, without
  +
developing any new tool. In fact, it is possible in Haskell98! The
  +
same technique applies to OCaml and even Java and C++. The ''only''
  +
required advancement is in our thinking and programming style.
  +
  +
Maybe, you are also interested in
  +
[http://www.haskell.org/pipermail/haskell-cafe/2006-November/019644.html advocacy] of this style.
  +
  +
  +
== Safe list functions ==
  +
  +
Here's the 0th approximation of the advocated approach:
  +
  +
<haskell>
  +
{-# Haskell98! #-}
  +
-- Safe list functions
  +
  +
module NList (FullList,
  +
fromFL,
  +
indeedFL,
  +
decon,
  +
head,
  +
tail,
  +
Listable (..)
  +
) where
  +
  +
import Prelude hiding (head, tail)
  +
  +
newtype FullList a = FullList [a] -- data constructor is not exported!
  +
  +
fromFL (FullList x) = x -- Injection into general lists
  +
  +
-- The following is an analogue of `maybe'
  +
indeedFL :: [a] -> w -> (FullList a -> w) -> w
  +
indeedFL x on_empty on_full
  +
| null x = on_empty
  +
| otherwise = on_full $ FullList x
  +
  +
-- A possible alternative, with an extra Maybe tagging
  +
-- indeedFL :: [a] -> Maybe (FullList a)
  +
  +
-- A more direct analogue of `maybe', for lists
  +
decon :: [a] -> w -> (a -> [a] -> w) -> w
  +
decon [] on_empty on_full = on_empty
  +
decon (h:t) on_empty on_full = on_full h t
  +
  +
  +
-- The following are _total_ functions
  +
-- They are guaranteed to be safe, and so we could have used
  +
-- unsafeHead# and unsafeTail# if GHC provides though...
  +
  +
head :: FullList a -> a
  +
head (FullList (x:_)) = x
  +
  +
tail :: FullList a -> [a]
  +
tail (FullList (_:x)) = x
  +
  +
-- Mapping over a non-empty list gives a non-empty list
  +
instance Functor FullList where
  +
fmap f (FullList x) = FullList $ map f x
  +
  +
  +
-- Adding something to a general list surely gives a non-empty list
  +
infixr 5 !:
  +
  +
class Listable l where
  +
(!:) :: a -> l a -> FullList a
  +
  +
instance Listable [] where
  +
(!:) h t = FullList (h:t)
  +
  +
instance Listable FullList where
  +
(!:) h (FullList t) = FullList (h:t)
  +
</haskell>
  +
  +
  +
Now we can write
  +
<haskell>
  +
import NList
  +
import Prelude hiding (head, tail)
  +
safe_reverse l = loop l []
  +
where
  +
loop l accum = indeedFL l accum $
  +
(\l -> loop (tail l) (head l : accum))
  +
  +
test1 = safe_reverse [1,2,3]
  +
</haskell>
  +
  +
As we can see, the null test is algorithmic. After we've done it, head
  +
and tail no longer need to check for null list. Those head and tail
  +
functions are total. Thus we achieve both safety and performance.
  +
  +
We can also write
  +
<haskell>
  +
-- Again, we are statically assured of no head [] error!
  +
test2 = head $ 1 !: 2 !: 3 !: []
  +
</haskell>
  +
  +
I should point to
  +
[http://pobox.com/~oleg/ftp/Computation/lightweight-dependent-typing.html Lightweight dependent typing] for justification and formalization, as
  +
well as for for further, more complex examples.
  +
We can also use the approach to
  +
ensure various control properties, e.g., the yield property: a thread may
  +
not invoke `yield' while holding a lock. We can assure this property
  +
both for recursive and non-recursive locks.
  +
  +
If there is a surprise in this, it is in the triviality of
  +
approach. One can't help but wonder why don't we program in this
  +
style.
  +
  +
== Integrating with the existing list-processing functions ==
  +
  +
Jan-Willem Maessen wrote:
  +
<blockquote>
  +
In addition, we have this rather nice assembly of functions which
  +
work on ordinary lists. Sadly, rewriting them all to also work on
  +
NonEmptyList or MySpecialInvariantList is a nontrivial task.
  +
</blockquote>
  +
  +
That's an excellent question. Indeed, let us assume we have a function
  +
<haskell>
  +
foo:: [a] -> [a]
  +
</haskell>
  +
(whose code, if available, we'd rather not change) and we want to
  +
write something like
  +
<haskell>
  +
\l -> [head l, head (foo l)]
  +
</haskell>
  +
To use the safe <hask>head</hask> from NList.hs , we should write
  +
<haskell>
  +
\l -> indeedFL l onempty (\l -> [head l, head (foo l)])
  +
</haskell>
  +
But that doesn't type: first of all, <hask>foo</hask> applies to
  +
<hask>[a]</hask> rather than
  +
<hask>FullList a</hask>, and second, the result of
  +
<hask>foo</hask> is not <hask>FullList a</hask>, required
  +
by our <hask>head</hask>. The first problem is easy to solve: we can always
  +
inject <hask>FullList a</hask> into the general list:
  +
<hask>fromFL</hask>. We insist on writing
  +
the latter function explicitly, which keeps the typesystem simple,
  +
free of subtyping and implicit coercions. One may regard
  +
<hask>fromFL</hask> as an
  +
analogue of <hask>fromIntegral</hask> -- which, too, we have to
  +
write explicitly, in any code with more than one sort of integral
  +
numbers (e.g., Int and Integer, or Int and CInt).
  +
  +
If we are not sure if our function foo maps non-empty lists
  +
to non-empty lists, we really should handle the empty list case:
  +
<haskell>
  +
\l -> indeedFL l onempty $
  +
\l -> [head l, indeedFL (foo $ fromFL l) onempty' head]
  +
</haskell>
  +
If we have a hunch that foo maps non-empty lists to non-empty lists,
  +
but we are too busy to verify it, we can write
  +
<haskell>
  +
\l -> indeedFL l onempty $
  +
\l -> [head l, indeedFL (foo $ fromFL l)
  +
(error msg)
  +
head]
  +
where msg = "I'm quite sure foo maps non-empty lists to " ++
  +
"non-empty lists. I'll be darned if it doesn't."
  +
</haskell>
  +
That would get the code running. Possibly at some future date (during
  +
the code review?) I'll be called to justify my hunch, to whatever
  +
degree of formality (informal argument, formal proof) required by the
  +
policies in effect. If I fail at this justification, I'd better think
  +
what to do if the result of foo is really the empty list. If I
  +
succeed, I'd be given permission to update the module NList with the
  +
following definition
  +
<haskell>
  +
nfoo (FullList x) = FullList $ foo x
  +
</haskell>
  +
after which I could write
  +
<haskell>
  +
\l -> indeedFL l onempty (\l -> [head l, head (nfoo l)])
  +
</haskell>
  +
with no extra empty list checks.
  +
  +
Excerpted from the discussion on Haskell-Cafe, November 2006.
  +
  +
  +
== Reliable and simple approach ==
  +
  +
The above approach requires a trusted kernel of functions
  +
that assert the non-emptiness of lists.
  +
However, when implementing the kernel, and even more when you extend or alter it,
  +
you may make mistakes.
  +
The best approach to avoid even this,
  +
is to make the non-emptiness explicit in the type from the beginning:
  +
<haskell>
  +
type NonEmptyList a = (a, [a])
  +
</haskell>
  +
The first member of the pair represents the first element in the non-empty list.
  +
Now the compiler can test non-emptiness for you and you cannot cheat anymore.
  +
  +
This approach is extended in the {{HackagePackage|id=non-empty}} package
  +
such that you can add more leading elements.
  +
That is, you can define lists with at least two or three or more elements.
  +
You can even define lists that allow for a given set of admissible list lengths.
  +
A unified interface to all these variants is provided by standard type classes
  +
like Functor, Foldable, Traversable and some custom classes.
  +
All that is achieved with Haskell 98!
  +
  +
  +
== Packages ==
  +
  +
These packages implement non-empty lists:
  +
* {{HackagePackage|id=NonEmpty}}
  +
* {{HackagePackage|id=NonEmptyList}}
  +
* {{HackagePackage|id=Cardinality}}
  +
* {{HackagePackage|id=non-empty}}
  +
* {{HackagePackage|id=semigroups}}
   
 
[[Category:Idioms]]
 
[[Category:Idioms]]

Revision as of 19:42, 1 February 2014

Errors such as taking
head
or
tail
of the empty list in Haskell are equivalent to the dereferencing of the zero pointer in C/C++ or NullPointerException in Java. These errors occur because the true domain of the function is smaller than the function's type suggests. For example, the type of
head
says that the function applies to any list. In reality, it can only be meaningfully applied to non-empty lists. One can eliminate such errors by giving functions
head
and
tail
more precise type, such as
FullList a
. Languages like Cyclone and Cw do exactly

that.

It must be emphasized that we can eliminate head-of-empty-list errors now, without any modification to the Haskell type system, without developing any new tool. In fact, it is possible in Haskell98! The same technique applies to OCaml and even Java and C++. The only required advancement is in our thinking and programming style.

Maybe, you are also interested in advocacy of this style.


Contents

1 Safe list functions

Here's the 0th approximation of the advocated approach:

{-# Haskell98! #-}
-- Safe list functions
 
module NList (FullList,
              fromFL,
              indeedFL,
              decon,
              head,
              tail,
              Listable (..)
              ) where
 
import Prelude hiding (head, tail)
 
newtype FullList a = FullList [a]  -- data constructor is not exported!
 
fromFL (FullList x) = x                 -- Injection into general lists
 
-- The following is an analogue of `maybe'
indeedFL :: [a] -> w -> (FullList a -> w) -> w
indeedFL x on_empty on_full 
    | null x = on_empty
    | otherwise = on_full $ FullList x
 
-- A possible alternative, with an extra Maybe tagging
-- indeedFL :: [a] -> Maybe (FullList a)
 
-- A more direct analogue of `maybe', for lists
decon :: [a] -> w -> (a -> [a] -> w) -> w
decon []    on_empty on_full = on_empty
decon (h:t) on_empty on_full = on_full h t
 
 
-- The following are _total_ functions
-- They are guaranteed to be safe, and so we could have used
-- unsafeHead# and unsafeTail# if GHC provides though...
 
head :: FullList a -> a
head (FullList (x:_)) = x
 
tail :: FullList a -> [a]
tail (FullList (_:x)) = x
 
-- Mapping over a non-empty list gives a non-empty list
instance Functor FullList where
    fmap f (FullList x) = FullList $ map f x
 
 
-- Adding something to a general list surely gives a non-empty list
infixr 5 !:
 
class Listable l where
    (!:) :: a -> l a -> FullList a
 
instance Listable [] where
    (!:) h t = FullList (h:t)
 
instance Listable FullList where
    (!:) h (FullList t) = FullList (h:t)


Now we can write

import NList
import Prelude hiding (head, tail)
safe_reverse l = loop l [] 
    where
        loop l accum = indeedFL l accum $
                       (\l -> loop (tail l) (head l : accum))
 
test1 = safe_reverse [1,2,3]

As we can see, the null test is algorithmic. After we've done it, head and tail no longer need to check for null list. Those head and tail functions are total. Thus we achieve both safety and performance.

We can also write

-- Again, we are statically assured of no head [] error!
test2 = head $ 1 !: 2 !: 3 !: []

I should point to Lightweight dependent typing for justification and formalization, as well as for for further, more complex examples. We can also use the approach to ensure various control properties, e.g., the yield property: a thread may not invoke `yield' while holding a lock. We can assure this property both for recursive and non-recursive locks.

If there is a surprise in this, it is in the triviality of approach. One can't help but wonder why don't we program in this style.

2 Integrating with the existing list-processing functions

Jan-Willem Maessen wrote:

In addition, we have this rather nice assembly of functions which work on ordinary lists. Sadly, rewriting them all to also work on NonEmptyList or MySpecialInvariantList is a nontrivial task.

That's an excellent question. Indeed, let us assume we have a function

	foo:: [a] -> [a]

(whose code, if available, we'd rather not change) and we want to write something like

	\l -> [head l, head (foo l)]
To use the safe
head
from NList.hs , we should write
	\l -> indeedFL l onempty (\l -> [head l, head (foo l)])
But that doesn't type: first of all,
foo
applies to
[a]
rather than
FullList a
, and second, the result of
foo
is not
FullList a
, required by our
head
. The first problem is easy to solve: we can always inject
FullList a
into the general list:
fromFL
. We insist on writing

the latter function explicitly, which keeps the typesystem simple, free of subtyping and implicit coercions. One may regard

fromFL
as an analogue of
fromIntegral
-- which, too, we have to

write explicitly, in any code with more than one sort of integral numbers (e.g., Int and Integer, or Int and CInt).

If we are not sure if our function foo maps non-empty lists to non-empty lists, we really should handle the empty list case:

	\l -> indeedFL l onempty $
	       \l -> [head l, indeedFL (foo $ fromFL l) onempty' head]

If we have a hunch that foo maps non-empty lists to non-empty lists, but we are too busy to verify it, we can write

	\l -> indeedFL l onempty $
	       \l -> [head l, indeedFL (foo $ fromFL l) 
				(error msg)
			         head]
	  where msg = "I'm quite sure foo maps non-empty lists to " ++
	              "non-empty lists. I'll be darned if it doesn't."

That would get the code running. Possibly at some future date (during the code review?) I'll be called to justify my hunch, to whatever degree of formality (informal argument, formal proof) required by the policies in effect. If I fail at this justification, I'd better think what to do if the result of foo is really the empty list. If I succeed, I'd be given permission to update the module NList with the following definition

	nfoo (FullList x) = FullList $ foo x

after which I could write

	\l -> indeedFL l onempty (\l -> [head l, head (nfoo l)])

with no extra empty list checks.

Excerpted from the discussion on Haskell-Cafe, November 2006.


3 Reliable and simple approach

The above approach requires a trusted kernel of functions that assert the non-emptiness of lists. However, when implementing the kernel, and even more when you extend or alter it, you may make mistakes. The best approach to avoid even this, is to make the non-emptiness explicit in the type from the beginning:

type NonEmptyList a = (a, [a])

The first member of the pair represents the first element in the non-empty list. Now the compiler can test non-emptiness for you and you cannot cheat anymore.

This approach is extended in the non-empty package such that you can add more leading elements. That is, you can define lists with at least two or three or more elements. You can even define lists that allow for a given set of admissible list lengths. A unified interface to all these variants is provided by standard type classes like Functor, Foldable, Traversable and some custom classes. All that is achieved with Haskell 98!


4 Packages

These packages implement non-empty lists: