DDC/EffectSystem

From HaskellWiki
< DDC
Revision as of 02:28, 19 March 2008 by Benl23 (talk | contribs)
Jump to navigation Jump to search

Effect typing vs State monads

Instead of state monads (like IO), Disciple uses default strict evaluation and effect typing to deal with computational effects. Effect typing helps us write less code because the types of effectful functions have the same shape as their pure counterparts.

For example, in Haskell the pure map function has type:

    map :: (a -> b) -> [a] -> [b]

but if we need to pass an effectful function defined in terms of a state monad, we must use the monadic version, mapM instead.

    mapM :: Monad m => (a -> m b) -> [a] -> m [b]

This puts us in a situation where we really need two copies of every higher-order function, one pure and one monadic - that's a lot of duplication! This is even more of a problem when a library (like Data.Map) exports a pure version but not a monadic one. In this case we either have to refactor our code to be less monadic, or spend time writing a function that should really be in the library.

With effect typing, the fact that function might cause an effect is orthogonal to the shape of its type. In Haskell, because putStr prints to the screen it has the IO constructor in its return type:

    putStr :: String -> IO ()

In Disciple, this fact is encoded as an effect on the function arrow instead:

    putStr :: String -(!e1)> ()
           :- !e1 = !Console

In this type, !e1 is an effect variable, much like a is a type variable. !Console is an effect constructor and it tells us what else happens when putStr is called, besides evaluating the return value.

Because this second version of putStr doesn't have the monad in its type, we can just use the plain version of map and not have to worry about mapM.

    
    main () 
     = do   map putStr ["one", "two", "three"]   -- print 'onetwothree'
            ...

Extended type information

But wait, that's evil isn't it? From the previous example, it might look as though we've lost the information that putStr has an effect and descended into the quagmire of untracked side effects and imperative programming. This would be true if map had only the simple type as it does in Haskell, because there'd be no way to track the fact that this application of map also causes a !Console effect like its argument does.

In Disciple, the complete type of map map is:

    map  :: forall a b %r0 %r1 !e0 $c0
         .  (a -(!e0 $c0)> b) -> List %r0 a -(!e1 $c1)> List %r1 b
         :- !e1        = !{ !Read %r0; !e0 }
         ,  $c1        = f : $c0

Which is a lot more information!

Ignoring the closure information starting with $ for now, we'll say briefly that the variables starting with % are region variables - so (List %r0 a) is a list in a region of memory called %r0 and has elements of type a.

If we restrict the type of map to just type, region and effect information then we have:

    map :: forall a b %r0 %r1 !e0
        .  (a -(!e0)> b) -> List %r0 a -(!e1)> List %r1 b
        :- !e1         = !{ !Read %r0; !e0 }

We see now that the effect sum !{!Read %r0; !e0} records the fact that the map function, when applied to both its arguments, has the effect of reading the input list as well as doing whatever the argument function does.

The Disciplined Disciple Compiler (DDC) uses this effect discipline to track which parts of the program cause computational effects and which are pure. This allows it to check the type safety of programs and perform the same sort of code transformation optimizations as done by GHC.

Behind the scenes

None of this would be useful if you actually had to write these extended types in regular programs. It's a good thing then that, for the most part, you can you can ignore the extended region, effect and closure information when writing code.

The definition of map in Disciple is just the same as in Haskell, including the type signature.

   map :: (a -> b) -> [a] -> [b]
   map f []        = []
   map f (x : xs)  = f x : map f xs

Because all the extra information is orthogonal to the main shape of the type, ie its argument and return data types, the compiler can fill this all in for you behind the scenes. The extra information is present in the interface files and core language, but you'll only have to worry about it if you import foreign functions from C land or start to mix laziness with destructive update in the same program.