Personal tools

Global keys

From HaskellWiki

(Difference between revisions)
Jump to: navigation, search
(categorise)
 
(2 intermediate revisions by 2 users not shown)
Line 9: Line 9:
 
As Brian Hulley put it, at the moment, there is a strange unnatural discrepancy between the fixed set of built-in privileged operations such as Data.Unique.newUnique which are "allowed" to make use of global state, and user defined operations which have to rely on a shaky hack in order to preserve natural abstraction barriers between components such as a user-defined Unique, Atom, and anything involving memoisation or device management etc.
 
As Brian Hulley put it, at the moment, there is a strange unnatural discrepancy between the fixed set of built-in privileged operations such as Data.Unique.newUnique which are "allowed" to make use of global state, and user defined operations which have to rely on a shaky hack in order to preserve natural abstraction barriers between components such as a user-defined Unique, Atom, and anything involving memoisation or device management etc.
   
The kind of applications we have in mind are:
+
The kind of applications we have in mind (please add more) are:
 
* A source of random numbers, or of unique numbers. This should be on a per-thread basis.
 
* A source of random numbers, or of unique numbers. This should be on a per-thread basis.
* The value of 'stdin' or 'stdout'. We don't want to mutate this, but we might want to set the value for sub-computations, including any spawned threads.
+
* The value of 'stdin' or 'stdout'. We don't want to mutate this (although note that the handle itself <em>contains</em> mutable state), but we might want to set the value for sub-computations, including any spawned threads.
   
 
== A straw man ==
 
== A straw man ==
   
Here's a straw-man proposal.
+
The basic idea is this:
* New top-level declaration to allocate a new <strong>key</strong>
+
* Each thread has access (via the IO or STM monad) to a "dictionary" that maps a (typed) key to a value of that type.
  +
* The dictionary is not mutable, but the values might themselves be mutable cells (think of 'stdin').
  +
  +
(I don't know how to put code inside a bullet, and carry on the bullet afterwards, so the formatting below is a mess.)
  +
  +
More concretely:
  +
* A new built-in data type, <code>Key a</code>, an instance of Eq, but not Ord. It's a "thing with identity" (TWI).
  +
* A new built-in data type of dictionaries, <code>Dict</code>, which maps a typed key to a typed value:
 
<haskell>
 
<haskell>
key rng :: Key StdGen
+
lookup :: Dict -> Key a -> Maybe a
  +
insert :: Dict -> Key a -> a -> Dict
  +
union :: Dict -> Dict -> Dict
  +
etc
 
</haskell>
 
</haskell>
A key is globally unique in a program. It supports equality but not ordering.
+
Yes, it'll need a massive interface, like Data.Map.
* You can read and write global state and local state:
+
* You can allocate a fresh (globally unique) key in the IO monad
 
<haskell>
 
<haskell>
readLS, readGS :: Key a -> IO a
+
newKey :: IO (Key a)
writeLS, writeGS :: Key a -> a -> IO a
+
</haskell>
</haskell> The distinction between the two is that there is one global state, shared between all threads, but each thread has its own local state. One way to think of it would be to imagine that each thread has its own finite mapping from keys to values.
+
* However, we add a new top-level declaration to allocate new top-level key:
* We probably want to provide efficient finite maps on keys.
+
<haskell>
* Keys are useful independent of mutability. For example, GHC has lots of code going
+
key rng :: Key StdGen
  +
</haskell>
  +
This is useful independent of mutability. For example, GHC has lots of code going
 
<haskell>
 
<haskell>
 
thenIdKey = mkPreludeIdKey 3
 
thenIdKey = mkPreludeIdKey 3
Line 33: Line 33:
 
</haskell>
 
</haskell>
 
etc, where we hand out unique identifiers by hand. Easy to screw up.
 
etc, where we hand out unique identifiers by hand. Easy to screw up.
  +
* Each thread has an implicit, thread-specific dictionary:
  +
<haskell>
  +
getDict :: IO Dict
  +
setDict :: Dict -> IO a -> IO a
  +
</haskell>
  +
(Think of myThreadId.) Notice that setDict does not mutate the dictionary;
  +
it just sets the implicit dictionary for a nested sub-computation.
  +
* A forked thread inherits its parent's dictionary.
  +
  +
== Initialisation ==
  +
  +
Initialisation is the big question here.
  +
A library may want to allocate a key, and an initialiser to be run the first time the key is accessed, without the client of the library needing to know about it at all. A second issue that we may sometimes want to implicitly (?) re-initialise (or clear) all or part of the dictionary when forking a thread.
   
Open questions:
+
[[Category:Idioms]]
* When you forkIO a thread, what (if anything) does the forked thread inherit?
 

Latest revision as of 10:02, 17 September 2006

Contents

[edit] 1 Introduction

This page is meant to contain discussion about global state and "things with identity". It's a companion to Adrian's http://www.haskell.org/hawiki/GlobalMutableState, but that's in the old Wiki, so I thought it better to start a new page.

See also Ian Stark's ACIO message http://www.haskell.org/pipermail/haskell-cafe/2004-November/007664.html.

[edit] 2 Examples

As Brian Hulley put it, at the moment, there is a strange unnatural discrepancy between the fixed set of built-in privileged operations such as Data.Unique.newUnique which are "allowed" to make use of global state, and user defined operations which have to rely on a shaky hack in order to preserve natural abstraction barriers between components such as a user-defined Unique, Atom, and anything involving memoisation or device management etc.

The kind of applications we have in mind (please add more) are:

  • A source of random numbers, or of unique numbers. This should be on a per-thread basis.
  • The value of 'stdin' or 'stdout'. We don't want to mutate this (although note that the handle itself contains mutable state), but we might want to set the value for sub-computations, including any spawned threads.

[edit] 3 A straw man

The basic idea is this:

  • Each thread has access (via the IO or STM monad) to a "dictionary" that maps a (typed) key to a value of that type.
  • The dictionary is not mutable, but the values might themselves be mutable cells (think of 'stdin').

(I don't know how to put code inside a bullet, and carry on the bullet afterwards, so the formatting below is a mess.)

More concretely:

  • A new built-in data type, Key a, an instance of Eq, but not Ord. It's a "thing with identity" (TWI).
  • A new built-in data type of dictionaries, Dict, which maps a typed key to a typed value:
  lookup :: Dict -> Key a -> Maybe a
  insert :: Dict -> Key a -> a -> Dict
  union  :: Dict -> Dict -> Dict
  etc

Yes, it'll need a massive interface, like Data.Map.

  • You can allocate a fresh (globally unique) key in the IO monad
  newKey :: IO (Key a)
  • However, we add a new top-level declaration to allocate new top-level key:
  key rng :: Key StdGen

This is useful independent of mutability. For example, GHC has lots of code going

  thenIdKey = mkPreludeIdKey 3
  returnIdName = mkPreludeIdKey 4

etc, where we hand out unique identifiers by hand. Easy to screw up.

  • Each thread has an implicit, thread-specific dictionary:
  getDict :: IO Dict
  setDict :: Dict  -> IO a -> IO a

(Think of myThreadId.) Notice that setDict does not mutate the dictionary; it just sets the implicit dictionary for a nested sub-computation.

  • A forked thread inherits its parent's dictionary.

[edit] 4 Initialisation

Initialisation is the big question here. A library may want to allocate a key, and an initialiser to be run the first time the key is accessed, without the client of the library needing to know about it at all. A second issue that we may sometimes want to implicitly (?) re-initialise (or clear) all or part of the dictionary when forking a thread.