Libraries and hierarchies

Simon Peyton-Jones simonpj@microsoft.com
Tue, 5 Aug 2003 14:18:46 +0100


[Nothing new in this message: just a summary, with some useful
terminology.]

| Interestingly, this is another way that source code can unambiguously
| refer to libraries: by package name, site, and module name (with the
| former two properties being specified out-of-band in some additional
| matter that comes with the source code - perhaps a package
| specification or similar).  This is rather less ugly than using GUIDs.
| We can also provide GUIDs, of course, because doing so is cheap.

Suppose someone makes a package called glut-6.0 with modules Graphics,
Graphics.GLUT, and so on.  Suppose Graphics.GLUT includes a function
foo.  Then:

	The "original name" of function foo is
		glut-6.0:Graphics.GLUT.foo

	That is <package-name>:<relative-path-within-package>

Grafting the package glut-6.0 into the module hierarchy does not change
the original name of anything in the package.  This is important,
because the binaries generated for the package use original names for
cross-module references, and we don't want those to change when we
graft.
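
To make the terminology concrete, here is a minimal Haskell sketch of an
original name and how it is written down.  The type and function names
(PackageId, OriginalName, renderOriginal) are invented for illustration
and are not taken from any compiler; the only thing that comes from the
text above is the <package-name>:<relative-path-within-package> layout.

```haskell
import Data.List (intercalate)

-- Hypothetical types for illustration only.
newtype PackageId = PackageId String
  deriving (Eq, Show)

data OriginalName = OriginalName
  { onPackage :: PackageId  -- e.g. "glut-6.0"
  , onModule  :: [String]   -- module path relative to the package root
  , onEntity  :: String     -- e.g. "foo"
  } deriving (Eq, Show)

-- Render as <package-name>:<relative-path-within-package>.
renderOriginal :: OriginalName -> String
renderOriginal (OriginalName (PackageId p) ms e) =
  p ++ ":" ++ intercalate "." (ms ++ [e])

-- The function foo from the example package.
glutFoo :: OriginalName
glutFoo = OriginalName (PackageId "glut-6.0") ["Graphics", "GLUT"] "foo"
```

Note that nothing in OriginalName records where the package is grafted,
which is exactly why grafting cannot change an original name.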

All this does is replace the requirement that *module names* be globally
unique with the requirement that *package names* be unique.   This
problem is much easier because

	a) There are fewer packages, so some central help-yourself global
	registry is feasible, as Simon M suggests.

	b) Package names can be longish and clunky (e.g. including a
	version number), because we provide a way to avoid mentioning
	them in source code.

I personally think that using GUIDs for package names would be a
mistake.  They are essentially opaque to people, so there needs to be
some separate infrastructure to describe what a GUID names, and the
extra layer of indirection seems to buy little.  It's not hard to have
unique package names (I claim) and that'll do the job nicely.  Indeed a
package name, as suggested here, then *is* a globally unique ID, or
GUID, only with a comprehensible name.

The "graft package into tree" mechanism can be seen as a method to
implement (b).  Installing a package means you can name modules in that
package using A.B.C module notation, without explicitly mentioning the
package itself.  As Simon's message above said, source code therefore
only makes sense given some set of graftings (= package -> module prefix
mapping), and one might want to make that part of the package
(source-code) description.
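
A set of graftings can be pictured as a small table that the compiler
consults when it sees an A.B.C module name.  The sketch below is one
possible reading, not a specification: the Package and Graft types, the
resolve function, and the choice to graft glut-6.0 at the root are all
assumptions made for the example.  The package coll-1.0 with a module
Foo, grafted at Collections, is borrowed from the discussion further
down.

```haskell
import Data.List (stripPrefix)

type ModulePath = [String]

-- Hypothetical description of an installed package.
data Package = Package
  { pkgName    :: String
  , pkgModules :: [ModulePath]  -- module paths relative to the package root
  }

-- One grafting: a package mounted at a prefix in the module hierarchy.
data Graft = Graft
  { graftPrefix  :: ModulePath
  , graftPackage :: Package
  }

-- Resolve a module name as written in source code to every
-- (package, path-within-package) pair it could denote.
-- No policy for ambiguous results is attempted here.
resolve :: [Graft] -> ModulePath -> [(String, ModulePath)]
resolve grafts name =
  [ (pkgName pkg, rest)
  | Graft prefix pkg <- grafts
  , Just rest <- [stripPrefix prefix name]  -- name must lie under the prefix
  , rest `elem` pkgModules pkg              -- and the package must export it
  ]

glut, coll1 :: Package
glut  = Package "glut-6.0" [["Graphics"], ["Graphics", "GLUT"]]
coll1 = Package "coll-1.0" [["Foo"]]

-- Assumed graftings: glut-6.0 at the root, coll-1.0 at Collections.
exampleGrafts :: [Graft]
exampleGrafts = [Graft [] glut, Graft ["Collections"] coll1]
```

With these graftings, resolve exampleGrafts ["Collections", "Foo"] finds
module Foo of coll-1.0, so the source can say Collections.Foo without
ever naming the package.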

Ganesh asks

| Would the following be possible under your proposal:
|
| module M1.A imports Collections.Foo and only works with version 1
| module M2.B imports Collections.Foo and only works with version 2
| module C imports M1.A and M2.B

Yes, this is ok, as Simon indicated earlier.  Presumably there are two
packages coll-1.0 and coll-2.0.  We graft them in at (say) Collections
and Old.Collections, and away we go.  Or, we invoke the compiler for M1.A
with a command line flag to graft in coll-1.0 at Collections, and then
compile M2.B grafting coll-2.0 at Collections.

Either way, a value constructed by package coll-1.0 will be
type-incompatible with functions in coll-2.0.  The former will have
original names "coll-1.0:Foo.T" while the latter will have
"coll-2.0:Foo.T".  Trying to provide transparent type upgrade is too
hard.
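
The second option (a different grafting per compilation) and the
resulting type incompatibility can be sketched as follows.  This is
only an illustration: the Grafting type and originalName function are
made up, and "two types are the same iff their original names are the
same" is the assumption the sketch rests on.

```haskell
import Data.List (intercalate, stripPrefix)

type ModulePath = [String]

-- One grafting: (package name, prefix it is mounted at).  Hypothetical.
type Grafting = (String, ModulePath)

-- Original name of entity `ent` in source-level module `name`,
-- as seen under a single grafting.
originalName :: Grafting -> ModulePath -> String -> Maybe String
originalName (pkg, prefix) name ent = do
  rest <- stripPrefix prefix name
  if null rest
    then Nothing
    else Just (pkg ++ ":" ++ intercalate "." (rest ++ [ent]))

-- Graftings in force when compiling M1.A and M2.B respectively.
graftForM1A, graftForM2B :: Grafting
graftForM1A = ("coll-1.0", ["Collections"])
graftForM2B = ("coll-2.0", ["Collections"])

-- Both modules write Collections.Foo.T in their source.
tInM1A, tInM2B :: Maybe String
tInM1A = originalName graftForM1A ["Collections", "Foo"] "T"
tInM2B = originalName graftForM2B ["Collections", "Foo"] "T"

-- Assumed rule: types are equal only if their original names are equal.
sameType :: Bool
sameType = tInM1A == tInM2B
```

Both modules use the same source-level name, yet tInM1A and tInM2B
differ in their package component, so sameType is False and module C
cannot pass a T from one version to a function of the other.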


Manuel didn't like the fact that an import could mean "relative or
absolute".  But presumably we want to continue to write little programs
with three modules A, B, Main, and have Main just say 'import A'.  So
presumably the current directory is always implicitly grafted into the
module hierarchy at the root -- and that is all we need for making
internal references within a package work out.

Simon