Libraries and hierarchies

Simon Marlow simonmar@microsoft.com
Tue, 5 Aug 2003 11:12:01 +0100


Hi Alastair,

Thanks for the comments.

> I wonder if the extra complication is really that high.
> Suppose that every module somehow has a unique name
> (128 bit hash, global registry, whatever) and that we have
> set up two mappings to the same thing:
>
>    Foo.Bar      -> Reid.Consulting.Bar.version27
>    Baz.Bar      -> Reid.Consulting.Bar.version27
>
> to implement this in Hugs, it is _almost_ enough to simply
> run all source code through a preprocessor which replaces
> occurrences of Foo.Bar and Baz.Bar with Reid.Consulting.Bar.version27.

Yes, this is a simpler implementation.
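To make the preprocessor idea concrete, here is a minimal sketch in
Haskell.  It is an illustration only, not how Hugs or GHC do anything:
the names Aliases, aliases, renameToken and rewrite are all made up for
this example, and only the two mappings quoted above come from the
discussion.  It rewrites whole module names and qualified uses such as
Foo.Bar.f, but it knows nothing about string literals or comments, so it
would rewrite matches inside those too.

```haskell
import qualified Data.Map as Map
import Data.Char (isAlphaNum)
import Data.List (isPrefixOf)

-- Hypothetical alias table: local hierarchical name -> unique name.
type Aliases = Map.Map String String

-- The two mappings from the example above.
aliases :: Aliases
aliases = Map.fromList
  [ ("Foo.Bar", "Reid.Consulting.Bar.version27")
  , ("Baz.Bar", "Reid.Consulting.Bar.version27")
  ]

-- Characters that may appear in a (possibly qualified) name.
isNameChar :: Char -> Bool
isNameChar c = isAlphaNum c || c == '.' || c == '_' || c == '\''

-- Rename one token if it is an alias, or starts with an alias followed
-- by a dot (a qualified use).  If aliases overlap, the first match in
-- key order wins; a real tool would want the longest match.
renameToken :: Aliases -> String -> String
renameToken table tok =
  case [ new ++ drop (length old) tok
       | (old, new) <- Map.toList table
       , tok == old || (old ++ ".") `isPrefixOf` tok ] of
    (r:_) -> r
    []    -> tok

-- Split the source into maximal runs of name characters and rename
-- each run; everything else is copied through unchanged.
rewrite :: Aliases -> String -> String
rewrite table = go
  where
    go [] = []
    go s@(c:cs)
      | isNameChar c =
          let (tok, rest) = span isNameChar s
          in renameToken table tok ++ go rest
      | otherwise = c : go cs
```

Loaded into an interpreter, rewrite aliases "import Foo.Bar" gives
"import Reid.Consulting.Bar.version27", and a qualified use such as
Foo.Bar.f becomes Reid.Consulting.Bar.version27.f.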

> >(b) needing unique package names,
>
> The original proposal tied this feature strongly to packages.
> I wonder if this is really necessary.

Perhaps not - we only use packages as a convenient notion for a
collection of modules within which relative module names can be used,
and as a way to identify the collection.

> Putting that aside, it seems that unique package names are
> fairly easy to come by.  Some possibilities are:
>
> 1) Use the URI of the primary download site.
>    From choice, the URI would be for a tarfile or whatever
>    but all that really matters is that it is unique.
>    This will typically include a version number but, if not,
>    the user could add one if they care.
>
> 2) In the likely event that we have a few big repositories
>    and many small ones, simply prefix the name of the package
>    by the repository name or, if not available, name of author.
>    This is effectively the same as (1) but avoids using URIs.
>
> 3) Directory name on your system.

We can't use (3) in GHC, because the package name has to be fixed at
compile time, and can't be changed once the binaries have been
generated.

It's an advantage to have short package names, because (at least in GHC)
the package name will be included in every symbol in a compiled library.
URIs are attractive, but long, and long names in every symbol will make
debugging really painful.

Another scheme would be to use short package names (eg. "gtk-0.15",
"hgl-3.00"), and have a web page (eg. on the Wiki) where everyone can
register package names, with links to the appropriate download sites.
This works well for the Python folks, and I think having short, readable
package names will make the whole scheme more accessible.
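As a rough model of such a registry, assuming nothing more than a table
from short package names to download sites, one could write the
following.  The types, the register function and the example.org
addresses are hypothetical; only the two package names come from the
text above.

```haskell
import qualified Data.Map as Map

type PackageName  = String   -- short name, e.g. "gtk-0.15"
type DownloadSite = String   -- where the package can be fetched from

-- The registry: one download site per registered package name.
type Registry = Map.Map PackageName DownloadSite

-- Register a name only if it is not already taken, which is what
-- keeps short names unique.
register :: PackageName -> DownloadSite -> Registry
         -> Either String Registry
register name site reg
  | Map.member name reg = Left ("package name already registered: " ++ name)
  | otherwise           = Right (Map.insert name site reg)

-- A sample registry; the addresses are placeholders.
sample :: Registry
sample = Map.fromList
  [ ("gtk-0.15", "http://example.org/gtk-0.15.tar.gz")
  , ("hgl-3.00", "http://example.org/hgl-3.00.tar.gz")
  ]
```

Registering "gtk-0.15" a second time in sample returns a Left, so a
clash is reported rather than silently overwriting the first entry.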

---

Some notes on dependencies, which occurred to Simon & me:

An interesting property of this scheme is that if you distribute a
binary library, then you only have to list the package names on which it
depends: it doesn't matter where in the hierarchy the packages are sited
on the target machine, it will still work.

If you distribute source code, then the dependency list must include
both package names and sites; this is something that can easily be
incorporated into Isaac's distribution toolkit.  We need an easy way to
set up the system with the correct packages & sites for compiling some
code, but that shouldn't be too hard.

Interestingly, this is another way that source code can unambiguously
refer to libraries: by package name, site, and module name (with the
former two properties being specified out-of-band in some additional
matter that comes with the source code - perhaps a package specification
or similar).  This is rather less ugly than using GUIDs.  We can also
provide GUIDs, of course, because doing so is cheap.
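The two kinds of dependency list and the (package name, site, module
name) reference can be sketched as data types.  This is only an
illustration: all the type and function names are invented here, and it
assumes that a "site" is the prefix in the module hierarchy at which a
package is placed.

```haskell
type PackageName = String   -- e.g. "gtk-0.15"
type Site        = String   -- assumed: hierarchy prefix where the package sits
type ModuleName  = String   -- module name relative to the package

-- A binary library needs only package names; source code needs the
-- site of each package as well.
data Dependency
  = BinaryDep PackageName
  | SourceDep PackageName Site
  deriving (Eq, Show)

-- Drop the site when going from a source to a binary distribution.
forBinary :: Dependency -> Dependency
forBinary (SourceDep p _) = BinaryDep p
forBinary d               = d

-- An unambiguous reference to a module.  Package and site would be
-- given out-of-band, e.g. in a package specification.
data ModuleRef = ModuleRef
  { refPackage :: PackageName
  , refSite    :: Site
  , refModule  :: ModuleName
  } deriving (Eq, Show)

-- The full hierarchical name under which source code sees the module.
fullName :: ModuleRef -> String
fullName (ModuleRef _ site m)
  | null site = m
  | otherwise = site ++ "." ++ m
```

For instance, with a made-up site Graphics.UI.Gtk and module Button,
fullName (ModuleRef "gtk-0.15" "Graphics.UI.Gtk" "Button") is
"Graphics.UI.Gtk.Button", while the package name alone is what a binary
dependency records.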

Cheers,
	Simon