4. Distribution Module

4.1. Distribution.Build

The basic strategy we will take for the actual task of building Haskell tools is as follows:

Since the compilers themselves do the actual compilation, the task of Distribution.Build is mainly one of coordinating the tools outside the compiler. We hope to offer support for preprocessors (both existing and those yet to come). Distribution.Build will handle the task of compiling for a particular Haskell Implementation, or for all installed Haskell Implementations, and will help to abstract away differences between command-line flags.
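As an illustration of abstracting away flag differences, the sketch below renders one implementation-independent option as the concrete flag for each tool. The type and function names are hypothetical and not part of the proposal. The flag spellings (-i for GHC, -P for Hugs, with a trailing colon intended to keep the existing search path) are stated from memory and should be checked against each tool's manual.

```haskell
-- | The Haskell Implementations this sketch knows about (illustrative subset).
data HaskellImpl = GHC | Hugs
  deriving (Eq, Show)

-- | Implementation-independent build options (hypothetical names).
data BuildOption
  = ImportDir FilePath   -- ^ extra directory to search for modules
  deriving (Eq, Show)

-- | Render an abstract option as the concrete flag each tool expects.
-- Assumption: GHC takes -i<dir>; Hugs takes -P<path>, where a trailing
-- colon is meant to preserve the previous search path.
renderOption :: HaskellImpl -> BuildOption -> String
renderOption GHC  (ImportDir d) = "-i" ++ d
renderOption Hugs (ImportDir d) = "-P" ++ d ++ ":"
```

A caller would map renderOption over a list of options once per target implementation, so package authors never write tool-specific flags themselves.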

Distribution.Build could also be used to recompile all of the installed libraries once a new Haskell Implementation is installed. This is an important function, as it solves the problem of binary incompatibility between Haskell Implementations and versions thereof. Another very useful function that Distribution.Build could offer is the implementation of a generic /usr/bin/haskell that either executes a Haskell program using the default compiler or throws the user into the default interpreter, depending upon how it is invoked. This allows Haskell scripts (such as Setup.lhs) to be distributed with a #!/usr/bin/env haskell annotation that has reasonable behavior.
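The invocation-dependent behavior of the generic launcher could be decided by a small pure function. The following is a minimal sketch under the assumption that the launcher inspects only its argument list; LaunchMode and chooseMode are hypothetical names.

```haskell
-- | What the generic launcher should do (hypothetical type).
data LaunchMode
  = RunScript FilePath [String]  -- ^ run a program file with its arguments
  | Interactive                  -- ^ no program given: start the interpreter
  deriving (Eq, Show)

-- | Decide the mode from the launcher's command-line arguments.
-- For a script starting with "#!/usr/bin/env haskell", the script path
-- arrives as the first argument, followed by the user's own arguments.
chooseMode :: [String] -> LaunchMode
chooseMode []              = Interactive
chooseMode (script : args) = RunScript script args
```

Keeping the decision pure leaves the choice of default compiler and default interpreter to a separate, configurable step.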

4.2. Distribution.Install

The Distribution.Install module performs the task of moving files into place. Presumably, this is the last task before package registration. Distribution.Install will have to understand configuration options for the operating systems that Haskell modules are being installed on. (For instance, different operating systems have different policies for where to put documentation, source code, binary files, and libraries.) Such information will most likely be read from a file that can be edited by the system administrator (Section 4.4).

Not only will this module have to support the conventions of different operating systems, but it must also have access to filesystem functionality like copy and move, as well as permission-related operations. Such functions should be offered by libraries such as System.Posix.Files and System.Directory, but the System.Posix module is not available on all operating systems. To some extent, Distribution.Install should handle the differences between operating systems (file permissions, for instance), but Haskell should offer a more robust set of file operations in order to encourage the use of Haskell for common scripting tasks. (One issue that the author has noticed is that System.Directory.renameDirectory is not implemented the same way in GHC and Hugs, which forces Distribution.Install to find a way to abstract over the differences.)
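A minimal sketch of how per-system install locations might drive the copy step follows. InstallDirs, FileKind, and the functions below are hypothetical names. The sketch relies on System.Directory and on System.FilePath from the present-day filepath library, and it ignores permissions entirely.

```haskell
import System.Directory (copyFile, createDirectoryIfMissing)
import System.FilePath ((</>), takeFileName)

-- | Per-system install locations (hypothetical record; in the proposal
-- these would be read from an administrator-editable file).
data InstallDirs = InstallDirs
  { binDir :: FilePath
  , libDir :: FilePath
  , docDir :: FilePath
  } deriving (Eq, Show)

-- | The kinds of file being installed.
data FileKind = Binary | Library | Documentation
  deriving (Eq, Show)

-- | Pick the target directory for a kind of file.
targetDir :: InstallDirs -> FileKind -> FilePath
targetDir dirs Binary        = binDir dirs
targetDir dirs Library       = libDir dirs
targetDir dirs Documentation = docDir dirs

-- | Where a source file would end up once installed.
destination :: InstallDirs -> FileKind -> FilePath -> FilePath
destination dirs kind src = targetDir dirs kind </> takeFileName src

-- | Copy one file into place, creating the target directory if needed.
installFile :: InstallDirs -> FileKind -> FilePath -> IO FilePath
installFile dirs kind src = do
  createDirectoryIfMissing True (targetDir dirs kind)
  let dest = destination dirs kind src
  copyFile src dest
  return dest
```

Separating the pure destination calculation from the IO action makes the operating system policy easy to inspect without touching the filesystem.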

4.3. Distribution.Package

The complex task of packaging requires a lot of attention. The proposed solution is not only a module to access the packaging information, but also an application to assist external systems with the same task.

The main features of this system are:

  1. To let the Haskell Implementations know how to use a package, whether it's available by default (or whether it requires a -package flag), and where the root of its hierarchy is.

  2. To store other information about a package, including information such as its license, home page, version number, and dependencies, to be used by other tools in the Distribution hierarchy.

  3. To make all of this information available through the Distribution.Package module. The information can be made available to non-Haskell tools by way of a command-line tool, haskell-config (Section 4.4), with easily parsable output (similar to package-config), though a different solution may be necessary for Windows.

Some secondary features are:

  1. Let other tools, such as debuggers and editors, know where the source code for a module or package is.

  2. When new Haskell Implementations are installed, allow them to find the source code and import it into their own library tree (perhaps through other features of the L.I.P.).

  3. For Haskell Implementations that don't conform to the new packaging interface, implement a wrapper so that they can still utilize other important features of the Library Infrastructure Project.

The information would be held in files such as /etc/haskell/packages.conf[1] and ~/.haskell/packages.conf.

4.3.1. PackageConfig Data Structure

The package data structure might look something like this (based on GHC's Package type):

data PkgIdentifier
    = PkgIdentifier {pkgName::String, pkgVersion::Version}
{- ^Often need name and version since multiple versions of a single
    package can exist on a system. -}

data PackageConfig
   = Package {
        pkgIdent        :: PkgIdentifier,
        license         :: License,
        auto            :: Bool,
        provides        :: [String],
{- ^A bit pie-in-the-sky; might indicate that this package provides
    functionality that other packages also provide, such as a compiler
    or GUI framework, and upon which other packages might depend. -}

        isDefault       :: Bool,
-- ^might indicate if this is the default compiler or GUI framework.

        import_dirs     :: [String],
        source_dirs     :: [String],
        library_dirs    :: [String],
        hs_libraries    :: [String],
        extra_libraries :: [String],
        include_dirs    :: [String],
        c_includes      :: [String],
        build_deps      :: [Dependency], -- build dependencies
        depends         :: [Dependency], -- use dependencies
        extra_ghc_opts  :: [String],
        extra_cc_opts   :: [String],
        extra_ld_opts   :: [String],
        framework_dirs  :: [String],
        extra_frameworks:: [String]}

data Version = DateVersion {versionYear  :: Integer,
                            versionMonth :: Month,
                            versionDay   :: Integer}
             | NumberedVersion {versionMajor      :: Integer,
                                versionMinor      :: Integer,
                                versionPatchLevel :: Integer}

data License = GPL | LGPL | BSD | {- ... | -} OtherLicense FilePath

data Dependency = Dependency String VersionRange

data VersionRange
  = AnyVersion
  | OrLaterVersion     Version
  | ExactlyThisVersion Version
  | OrEarlierVersion   Version

type PackageMap = FiniteMap PkgIdentifier PackageConfig

But perhaps we'll need to be even more flexible: some implementations might not be interested in certain fields, and others might want their own fields. It would certainly be desirable to have a flexible parser so that we can add more fields later and maintain backward compatibility.
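One way to get that flexibility is to parse the file into a map of raw fields first and only then pick out the fields a given tool understands. The sketch below assumes a hypothetical "name: value" line format, which the proposal does not specify, and uses Data.Map from the containers library. Unknown fields are simply retained, and missing fields fall back to a default.

```haskell
import qualified Data.Map as Map
import Data.Char (isSpace)

-- | Raw fields of one package entry, keyed by field name.
type Fields = Map.Map String String

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- | Parse "name: value" lines; lines without a colon are skipped.
parseFields :: String -> Fields
parseFields txt = Map.fromList
  [ (trim name, trim (drop 1 rest))
  | line <- lines txt
  , let (name, rest) = break (== ':') line
  , not (null rest)
  ]

-- | Look up a field, falling back to a default when it is absent, so
-- files written before a field existed still load.
fieldOr :: String -> String -> Fields -> String
fieldOr def name = Map.findWithDefault def name
```

An implementation that does not care about a field never looks it up, and one that wants its own field can add it without breaking other readers.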

The Distribution.Package API might look like so:

userPkgConfigLocation   :: FilePath
systemPkgConfigLocation :: FilePath
getSystemPkgConfig  :: IO [PackageMap] -- ^Query /etc/haskell/packages.conf
getUserPkgConfig    :: IO [PackageMap] -- ^Query ~/.haskell/packages.conf
getPkgConfig        :: FilePath -> IO [PackageMap]
addUserPackage   :: PackageConfig -> IO ()
addSystemPackage :: PackageConfig -> IO ()
delUserPackage   :: PkgIdentifier -> IO ()
delSystemPackage :: PkgIdentifier -> IO ()
basicPackage     :: PackageConfig          -- provides sensible defaults
checkLicense     :: PackageConfig -> Bool
{- Just for fun, check whether the license that this package uses
   conflicts with any of the licenses of the packages it depends on -}
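The Dependency and VersionRange types from Section 4.3.1 imply a matching rule that tools built on this API would need. The sketch below is one possible reading of it, not part of the proposed API. It is simplified to numbered versions only (DateVersion is omitted), and withinRange and satisfies are hypothetical names.

```haskell
-- | Simplified to numbered versions; the derived Ord instance compares
-- major, then minor, then patch level.
data Version = NumberedVersion
  { versionMajor      :: Integer
  , versionMinor      :: Integer
  , versionPatchLevel :: Integer
  } deriving (Eq, Ord, Show)

data VersionRange
  = AnyVersion
  | OrLaterVersion     Version
  | ExactlyThisVersion Version
  | OrEarlierVersion   Version
  deriving (Eq, Show)

data Dependency = Dependency String VersionRange
  deriving (Eq, Show)

-- | Does a version fall inside a range?
withinRange :: Version -> VersionRange -> Bool
withinRange _ AnyVersion             = True
withinRange v (OrLaterVersion w)     = v >= w
withinRange v (ExactlyThisVersion w) = v == w
withinRange v (OrEarlierVersion w)   = v <= w

-- | Does an installed package (name and version) satisfy a dependency?
satisfies :: (String, Version) -> Dependency -> Bool
satisfies (name, v) (Dependency wanted range) =
  name == wanted && withinRange v range
```

How a DateVersion should compare against a NumberedVersion is left open here, as it is in the data structure above.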

4.4. haskell-config Command-line interface

The haskell-config [2] tool is a command-line interface to the packaging system. It will presumably be written in Haskell and import the Distribution.Package module. The purpose of this tool is to give non-Haskell systems the ability to interact with the packaging system, since they won't be able to import Distribution.Package. This tool serves a purpose similar to ghc-pkg and package-config.

% haskell-config [--user] register < packageFile
% haskell-config [--user] unregister packageName
# add or remove packages from the package database.  --user indicates
# that we should add it to the package database in the user's home
# directory, not to the system-wide package database.

% haskell-config packageName c_includes
# would output this list in a way that a C compiler could use directly

% haskell-config list-packages
% haskell-config list-user-packages
% haskell-config list-system-packages
# Query the database in a variety of ways
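Since the tool will presumably be written in Haskell, the invocations above could be modelled as a data type with a small argument parser. This is a sketch only: Scope, Command, and parseArgs are hypothetical names, and the parser covers just the forms listed above.

```haskell
-- | Which package database a command targets.
data Scope = System | User
  deriving (Eq, Show)

-- | The commands shown above, as a data type (hypothetical names).
data Command
  = Register Scope              -- ^ package description is read from stdin
  | Unregister Scope String     -- ^ package name
  | Query String String         -- ^ package name, field name
  | ListPackages (Maybe Scope)  -- ^ Nothing lists both databases
  deriving (Eq, Show)

-- | Parse the argument list; Nothing means the usage was not recognised.
-- Clause order matters: "unregister <pkg>" is tried before the generic
-- two-argument query form.
parseArgs :: [String] -> Maybe Command
parseArgs ["--user", "register"]        = Just (Register User)
parseArgs ["--user", "unregister", pkg] = Just (Unregister User pkg)
parseArgs ["register"]                  = Just (Register System)
parseArgs ["unregister", pkg]           = Just (Unregister System pkg)
parseArgs ["list-packages"]             = Just (ListPackages Nothing)
parseArgs ["list-user-packages"]        = Just (ListPackages (Just User))
parseArgs ["list-system-packages"]      = Just (ListPackages (Just System))
parseArgs [pkg, field]                  = Just (Query pkg field)
parseArgs _                             = Nothing
```

One consequence of this layout is that a package literally named "unregister" could not be queried, which hints that an explicit query subcommand might be a cleaner interface.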

4.5. haskell-pkg?

The haskell-config tool brings up an interesting question. Should the functionality of Distribution.Install also be made available as a command-line tool, perhaps called haskell-pkg ("Haskell package")? Here, "package" is meant in the sense that dpkg and the 'P' in RPM use it: haskell-pkg could be used for installing and removing Haskell programs when supplied with the package metadata that is defined by Distribution.Package.

4.6. Distribution.Config

The information available through the Distribution.Package module is not all of the information that could possibly be needed to prepare a package for installation. Typically, tools such as autoconf are used to discover useful information about the system. The author has not given a lot of thought to the configuration problem, but he sees a few possible paths:

Notes

[1]

Is there any Unix system where etc is the wrong place for something like this?

[2]

Because of the confusion between different kinds of configuration (the kinds offered by Distribution.Package and Distribution.Config), I am torn about the name of this program. There is the further confusion between package management (the actual installation and removal of the programs themselves) and interfacing with the packaging system. Further, there is one more bit of confusion between packages in the Haskell system (i.e. a set of modules distributed together by an author) and a package on the operating system. If anyone has an idea to straighten all of this out, I'd be glad to hear it :)