planning for ghc-6.10.1 and hackage [or: combining packages to
yield new type correct programs]
Duncan Coutts
duncan.coutts at worc.ox.ac.uk
Thu Oct 2 16:39:51 EDT 2008
On Thu, 2008-10-02 at 09:37 +0100, Simon Peyton-Jones wrote:
> | Here are my notes on why this is so hard, and so scary, and what we can do.
> |
> | So Duncan and I spent about 6 hours tonight working out how to make the
> | cabal-install constraint solver, and the Cabal configure invariants,
> | work when programs attempt to use base-3 and base-4 on the same system.
>
> Here is a small thought.
>
> * base-3.1 depends on *exactly* base-4.0. You can think of this as
> saying "base-4.0 is simply part of base-3.1's implementation. It
> should not matter to a client how base-3.1 is implemented."
>
> * You want to compile packages that depend generally on 'base', but
> don't specify an exact version number
>
> * So here's a possible rule. If a package depends on a precise version
> of a package, use that precise version. But for inexact dependencies,
> use the one-version-per-package rule. So all packages that depend on
> "some version of base" should depend on either base-3.1 or base-4.0
> but not both.
Here is how we characterise a solution:
* It is a set of package identifiers, where an id is (name,
version)
* More specifically it is a closed directed acyclic graph of
package ids where edges are dependencies (exact not ranges)
* For each node (package id), the collection of dependent package
names contains no duplicates (ie no package id directly depends
on two versions of another package name)
That's all uncontroversial, in the old/current system we impose a
further global restriction:
* In the whole graph, any pairs of dependencies on the same
package name must specify the same version of that package.
This restriction is to eliminate the diamond dependency problem where
packages B and C export functions that use a type from package A, but we
build B using A-1.0 and C using A-2.0 and then the types when used
together in a fourth package D do not match up.
I've written about this before:
http://blog.well-typed.com/2008/04/the-dreaded-diamond-dependency-problem/
http://blog.well-typed.com/2008/08/solving-the-diamond-dependency-problem/
Note that this restriction is almost as strong as saying that there is
only one version of every package name in the solution.
Clearly this is now too strong for base 3 and 4. Lets look at the
situation we want to allow and a similar situation we want to avoid:
We want to allow:
A-1 depends on B-1
A-1 depends on C-1
B-1 depends on base-3
C-1 depends on base-4
base-3 depends on base-4
We want to disallow:
A-1 depends on B-1
A-1 depends on C-1
B-1 depends on bytestring-0.9.0.1
C-1 depends on bytestring-0.9.0.4
In both cases we have two dependencies in the proposed solution that
target different versions of the same package. In the first case B and C
depend on base-3 and 4. In the second example B and C depend on
bytestring 0.9.0.1 and 0.9.0.4.
We can identify that the first is ok because there is a direct
dependency between the two packages that are apparently in conflict
while in the second there is not. We can optimistically assume that in
the first case they can be used together and in the second we should
conservatively assume that they do not.
So that's how we can adjust the definition of a correct solution. I
pushed a patch yesterday to Cabal to the dependencyInconsistencies
function to use this new definition.
The next step is to adjust the solver to actually find solutions that
make use of this relaxed definition. I think we can do it by adjusting
the implementation of the constraint set ADT.
The Constraints ADT has these three ops (simplified):
-- An empty set of constraints given a set of installed packages
-- (from ghc-pkg) and a set of available packages (from hackage)
empty :: PackageIndex installed
-> PackageIndex available
-> Constraints installed available
-- return the set of remaining package choices, essentially a number
-- of pots, one per package name, we will have to pick one version from
-- each pot
choices :: Constraints installed available
-> PackageIndex (InstalledOrAvailable installed available)
-- Add an extra constraint into the constraint set and checks if the
-- constraint set is still satisfiable (ie none of the pots are empty)
constrain :: Dependency
-> Constraints installed available
-> Satisfiable (Constraints installed available)
data Satisfiable a
= Satisfiable a
| Unsatisfiable
| ConflictsWith [PackageIdentifier]
And that's it.
The trick is to notice when we initially construct the constraint set
that we have base-3 and base-4 in the installed set and that one depends
on the other. In this situation we should allocate two pots rather than
one. Then when we add constraints later we have to decide which pot to
apply the constraints to. Each constraint on base must be ok for one pot
or the other or both. If it conflicts in both then we still fail (eg
base < 2).
I hope just by changing the definition of what it means for constraints
to conflict in this Constraints ADT that the rest of the solver will
continue to work without many modification.
Duncan
More information about the Cvs-ghc
mailing list