Hackage 2 status

Duncan Coutts duncan.coutts at googlemail.com
Mon Jul 2 21:14:01 CEST 2012


On Mon, 2012-07-02 at 12:25 +0100, Ian Lynagh wrote:
> Hi all,
> 
> I'm planning to spend some time, on behalf of the Industrial Haskell
> Group, working on Hackage 2 in the coming weeks.

[..]

> Now #913 I assume is not a blocker. #919 I assume is also not a blocker.
> And #914 and #915 are improvements to the internals, so presumably also
> not blockers. #426 is not supported by Hackage 1 as far as I can see, so
> is not a blocker as it is not a regression.

I agree with this analysis.

> So that leaves 3 tickets as blockers:
> 
> #911: We need to do something here. With Hackage 1, it takes manual
> approval before you can upload packages, and at the very least Hackage 2
> should match that. I have the impression that that is already possible
> (by restricting package upload to a group, and requiring accounts to be
> added to that group by an admin), but I haven't confirmed that yet.

Right, I don't think we need to do any more than make sure uploaders are
in the appropriate group. It *should* currently be the case that only
accounts in the package group can upload, and that the first time you
upload a package under a new name you are added as the initial member
of the new package group.

Currently, for testing purposes, anyone can register an account and can
then upload new packages. We have two options here: restrict account
creation to be manual like in Hackage 1, or add a new system-wide
"uploaders" group for accounts that are authorised to upload new
packages and have a manual admin step to add people to the uploaders
group. The latter will allow for registered users who are not uploaders,
which would be useful later to allow things like non-anonymous
commenting.
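
To make the second option concrete, here is a minimal sketch of the
check, assuming a system-wide uploaders group plus one group per
package. The names (Groups, upload, UploadError) are made up for
illustration and are not taken from the hackage-server code.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type UserName    = String
type PackageName = String

-- Hypothetical model: one admin-managed "uploaders" group and one
-- group per existing package.
data Groups = Groups
  { uploaders     :: Set.Set UserName
  , packageGroups :: Map.Map PackageName (Set.Set UserName)
  } deriving (Eq, Show)

data UploadError = NotAnUploader | NotInPackageGroup
  deriving (Eq, Show)

-- Decide whether an upload is allowed and return the updated groups.
upload :: UserName -> PackageName -> Groups -> Either UploadError Groups
upload user pkg gs =
  case Map.lookup pkg (packageGroups gs) of
    -- Existing package: only members of its package group may upload.
    Just members
      | user `Set.member` members -> Right gs
      | otherwise                 -> Left NotInPackageGroup
    -- New package name: the user must be in the uploaders group and
    -- becomes the initial member of the new package group.
    Nothing
      | user `Set.member` uploaders gs ->
          Right gs { packageGroups =
                       Map.insert pkg (Set.singleton user) (packageGroups gs) }
      | otherwise -> Left NotAnUploader
```

A registered account that is in neither group gets a Left for every
upload, which is the "registered user who is not an uploader" case.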

> #916: At the very least, Hackage 2 needs to support URLs that Hackage 1
> supported (unless a conscious decision has been made not to). Ideally we
> would get the URLs right on the initial release, so that people don't
> start using the wrong ones. Doesn't sound hard, so may as well do before
> the switchover.

Yes, barring mistakes this should work already. There's a "legacy"
feature module that provides a bunch of redirects.
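
As a rough illustration of what such a redirect amounts to, here is a
small sketch that maps an old-style path to a new-style one. Both URL
layouts below are assumptions chosen for the example, and
legacyRedirect is a hypothetical function, not the legacy module's
actual interface.

```haskell
-- Split a URL path on '/' and drop empty segments.
segments :: String -> [String]
segments path =
  case break (== '/') path of
    (seg, [])       -> [seg | not (null seg)]
    (seg, _ : rest) -> [seg | not (null seg)] ++ segments rest

-- Map an (assumed) old-style path to an (assumed) new-style path.
-- Returns Nothing when the path is not a recognised legacy URL.
legacyRedirect :: String -> Maybe String
legacyRedirect path =
  case segments path of
    -- Assumed old tarball layout: /packages/archive/PKG/VER/PKG-VER.tar.gz
    ["packages", "archive", pkg, ver, file]
      | file == pkg ++ "-" ++ ver ++ ".tar.gz" ->
          Just ("/package/" ++ pkg ++ "-" ++ ver ++ "/" ++ file)
    -- Assumed old package page layout: /packages/archive/PKG/VER
    ["packages", "archive", pkg, ver] ->
      Just ("/package/" ++ pkg ++ "-" ++ ver)
    _ -> Nothing
```

Checking #916 then comes down to listing the URL shapes Hackage 1
actually served and confirming that each one hits a rule like these.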

> #918: This is the main functionality currently missing from the
> user's point of view.

Right. You'll see there's some code for a doc builder. This needs to be
improved. Perhaps it can share code with the mirror client which is
reasonably robust.

> Conclusion
> ----------
> 
> I think the following are the blockers for deploying Hackage 2:
> 
> * #911 upload perms; may be good enough already
> * #916 check URLs are OK
> * #918 build haddock (and HsColour) docs

Right.

> * Show source repository on package pages

Should be easy to port that from the old code.

> * Support the existing "Distributions" files, and show info on package pages

I advocated at the time the feature was added that it should be done
differently, so that the hackage server does not poll some url but the
people in charge of distros push instead. I don't think it would be a
blocker to leave out the distribution info system as it is now and,
when we eventually spend the time to implement it, to switch to doing
it in a more sensible way.

> (plus enough testing to give us confidence in it, of course).

One of the main things here is adding tests that the database
dump/restore mechanism round-trips correctly.
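
The shape of such a test is simply "restore (dump s) gives back s". A
toy sketch, assuming a made-up server state (package name to list of
versions) and a made-up line-based dump format, neither of which is
the real hackage-server one:

```haskell
import qualified Data.Map.Strict as Map

-- Toy stand-in for the server state: package name -> version strings.
type State = Map.Map String [String]

-- Dump one package per line: the name followed by its versions.
-- Assumes names and versions contain no whitespace.
dump :: State -> String
dump st = unlines [ unwords (name : vers) | (name, vers) <- Map.toList st ]

-- Parse a dump back into a state; a blank line is a parse failure.
restore :: String -> Maybe State
restore txt = Map.fromList <$> mapM parseLine (lines txt)
  where
    parseLine l = case words l of
      (name : vers) -> Just (name, vers)
      []            -> Nothing

-- The round-trip property to check for every state we can generate.
roundTrips :: State -> Bool
roundTrips st = restore (dump st) == Just st
```

In a real test suite the sample states would come from a generator
(or from a copy of the live data) rather than being written by hand.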

> Does that match other people's opinions? Did I miss anything?

Looks good.

Something to keep in mind is memory usage. I know Jeremy is looking at
this from the infrastructure side, but I think from the app side there
are also some likely culprits. Cabal's GenericPackageDescription type is
very large in memory. Having tens of thousands of these means lots of
memory. One hopefully easy way to save memory here, without going to
the hassle of redoing Cabal's type definitions, is simply to increase
sharing. There's a huge amount of repeated information. Start by sharing
all the package names and versions. Then there's other meta-data that
rarely changes between versions of the same package. This kind of thing
should be easy to evaluate: just write a test program that reads the
index file and look at peak memory use. Then try sharing stuff and see
how much it drops. This sharing optimisation would still be useful even
if later we go and redo GenericPackageDescription to be more compact.
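
A minimal sketch of the sharing idea, using a simplified PkgId record
as a stand-in for the real Cabal types (PkgId, intern and shareNames
are names invented for this example). Each name is looked up in a pool
so that every record with the same name ends up pointing at a single
heap copy of that string:

```haskell
import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- Simplified stand-in for a package identifier.
data PkgId = PkgId
  { pkgName    :: String
  , pkgVersion :: [Int]
  } deriving (Eq, Show)

-- Pool of strings seen so far; each key maps to its shared copy.
type Pool = Map.Map String String

-- Return the pooled copy of a string, adding it to the pool if new.
intern :: Pool -> String -> (Pool, String)
intern pool s =
  case Map.lookup s pool of
    Just shared -> (pool, shared)
    Nothing     -> (Map.insert s s pool, s)

-- Replace every package name with its pooled copy. The result is
-- equal to the input; only the heap representation differs.
shareNames :: [PkgId] -> [PkgId]
shareNames = snd . mapAccumL step Map.empty
  where
    step pool p =
      case intern pool (pkgName p) of
        (pool', name) -> (pool', p { pkgName = name })
```

To see whether it helps, build the test program with -rtsopts, run it
with +RTS -s before and after applying shareNames, and compare the
"maximum residency" figures. The same pooling trick should apply to
versions and to other fields that rarely change between releases.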

Duncan




More information about the cabal-devel mailing list