Darcs vs. Git

From HaskellWiki
Revision as of 19:40, 11 December 2012 by Lemming (talk | contribs) (→‎Pull and stash: code block)

What makes Git so popular even amongst Haskell programmers? I remember that the GHC team switched because Darcs had problems with merges and became slower as the history of a project grew. On the one hand, this is a problem of Darcs that must be fixed. On the other hand, GHC is a big monolithic project, which is not only a problem for a versioning system but is also unfortunate for users who want to use selected subsystems of GHC, such as the parser, the module system, the type checker, the optimizer, or the code generator.

In the meantime I have worked with Git on several projects. I hoped that the maintainers of those projects could help out with Git problems. I expected that if they had switched from Darcs to Git, they would have had good experiences with Git. This is not the case. In those projects nobody could give satisfying solutions to even common Git problems. Thus I started to collect the issues I frequently have with Git. They make me stay with Darcs for the projects where I have the choice of versioning system.


darcs replace

Darcs can replace identifiers, Git cannot. I often rename identifiers. The identifier substitution of Darcs both saves space and allows for smooth merging. In Git, renaming an identifier looks as if you had altered a lot of lines here and there. I don't think that Git can easily implement that feature, because it has no notion of a patch.
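
As a rough illustration, a rename in Darcs might be wrapped in a small shell helper like the one below. The function name `rename_identifier` is made up for this sketch; the underlying command is `darcs replace OLD NEW FILE...`.

```shell
# Hypothetical helper (not part of Darcs): rename a token in the given files.
# Darcs stores this as one token-replace change instead of many line edits.
rename_identifier() {
    old=$1
    new=$2
    shift 2
    darcs replace "$old" "$new" "$@"
}
```

With a hypothetical identifier and path, `rename_identifier parseExpr parseExpression src/Parser.hs` followed by `darcs record` would record the rename as a single replace change.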

darcs check --test

Darcs lets you easily run a test suite after every commit. Usually I register

cabal configure && cabal build && cabal haddock && cabal test

as a darcs test. After recording a patch, darcs temporarily unpacks the repository in the state it has after adding the patch. Then it runs the test suite within that temporary copy of the repository. If you add a file to the Cabal description but forget darcs add, or vice versa, then the darcs test will quickly spot the problem.
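
Registering the test could be sketched as follows. `darcs setpref test` is the way Darcs stores the test command; the wrapper function name is only for illustration. Whether `darcs record` runs the test by default or needs `--test` depends on the Darcs version, so treat that part as an assumption to check.

```shell
# Hypothetical wrapper: register the Cabal build-and-test chain as the
# repository's Darcs test (run this inside the repository).
register_darcs_test() {
    darcs setpref test "cabal configure && cabal build && cabal haddock && cabal test"
}
```

Afterwards, `darcs check --test` runs the registered command against a clean copy of the recorded state.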

The same is hardly possible with Git. It could certainly be hacked into .git/hooks/pre-commit (starting from the template .git/hooks/pre-commit.sample), but the crucial feature of running a test is to reject a commit if it does not pass the tests. If there is a way in Git, then it is far more complicated than in Darcs.

I see no reason why Git does not support pre-commit tests properly.
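
For comparison, such a hook hack might look roughly like this. It is only a sketch: the function name `run_staged_tests` and the idea of passing the test command as a string are assumptions for illustration. It relies on `git checkout-index` to export the staged files into a temporary directory, and on the fact that a pre-commit hook exiting with a non-zero status aborts the commit.

```shell
#!/bin/sh
# Sketch for .git/hooks/pre-commit (the file must be executable).
# Runs the given test command on a copy of the staged tree, not on the
# working files, so forgotten "git add"s show up as test failures.
run_staged_tests() {
    test_cmd=$1
    tmp=$(mktemp -d) || return 1
    # Export everything in the index into the temporary directory.
    if ! git checkout-index --all --prefix="$tmp/"; then
        rm -rf "$tmp"
        return 1
    fi
    # Subshell, so that the directory change does not leak to the caller.
    ( cd "$tmp" && sh -c "$test_cmd" )
    rc=$?
    rm -rf "$tmp"
    return "$rc"
}
```

The hook file would then end with a line such as `run_staged_tests "cabal configure && cabal build && cabal test" || exit 1`. This has to be installed by hand in every clone, which is noticeably more work than registering a single test command.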

Pushing to the wrong repository

It is very easy in Git to push commits to an unrelated repository or to the wrong branch of a repository. And it is cumbersome and dangerous to get rid of the wrongly pushed commits when operating in a bare remote Git repository. In a bare Git repository you cannot use commands like 'git branch' that you are familiar with in a working copy. In Darcs this cannot happen so easily, since Darcs normally asks you which patches to push. This way you can see early if something goes wrong.

Pushing to repositories with working files

Pushing to a bare Git repository requires that you enable the post-update hook. Why? Pushing to a Git repository with working files is even more cumbersome. I like to point people to certain files in a remote repository, e.g. by an HTTP URL. This requires automatic updates of working copy files when pushing to the remote repository. If you try to push to a Git repository with working files that is not prepared for this action then you get a message that it is not possible and what alternatives you have. Among the alternatives there does not seem to be one that automatically updates working copy files after a push. Maybe it is possible in Git, but it requires at least some effort.
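
One possible setup, offered as an assumption rather than something that was available when this page was written: Git releases from 2.3 onward have a `receive.denyCurrentBranch` setting with the value `updateInstead`, which lets a push update the working files of the checked-out branch as long as they are clean. The helper name and the path argument below are hypothetical.

```shell
# Hypothetical helper: prepare the non-bare repository at path $1 so that a
# push to its checked-out branch also updates its working files.
# Assumes Git 2.3 or later on the receiving side.
enable_push_to_checkout() {
    git -C "$1" config receive.denyCurrentBranch updateInstead
}
```

This still has to be run once on the remote side for every repository, so it remains extra effort.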

In darcs there is no distinction between bare repositories and working copies. Working files are always automatically updated.

Pull and stash

I often have locally modified files when pulling patches from somewhere else. With git this needs at least three steps:

 git stash
 git pull
 git stash pop

With darcs it is just darcs pull. Git does not make backup files if there is a merge conflict, Darcs does.
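
The three Git steps can be bundled into a shell function, sketched below under a few assumptions: the name `pull_keeping_changes` is made up, and only tracked files are considered. The check for local changes is needed because `git stash` saves nothing on a clean tree, in which case a later `git stash pop` would restore some older, unrelated stash or fail.

```shell
# Hypothetical helper: pull while preserving local modifications.
pull_keeping_changes() {
    # "git diff --quiet" exits with 0 only if there are no differences;
    # the first call checks the working files, the second the index.
    if git diff --quiet && git diff --cached --quiet; then
        git pull
    else
        git stash && git pull && git stash pop
    fi
}
```

If `git pull` fails, the stash is left in place and has to be popped by hand.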

Branches

When working with Darcs I missed branches. I thought it would be a good idea to let the versioning system manage my attempts to solve hard problems in my projects. I expected that people who clone my repository or pull patches would get access to all branches of my repository simultaneously. I hoped that native support for branches would be better than creating a new working copy for every branch.

Maybe Git got branching wrong but today I am uncertain whether there is a good way to support branching other than not supporting branching. Git does not handle all branches of a repository simultaneously. Git can push and pull commits from one local branch to another remote branch and back.

It is not so simple in Git to switch branches if there are local modifications. You need to git stash. And what about the open text editor that contains a file that gets modified by switching branches? Sure, modern text editors warn about changes to the file on disk, but it is still easy to accidentally overwrite a file on one branch with the contents of that file on another branch.

What I actually ended up doing is creating one Git working copy for every branch. This way, switching between branches is easy. This is precisely the way I work with Darcs.


GitHub

Maybe the popularity of Git is not due to Git's features or its design, but due to the availability of graphical user interfaces and web interfaces. The most popular Git repository host and web interface is certainly http://github.com/. For my taste it is a very poor site. It requires JavaScript and seems to be restricted to mainstream browsers. Even on Firefox GitHub sometimes freezes due to a script loop. But if you want to contribute to a GitHub project, you are forced to use a web browser for some steps.

Getting a local snapshot of a repository is the first challenge. When you select the HTTP switch, GitHub shows a URL for the repository, but it does not tell you the command to fetch it (git clone). If you click on the SSH switch while not logged in, you get a generic 404 HTTP error.

How do you contribute to a GitHub project? It's not as easy as darcs send. You have to clone the repository on GitHub into your own GitHub account, clicking through the web interface. Did I mention that you have to register at GitHub first? And you have to have your GitHub password at hand. Then you clone a local working copy from your repository clone on GitHub. Then you make your modifications locally, commit them and push them to your repository clone on GitHub. Then you turn back to the web browser and make a pull request for your commit.

Say your commit is rejected by the project leader for some reason. You have to improve it a bit before it can be accepted. But how do you get rid of the rejected commit on GitHub? Erm, uh, that's not possible. You should have created a branch before experimenting with your modifications. How do you create a branch on GitHub? How do you delete a branch once it becomes obsolete? Do I need as many branches as commits I want to contribute to the project?

I want darcs send back!

Merging

Merging is something that I used to avoid in Darcs-1, since merges make Darcs so slow that I thought more than once that Darcs had hung completely. Sometimes I could speed up pulling considerably by pulling patches out of order. That is, I skipped some patches and pulled later ones first, as far as patch dependencies allowed. But no question, this is not professional. Additionally, there must have been bugs that made repositories inconsistent. Very bad. Darcs-2 promises to be faster on merges using the new hashed format, but it is reported that Darcs-2 is faster in some cases and slower in others. I don't know the details. Thus I stayed with Darcs-1, since I didn't want to invest much effort into another temporary solution.

Git really has no problems with merges. It works. Period.