Mercurial Woes

Over the past few days my friends Ben Collins-Sussman and Jim Blandy and I have been having an interesting conversation about the use of Mercurial for development collaboration. Eventually one of my email responses got so long-winded that I figured it’d be best to make the conversation public.

So, here’s my take on Mercurial, and some reasons why an HG birds-of-a-feather session at the Mozilla Summit coming up next week would be very useful for me.

To begin with, from a purely social standpoint, the concept of distributed revision control is amazing to me because of how it removes many technological barriers to collaboration, providing software projects with an enormous amount of freedom on how their development process is structured. For any readers who aren’t familiar with it, check out Chapter 1 of the HG Book.

But with all this additional freedom comes additional responsibilities. To quote the MDC Mercurial basics: this gun is loaded.

I’m used to working on small-to-medium sized projects with relatively small teams. Subversion was great for this, because when we were working on things simultaneously, we rarely ran into situations where we were editing the same file, much less the same part of the same file. There were rarely reasons to need to create SVN branches—though we all knew how to do it, and did it when necessary. But as a result, merges were very rare, and when we had to merge, we were extremely careful and diligent about it.

I’m still working on relatively small-to-medium-sized projects (e.g. Weave and Ubiquity) and the forced merging that HG makes us do almost every time we push is a world of pain, relatively speaking. With SVN, I’d just svn commit and see if SVN rejected our commit because someone else committed a change to the same file while we were editing it—this happened rarely, and when it did, we were careful about ensuring that our changes gelled. In this sense, SVN was really humane; 90% of the time things “just worked”, and when things didn’t just work, it was for very good reasons.

But HG almost never “just works”. If I edit a.py and my friend edits b.py and pushes it before I’ve pushed my changes, I have to make a merge commit and manually ensure that nothing bad happened. The end result of this is a huge burden on each programmer compared to SVN, as they have to do a separate merge commit for nearly every push they make, which essentially encourages people to either (A) not push often or (B) ignore their merge commits (a practice which is encouraged by the use of hg fetch). The disadvantages of the first approach are nicely explained by Ben’s post on Programmer Insecurity; the latter approach is bad for obvious reasons.

This is basically the axis around which all my woes with HG revolve. With SVN, it’s really easy to see how code has changed, but because of the constant merging of tiny branches in HG, the whole code history becomes obfuscated and it’s hard to tell what’s happened to it. In fact, several weeks ago a friend somehow mis-merged his commits to Weave, which undid a major refactoring I did, and the really scary thing is that it was somehow impossible for me to tell this had happened from looking at the diff logs alone. I looked at them for a good half-hour or so and was still scratching my head. Needless to say, my inability to understand what had happened to the code by looking at the logs drastically reduced my faith in the tool.

While everyone I know understands the basics of HG and the philosophy behind distributed VCS, it’s the particulars of actually “working in the wild” that many are finding very confusing. So a HG BoF at the Mozilla Summit would be extremely useful.

27 Replies to “Mercurial Woes”

  1. The fundamental problem here is that we’re trying to use a distributed VCS in a way that it was not meant for. Using a central repository to which everyone pushes is not a model well suited for HG.

    To use HG effectively, we have to change our development model to a more Linux-like manner.

  2. learn about hg mq. I never do merges as I find patches easier to work with. I do all of my development in an mq patch and then do hg up, hg pull, hg push and adjust the patch manually before turning the patch into a commit.

  3. @Anant:

    Hmm, based on what I read in chapter 1 of the HG book (linked in the original post), the goal of distributed VCS systems like HG is to give you more freedom in your development process—but it’s not explicitly created for any particular system. Based purely on the explanation in that chapter, I’d expect a standard centralized development model—the kind that CVS and SVN are made for—to be just as easy to use in HG as they are in CVS and SVN. If that’s not the case, though, then the author of the HG book should probably consider rewriting that chapter.

    @Taras:

    Thanks for the advice—I started looking into MQ a few days ago for something else, but it does seem powerful. I’ll look into using this.

  4. try Git. Linus and Junio and many others have figured out how to make this work. You can use Tailor to go-between hg and Git.

  5. Guess it’s time to learn about MQ — I remember someone recommending looking into it but somehow I never got around to it. I’ve managed to find various workflows that minimize the intermediate-merge-steps necessary, but yeah, it seems like an awkwardness that shouldn’t be necessary.

    While we’re complaining about HG, can I add:

    — would sure be nice if hg diff could ignore line-ending differences; supposedly there are some ways to force line-ending normalization but AFAIK they involve external utilities. I’d like to see this functionality built in to HG.

    — a GUI for examining diffs (that actually works on Mac + Windows + Linux) would be super-super useful. There was a pretty good one (checkintool) formerly available at http://software.jessies.org/scm , but it’s not there now (the developers signed a BitKeeper license and pulled it rather than risk legal wrath, apparently).

  6. I strongly recommend MQ:

    – No merge anymore, rebase yourself before pushing anything. Make the world clean.
    – Qrefresh often, and view diff from the web interface.
    – If the local repo gets into an unstable state, just re-clone it. Keep your patches safe.

  7. When I was at Flock we used HG for a while. It was a HUGE pain – resulting in us moving to SVN (with several devs using git-svn so you get some of the advantages of git).

    I’m still surprised that Mozilla chose HG without talking to Flock about issues they ran into. Best of luck – I hope this doesn’t hurt Mozilla as much as it hurt Flock – although it seems like it has since this isn’t the only post I’ve read about HG + MOZ = pain.

  8. > To use HG effectively, we have to change our development model
    > to a more Linux-like manner.

    I’ve heard the pull model suggested as a solution, but I don’t understand how it changes the situation with merge commits. The pulled changesets are still based on some old changeset, forcing the puller to do the exact same merge commit that the pullee would have done. Isn’t the burden of merge tracking just the price we have to pay for DVCS, with rebasing changesets as the only workaround?

  9. I would imagine more training and BoF’ing will prove *helpful*, but I’m skeptical that it will really address the core problem.

    What I’ve found interesting is the amount of mumbling I’ve heard from a bunch of corners of Mozilla-land about Mercurial, but of course, the horse is out of the stable, so it seems no one is willing to suggest that maybe a mistake was made. (A post-mortem on the decisionmaking process there, and the rollout would surely prove interesting, but I’m not holding my breath.)

    I’m with Anant in that Mozilla is using a distributed version control system in a very centralized way, not because it can’t be done any other way, but because Mozilla doesn’t employ a “lieutenant”-type development process, like the Linux kernel. So at the end of the day, you have a tool that supports distributed version control, and everyone’s pushing to a centralized repo, where decisions are made centrally (aka “bug triage”), so as it is now, it’s a lot of extra tool work for no readily visible gain. I wonder if anyone has proposed changing this core process as a way to increase the usefulness of Hg?

    I also think the roll-out was premature, substituting handwaving for concrete implementation plans, leaving lots of parts—l10n, non-Firefox-and-Xulrunner projects, code browsing support tools, etc.—unthought of until… well… now, when people actually started using it.

    That’s not a… great way to switch a core tool any group of developers use.

  10. As a Mercurial developer, I can understand where you’re coming from. Unfortunately, I am unable to attend the Mozilla Summit, otherwise I could have been there and hopefully point out things that would work better. DVCS is a hard problem, and the way we solve it works, but at a cost. The pushlog ted and bsmedberg have been working on helps sometimes in getting a clean overview of commit history. Otherwise, MQ helps a lot with preventing merge commits (I use it a lot, myself), but we also have a SoC project going for a rebase command, which is looking good and should make things much, much easier in this department.

  11. Another good cross-platform merge tool is kdiff3 ( http://kdiff3.sourceforge.net/)

    My experience is that a good 3-way merge tool *with* automatic merging support is the basis of having good experience with hg! Mercurial packages that I’ve seen do not have this configured by default 🙁

    Without a good merge tool I’m forced to do an interactive merge even if the changes do not conflict. The worst thing is that many tools have only 2-way merge support. They show a view of changes that is very misleading (leading to merge commits that accidentally remove others’ commits). A 2-way tool shows my changes compared to the changes that others have made, while a 3-way tool shows my changes compared to the base version *and* the changes that others have made, also compared to the base version. For me this makes all the difference.
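The difference between a 2-way and a 3-way view can be seen without Mercurial at all. The sketch below uses the standard `diff` and `diff3` utilities on three made-up files (base.txt, mine.txt, theirs.txt); it illustrates the idea and is not what hg or kdiff3 run internally.

```shell
#!/bin/sh
# 2-way comparison versus 3-way merge on three small text files.
set -e
work=$(mktemp -d)
cd "$work"

# The common ancestor.
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > base.txt
# "mine" changes the first line; "theirs" changes the last line.
printf 'ALPHA\nbeta\ngamma\ndelta\nepsilon\n' > mine.txt
printf 'alpha\nbeta\ngamma\ndelta\nEPSILON\n' > theirs.txt

# 2-way view: both lines show up as differences, with no hint of who
# changed what. Picking "my side" for everything would undo their edit.
diff mine.txt theirs.txt || true

# 3-way merge: with the base available, each change can be attributed
# to one side, and both edits are kept automatically.
diff3 -m mine.txt base.txt theirs.txt > merged.txt
cat merged.txt
```

With only mine.txt and theirs.txt to look at, a person or tool has to guess which side of each difference is the newer one, which is how other people's changes get dropped by accident.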

  12. Use “hg fetch” to combine pulling and merging into a single command that runs automatically. It’s that simple. The command is even bundled as an extension with every release of Mercurial.

  13. > If I edit a.py and my friend edits b.py and pushes it before
    > I’ve pushed my changes, I have to make a merge commit and
    > manually ensure that nothing bad happened.

    This is a big plus for HG. In SVN you can change a file and check in without bothering to check whether you break anything or not.

    > a GUI for examining diffs

    kdiff3 is working really well:
    [extensions]
    hgext.extdiff =
    [extdiff]
    cmd.vdiff = d:/home/bin/kdiff3.exe

    I think you have workflow problems.
    A few things:
    – keep the central repo linear, you do not need local commits, the whole change is enough in one commit (you can do your review in the developers repo)
    – merge tracking is good, but it is mainly to keep your repo in sync with another one or to do code reviews
    – rebasing is a tool to throw away irrelevant developer history and to change the order of patches

    You can either learn to use MQ (not my favorite, but with a few scripts it does the job) or master named branches (a lot of manual merging). MQ is powerful, but it is really easy to get lost in it if you are not really interested in the internals of HG (but if you understand what this does: http://www.selenic.com/mercurial/wiki/index.cgi/MqMerge you will love it).

    I think git ~= hg+mq. It does the same repo editing natively, but I like HG more (simple python code, you can track down bugs easily, easy to install, works well on windows, nice cli interface, very good plugin system, …).

  14. > I have to make a merge commit and manually ensure that nothing bad happened.

    How do you need to manually ensure that? In most cases ‘hg merge’ is the same as ‘svn update’. One thing that is different is that it is required not only when same file changed, but when any file is changed. I don’t see any trouble with that, because this is just natural – changes in other file can easily affect changes in your file, so it is nice to have possibility to check.

    But you don’t need that, and IMHO saying that merging is painful is not very fair.

    Use ‘hg fetch’, if you don’t like typing ‘hg merge’.

    > Compared to MQ, rebase has the advantage that you don’t have to remember to “commit with MQ”, but can just commit as usual.

    With MQ you don’t need to remember to “commit with MQ”, as you can easily import changesets later.

  15. svn’s default mode of operation is rebase in DVCS terms and that’s how you should work in a centralized-repo-based environment. I use git personally and it comes with rebase by default; good to hear that hg is going to have one, too. for now, I’d suggest using queues.

  16. re: hg fetch, sounds useful, but I didn’t know it existed. (it’s interesting that both fetch and MQ are suggested here, but both are apparently extensions that aren’t configured “on” by default… er, if they are as useful as they sound, why wouldn’t they be enabled by default?)

    Re: kdiff3, I’ve found kdiff3 to be a bit flaky on OS X; for 3-way merges I’ve actually started using p4merge (part of the Perforce toolset, ironically enough, as Adobe uses Perforce internally for most projects), which I find to have a slicker GUI for this.

    re: DiffMerge, I hadn’t heard of it, I’ll check it out.

    To the person who suggested Bazaar: IIRC it was strongly considered, but its performance was found to be unacceptably slow on the giant source code base that is Mozilla.

    To the person who suggested git: IIRC, it was rejected because one requirement was something that was a first-class citizen on Windows as well as Unix of various flavors, and git was reputed to be fully happy only on Linux (though perhaps this is no longer the case, and/or I misremember…)

  17. > Re: kdiff3, I’ve found kdiff3 to be a bit flaky on OS X

    kdiff3 is a tool which looks ugly, but takes 5 minutes to learn, after which you realize that it does everything that is needed to get the job done: quick keyboard navigation/operation, good word wrap (which is rare), you can change split orientation if you need more information, false diff regions can be adjusted manually

    > re: hg fetch, sounds useful, but I didn’t know it existed. (it’s
    > interesting that both fetch and MQ are suggested here, but
    > both are apparently extensions that aren’t configured “on” by
    > default… er, if they are as useful as they sound, why wouldn’t
    > they be enabled by default?)

    Hm, I wouldn’t like to be too harsh, but this means that you are using a DVCS and you didn’t even bother to read the documentation (fetch is introduced early in the hgbook and MQ has 2 chapters).

    MQ is too advanced to be enabled by default, and using “fetch” isn’t a good practice.

  18. > To the person who suggested Bazaar:

    Strange. My name is right there on the comment that you’re referring to.

    > IIRC it was strongly considered, but its performance was found to be unacceptably slow on the giant source code base that is Mozilla.

    Yes, I remember that decision. In the meantime, and in part because of that feedback, much progress has been made on Bazaar’s performance. That may be too late for the Mozilla project, but it’s not too late for many others.

    For anyone considering a distributed VCS but wanting to avoid the problems in this weblog entry, consider Bazaar on its current features and its current performance against your own code base, which is much better now than when the Mozilla team evaluated it.

  19. > My name is right there on the comment that you’re referring to.

    Ben: sorry bout that, was replying in a hurry, wasn’t trying to be rude.

    > you didn’t even bother to read the documentation

    teki: actually I did read the documentation. I missed (or forgot) about “fetch”, but I was aware of MQ — it didn’t look essential when I first started using HG, then I found other workarounds for the merging issues.

  20. This makes me quite cross. RTFM. hg merge is not doing anything more or less than svn update is doing. How is it different??? It doesn’t need to happen before a commit. That’s the only extra bit of flexibility you have. And it’s really good flexibility.

    I don’t understand why you were not able to track what happened with the diffs. I run a big project out of hg with 10 devs. I never have a problem. People have done what you experienced to me (lack of understanding is widespread with VC, especially with DVCS) but I could track it easily.

    Personally I just use the hgweb tool to track what’s going on.

  21. I have also unintentionally reverted other people’s commits in hg. How that happened I never figured out. I’ve also missed having something as easy as “svn up”. Sure, hg merge serves the same purpose (well, kind of, along with hg pull?), but it’s *not* as simple. (Maybe hg fetch is a better equivalent, I’ll have to try it.)

  22. @ian
    I think the top reasons for accidentally overwriting other people’s work are either a bad merge tool that does not do trivial automatic merging *or* quitting the merge application without saving the result of the merge operation while still committing the (unmerged, i.e. not updated) code.

  23. “svn up” is a “rebase”, you are rebasing your local changes to the newer version of the code.

    ———————

    svn up with named branches:
    # hg branch mybranch
    # hg commit -m 'save local changes'
    # hg pull
    # hg merge mainbranch
    # hg commit -m 'merged with main'

    You can always do a diff or status against the main branch:
    # hg diff -r mainbranch
    # hg status --rev mainbranch

    ———————

    svn up in hg with mq:
    – save local changes
    # hg qrefresh
    # hg qpop -a
    – pull changes
    # hg pull
    – update working copy
    # hg update
    – rebase
    # hg qpush -a (or http://www.selenic.com/mercurial/wiki/index.cgi/MqMerge if you don’t like to fix broken patches, this is a 3 way merge [GUI can be used], a bit better than just to push)

    The last step is automated in Git and will be easier in the future in HG too.

    What else can you do:
    – after qrefresh you can use qcommit and save your intermittent patch (and have multiple versions)
    – you can have multiple patches and you can work on them independently

    —————

    Possible workflows:
    – use the hg merge method and rebase changes when finished (if you really miss svn up, write a script: check if there are any updates with “hg incoming”; if you have outstanding changes, “hg commit”; hg pull; hg merge mainbranch)
    – learn and use mq (I am not sure that it is worth the trouble). I am using it to track changes on top of a CVS managed codebase. For that it is really handy.
