Ian Bicking has written two thoughtful articles on distributed and centralised revision control systems (the first, the second).
His original thesis was one that he shared with Greg Hudson, one of the Subversion developers: for free software, distributed SCM bad, centralised good. He had four main points, which I have reduced to three:
1. Work ought to be visible to other developers, and easy to find.
2. It should be easy to understand the changes made.
3. It ought to be simple to integrate changes.
Even though the first point seems like the free software equivalent of mom and apple pie, I think a more nuanced approach is appropriate.
Much of the free software work I do is in the context of my day job, where I hack on a variety of different packages. For some of these projects, there’s clearly no commercial value to keeping our local modifications private, and in fact it can be to our advantage to share changes as openly and frequently as possible. However, in other cases, it makes business sense to hold off on contributing changes to the community until “the time is right”.
Even with projects I hack on in my own time, I don’t necessarily want to share everything I do with others. My changes may be incomplete, experimental, or going down an embarrassingly wrong path. By avoiding the “everything should be shared” approach, I can choose the manner and time of the presentation of my work.
In each of these cases, it is to my benefit to use distributed SCM tools. I get the benefits of change and merge management, with control over how and when I publish my work.
Where distributed SCM tools currently fall down is in the visibility of work that people do want to share. If I use Subversion, there’s only one place I need to look for public branches of any given project. But if I use a distributed tool, I must rely on a hodgepodge of mailing lists, blog postings, and other unreliable information sources.
Ian pointed out that he did not know how well distributed tools served the purposes of understanding changes and integrating them.
The means for understanding changes varies by tool. For example, with Mercurial, there is not yet a way to say “show me the differences between these two repositories”, but we intend to add this feature.
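To make "the differences between these two repositories" a little more concrete, here is a toy sketch in Python. It assumes a repository can be reduced to a bare set of changeset identifiers, which ignores history ordering and everything else a real tool tracks. The function name `compare_repos` and the identifiers are invented for illustration; this is not how Mercurial stores or compares anything.

```python
def compare_repos(mine, theirs):
    """Return (only_in_mine, only_in_theirs) for two sets of changeset IDs.

    Toy model only: a real tool would walk the history graph rather than
    compare flat sets.
    """
    mine, theirs = set(mine), set(theirs)
    return mine - theirs, theirs - mine


# Hypothetical changeset identifiers.
my_repo = {"a1", "b2", "c3"}
upstream = {"a1", "b2", "d4"}

only_mine, only_upstream = compare_repos(my_repo, upstream)
print(sorted(only_mine))      # changesets I hold that upstream lacks
print(sorted(only_upstream))  # changesets upstream holds that I lack
```

Even in this crude form, the two halves of the answer map onto the two questions a developer usually asks: what have I not yet published, and what have I not yet pulled?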
Integrating changes is a more uniform operation. All distributed tools have a way of saying “bring changes from this branch into this other branch”. Apart from the mechanics of how they handle merges and conflicts, they function in the same way from an end user’s perspective.
In Ian’s second article, he contends that distributed tools “are avoiding the technical issues of sharing”. I don’t understand what he was trying to say here. Making a distributed SCM behave like CVS or Subversion is just a special case. You set up a server, create a few branches to represent the main line of development and other “first class” branches, then developers pull and push changes between their local repositories and the anointed ones as they need to.
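The "special case" claim can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the `Repo` class, its method names, and the changeset identifiers are hypothetical, and a repository is reduced to a set of changeset IDs. No real tool works this simply, but the shape of the workflow is the same.

```python
class Repo:
    """Toy repository: a named set of changeset IDs (hypothetical model)."""

    def __init__(self, name, changesets=()):
        self.name = name
        self.changesets = set(changesets)

    def commit(self, changeset_id):
        """Record a new changeset locally; nobody else sees it yet."""
        self.changesets.add(changeset_id)

    def pull(self, other):
        """Copy every changeset from `other` into this repository."""
        self.changesets |= other.changesets

    def push(self, other):
        """Copy every changeset from this repository into `other`."""
        other.pull(self)


# One anointed repository on the server, plus two developers' local ones.
central = Repo("central", {"a1"})
alice = Repo("alice")
bob = Repo("bob")

alice.pull(central)
bob.pull(central)

alice.commit("b2")   # local only at this point
alice.push(central)  # now published, CVS-style
bob.pull(central)    # bob picks up alice's work from the anointed repo

bob.commit("x9")     # bob's experiment, which he chooses not to push
```

In this sketch the centralised workflow falls out of pushing only to `central`, while bob's unpublished `x9` shows the extra freedom: he keeps change management for work he is not ready to share.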
Ian also says that “the way these systems rely on email, rsync, ssh, etc., is simply avoidance”, another pronouncement that I find peculiar. CVS and Subversion rely on ssh for commit access, and email to notify developers of changes. When I use distributed tools in CVS-like fashion, I rely on ssh for commit access, and email to notify developers of changes. Where’s the difference?
Ah. In a subsequent comment (don’t you detest the fact that blog software treats comments as shabby afterthoughts?), Ian clarifies that he is using Windows, and his experience using ssh there is not pleasant. While Subversion allows integration with Apache to provide commit access, I do not regard this as an improvement; it moves the pain to a system that is considerably more difficult than ssh to configure and secure correctly.