Optimizing Globally Distributed Enterprise Source Management

June 19, 2009 Jack Repenning

In the echo chamber of developer tools, a hot topic these days is "distributed vs. central version control." The debate has an unfortunate tendency to polarize into "developer vs. organization," which of course doesn't really help anyone at all. WANdisco's Jim Campigli, writing as "SubversionMan" (jeez, Jim: show a little respect for the nearly one thousand committers and other folks who've actually contributed to Subversion!), tackles a more productive, if narrower, question: how can we preserve the organizational benefits of central systems while mitigating the obstacles for developers?

In his latest screed, Jim identifies three key trade-offs in version control for global organizations: WAN costs, server performance, and server availability. That's a pretty good list; I'd only add one more: support for enterprise-scale workflows. Having agreed on the list, though, I'm not sure we agree on the outcomes (but, hey, that's what blogs are for, right?):

WAN costs (both latency and bandwidth) are a complex problem, with many handles. In the enterprise world (unlike the open source world), it's often possible to work with your network provider to improve network performance. Indeed, global enterprises often already control their networks explicitly, and performance problems are very often only the result of configured "Quality of Service" policies colliding with evolving use patterns, which can be reconfigured once the right measurements meet with the right people. 

Server performance is another of those things that can be managed in an enterprise context, and particularly in a good hosted deployment like CollabNet's. The Subversion server code is actually remarkably efficient. Leaving aside pathological cases like runaway spiders, a single Subversion server host can easily support tens of thousands of active enterprise developers, or hundreds of thousands of open-source developers (who, as an over-all average, are less active). As for detecting and preventing the pathologies, that's what quality hosting is all about.

Server availability is much the same: good management can provide very high availability, fail-over, and disaster recovery at reasonable cost and switchover times of only a handful of minutes, with complete transparency to any user not actually performing some operation at the moment of failure. A fail-over strategy that requires end-user reconfiguration when your connected server fails gives the end user some sense of control, but ends up costing each user the switch-over time, regardless of their level of activity at the moment of failure.

Enterprise workflows are one more consideration. A global enterprise will involve many individual repositories, which typically need to be accessed by multiple teams in various groupings. In any replication strategy, the administration of these groupings becomes key. The simple extremes, where either every site has all repositories, or every repository is unique to only one site, are pretty rare. Whether due to co-development, collaboration, shifting team composition, or 24×7 support commitments, most global enterprises need a replication map that's both complex and constantly evolving. Simple central hosting completely eliminates these challenges, of course.
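With a single central host, those shifting groupings reduce to ordinary access control rather than a replication map. As a minimal sketch, here's roughly how such groupings might look in Subversion's path-based authorization file (all repository, group, and user names below are hypothetical, invented for illustration):

```ini
# Hypothetical authz fragment for one central server hosting many
# repositories. Team groupings live in one file, edited in one place,
# instead of in a per-site replication configuration.
[groups]
ui-team = alice, bob
platform-team = carol, dave
support-apac = erin            # follow-the-sun support rotation

[webapp:/]
@ui-team = rw
@support-apac = r              # read access for the support shift

[platform:/]
@platform-team = rw
@ui-team = r                   # cross-team collaboration, read-only
```

When the team composition shifts, the administrator edits one group definition, and the change takes effect everywhere at once; no site-by-site reconciliation is needed.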
