Jun 15
2008

Distributed Software Development

Posted by pablo santos in Untagged 

Distributed Software Development is all about enabling teams and individuals to work seamlessly over the Internet as if they were sitting close to each other, even when they’re worlds apart.

Software Configuration Management (SCM) systems play a key role as the cornerstones of asset distribution and sharing among the team members.

A simple picture of the software development tool stack shows the SCM sitting very deep in the chain. It is the cornerstone around which the whole development process is built. The SCM distributes the code, diagrams, documents, design material, help files: everything that makes up a development project. Tools like compilers, build systems, debuggers, IDEs, profilers and so on then take these assets to help developers create the software.

Software Development Stack

Any change made with any of these tools is controlled by the SCM, which makes it available on demand, subject to certain constraints, to the rest of the team.

And the team, even when everyone sits in the same room, is on a network. Documents, sources, images, HTML files and design diagrams all have to travel over the network from one computer to another.

All changes are tracked by the SCM server, which resides on a separate machine on the network. It is the centerpiece that keeps the development team flowing.

And then the Internet enters the scene:

  • Different teams can be at very distant locations, no longer in the same room or even the same building, but they still need to work on the same code base, sharing changes and evolving the software.
  • Developers can be located at the client’s site, making specific changes in response to detailed feedback given on-site.

The SCM server then has to keep providing the same range of services to the developers, but this time they’re not on the same network, not even in the same network domain.

The first option is simple: if the team relies on network facilities, they can keep working over a virtual private network (VPN) even when they’re worlds apart.

Unfortunately, the story is not that simple.

  • Network connections can be unreliable or slow, making the work of distant team members slow, error-prone and unproductive.
  • Direct connection to the SCM server can be discouraged because of security restrictions, or can simply be unavailable.

Of course developers can still work on their local copies of the code, but then we can’t pretend they enjoy the same facilities they had when they sat in the office. They don’t even have the basics!

DSD

Beyond methodologies, best practices or the programming language of choice, there’s something that really makes a difference: you need a tool to collaborate. You modify some code at your office in San Diego, and a colleague in Bangalore needs to pick it up while you go to sleep. Of course you would like your teammate to work on the right version of the code...

DSD is all about versioning. There are other related issues, of course: from challenges in project management to dealing with different time zones and a variety of cultures all around the world. But, primarily, you have to create an environment where people can work almost as if they were all sitting together in the same room, even when they’re worlds apart, at least code-wise.

DSD approaches

So far the industry has provided a range of solutions to the distributed software development problem, from VPN based setups to fully distributed SCM.

DSD support

  • Centralized SCM: this is the conventional approach. A single server hosts all the changes and coordinates all the developers’ efforts. When multiple separate locations enter the scene, the system relies on the network to keep providing the service to the users. VPNs are a common choice in this scenario. Another option is an Internet-facing server, where the client connects directly to the server over the Internet without a VPN. The centralized approach inherits all the drawbacks of the underlying network infrastructure: if the network goes down, developers feel the impact immediately. SVN, Source Safe and CVS are clear examples of this alternative.
  • Proxy based multi-site: the central server is helped by a set of proxy servers which act as data caches. When the network goes down, clients can still access copies of the data through their local caches. The benefit is an enhanced ability to work in disconnected scenarios. The downside is that proxies aren’t full servers, so they normally support only read access and can’t perform write operations themselves. Proxies that implement a delayed operation mode can also cache write operations (like check-ins), but concurrent changes still aren’t allowed and change reconciliation is not supported. It is a clear step forward in coping with network problems, but it still doesn’t give developers a way to work seamlessly when the connection is down. Proxy servers are designed to be deployed per site, not per developer, so roaming users working on laptops aren’t supported. Systems like Perforce, Team Foundation Server or Accurev are examples of this proxy based approach.
  • Mastership based multi-site: a server is installed at each remote location and changes are replicated back and forth with some restrictions. The replication unit is usually the branch, and the key restriction is that only one site at a time is the owner of a given branch or set of branches. Developers are free to make changes at their site provided the site is in control of the branch they’re using. This avoids most concurrent change conflicts, because a file or directory on a given branch can’t be modified at different sites at the same time. It is an advantage over proxy based replication because it creates the illusion of unlimited changes at distant sites, provided that a set of mastership (or ownership) rules is enforced. Teams at separate locations can work together on the same code base and set up their servers to replicate regularly, which ensures every site has the right sources. Network problems are no longer an issue, as teams can continue working even when the connection is down. The downside is that the replicated servers are conceived to be deployed one per site. Developers can’t each run their own server because the servers are heavy in terms of resources, so roaming developers are still not supported. Clearcase Multi-site is a clear example of this multi-site approach.
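
The mastership rule described above can be pictured with a small sketch. This is only a toy model written for illustration, not the way ClearCase or any real product implements it: the class, method and site names (MultiSiteRepo, check_in, transfer_mastership, san_diego, bangalore) are all hypothetical, and replication itself is left out. It only shows that a check-in is accepted when the site owns the branch, and that ownership has to be handed over explicitly.

```python
class MastershipError(Exception):
    """Raised when a site tries to change a branch it does not own."""


class MultiSiteRepo:
    """Toy model of mastership based multi-site (all names are hypothetical)."""

    def __init__(self, sites):
        self.sites = set(sites)
        self.master = {}   # branch name -> site that currently owns it
        self.history = {}  # branch name -> list of (site, change) entries

    def create_branch(self, branch, owner):
        if owner not in self.sites:
            raise ValueError(f"unknown site: {owner}")
        self.master[branch] = owner
        self.history[branch] = []

    def check_in(self, site, branch, change):
        # Only the site holding mastership may write to the branch.
        owner = self.master[branch]
        if owner != site:
            raise MastershipError(
                f"{site} does not master {branch}; owner is {owner}")
        self.history[branch].append((site, change))

    def transfer_mastership(self, branch, current_owner, new_owner):
        # Ownership is handed over explicitly by the current owner.
        if self.master[branch] != current_owner:
            raise MastershipError(f"{current_owner} does not master {branch}")
        if new_owner not in self.sites:
            raise ValueError(f"unknown site: {new_owner}")
        self.master[branch] = new_owner


repo = MultiSiteRepo(["san_diego", "bangalore"])
repo.create_branch("main", owner="san_diego")
repo.check_in("san_diego", "main", "fix parser")

rejected = None
try:
    # Bangalore does not own "main" yet, so this check-in is refused.
    repo.check_in("bangalore", "main", "add tests")
except MastershipError as err:
    rejected = str(err)

# After the hand-over, the same check-in goes through.
repo.transfer_mastership("main", "san_diego", "bangalore")
repo.check_in("bangalore", "main", "add tests")
```

Because only one site can write to a branch at any moment, the branch history stays linear and no merge is ever needed between sites, which is exactly the simplification this approach buys.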

Multi-site development

 

  • Unconstrained multi-site, also known as distributed: changes can be made in parallel, even modifying the same element at the same time at different sites. The SCM server has to deal with change reconciliation, including merging. Mastership based multi-site can be thought of as a simplified case of this scenario, so it is fully supported. The main advantages are these: as in mastership multi-site, the network is no longer a constraint; the servers are light enough to support roaming developers, because they can be deployed not only on powerful servers but also on laptops; and full parallel development, including unconstrained branching and merging, is supported at all sites concurrently, even when no direct connection between them is available. Developers can make modifications on their site’s repository replica, or even on a laptop replica, and push changes to the central server or to another peer server when required. BitKeeper and Plastic SCM are examples of distributed SCMs.
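
To give a feeling for what change reconciliation means here, the sketch below does a three-way merge at whole-file granularity. It is a minimal illustration under simplifying assumptions and not the algorithm used by BitKeeper, Plastic SCM or any other tool: a replica is modelled as a plain dictionary from path to content, the function name three_way_merge and the sample file names are hypothetical, and real systems also merge inside files and track renames and directories.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots (dict: path -> content) that diverged from base.

    Returns (merged, conflicts). A path is reported as a conflict when both
    sides changed it in different ways. A missing path means "no such file".
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (untouched, or the same change)
            result = o
        elif o == b:      # only the other replica changed it
            result = t
        elif t == b:      # only our replica changed it
            result = o
        else:             # both changed it differently: needs a real merge
            conflicts.append(path)
            result = o    # keep our version provisionally
        if result is not None:
            merged[path] = result
    return merged, conflicts


# Common ancestor, then independent work on a laptop and on the central server.
base = {"main.c": "v1", "util.c": "v1", "README": "v1"}
laptop = {"main.c": "v2-laptop", "util.c": "v1", "README": "v1-laptop"}
central = {"main.c": "v1", "util.c": "v2-central", "README": "v1-central"}

merged, conflicts = three_way_merge(base, laptop, central)
```

In this example main.c and util.c reconcile automatically because each was changed on one side only, while README was changed on both sides and is flagged for a manual or content-level merge.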

    Full DSD

    www.plasticscm.com


