I've been given the task of restructuring/recreating an existing code repository, using either Git or Subversion. The repository history is not necessarily important in this case. After analyzing the situation I've run into problems settling on a good layout. I've read through a lot of blogs and threads, but I'm still unsure what the best layout is.
The existing repository contains a set of include files, a set of libraries that are partially dependent on each other, and many of them depend on the set of include files. Additionally, there are two application projects that depend on the set of libraries. Furthermore, there is a set of scripts that makes use of one of the applications and some additional configuration information. I've drawn a graph to clarify the situation:
    +---------->include files
    |                ^
    |                |
library A -----> library B <----- library C <----- library D
  ^  ^                                |                ^
  |  |                                |                |
  |  +--------------------------------+                |
  |                                                    |
application 1        application 2 --------------------+
  ^
  |
script -----> configuration information
The goal is to have a layout where each component can be developed as independently as possible, and to have a release (for external customers) that contains a set of all components at defined tag versions, so that it is possible to go back in time and build the software for a specific release.
I've come up with the following structure:
trunk/
  include/
  lib/
    library A/
    library B/
    library C/
    library D/
  app/
    application 1/
    application 2/
tags/
  1.0/
    include/
    lib/
      library A/
      library B/
      library C/
      library D/
    app/
      application 1/
      application 2/
  1.1/
    include/
    lib/
      library A/
      library B/
      library C/
      library D/
    app/
      application 1/
      application 2/
  ...
Each time I create a new release I would simply copy the whole repository to a new subdirectory in tags.
The problem with this solution is that the libraries have no tag directories of their own. I only want a release to consist of tagged components, and this layout does not show which component has which tag version in a release. I have considered using separate repositories plus a master repository with a releases subdirectory, where I would link in all necessary components via `svn:externals`, each pointing at a specific tags subdirectory. But the libraries and include files depend on each other, and I don't see how to partition the code into separate entities.
Any ideas?
=============== question continued on 28-1-2011 ===============
Ok, I've drawn a graph of how I plan the new layout. The goal is to link the tags of the various dependencies with the svn:externals method within one repository. For example, I'd set svn:externals in trunk/projects/lib/library2/dependencies to ^/tags/projects/include/std/1.3.
trunk/
  projects/
    include/
      std/
    lib/
      library1/
        dependencies/
          std/ --> tags/projects/include/std/1.2
      library2/
        dependencies/
          std/ --> tags/projects/include/std/1.2
          library1/ --> tags/projects/lib/library1/1.4.3
      library3/
        dependencies/
          std/ --> tags/projects/include/std/1.3
          library1/ --> tags/projects/lib/library1/1.4
    app/
      application1/
        dependencies/
          library3/ --> tags/projects/lib/library3/1.1
      application2/
        dependencies/
          library1/ --> tags/projects/lib/library1/2.1
      application3/
        dependencies/
          std/ --> tags/projects/include/std/1.2
          library2/ --> tags/projects/lib/library2/1.5.2
    config/
      configuration1/
        dependencies/
          application1/ --> tags/projects/app/application1/2.3
      configuration2/
        dependencies/
          application1/ --> tags/projects/app/application1/1.6
      configuration2/
        dependencies/
          application2/ --> tags/projects/app/application1/1.6
tags/
  projects/
    include/
      std/
        1.2/
        1.3/
    lib/
      library1/
        1.4.3/
        1.4/
        2.1/
      library2/
        1.5.2/
      library3/
        1.1/
    app/
      application1/
        1.6/
        2.3/
branches/
  ...
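For illustration, the arrows in the tree above could be written as an svn:externals property value. The following is only a minimal Python sketch: `externals_value` is a hypothetical helper, and it assumes the `URL LOCALPATH` form with repository-root-relative `^/` URLs (Subversion 1.5 or later), using library2's dependencies as drawn.

```python
def externals_value(deps):
    """Build svn:externals text from {local directory: root-relative tag path}.

    Hypothetical helper; one "URL LOCALPATH" line per dependency.
    """
    return "\n".join(
        f"^/{tag_path} {local_dir}"
        for local_dir, tag_path in sorted(deps.items())
    )

# library2's dependencies, copied from the tree above.
library2_deps = {
    "std": "tags/projects/include/std/1.2",
    "library1": "tags/projects/lib/library1/1.4.3",
}

print(externals_value(library2_deps))
# ^/tags/projects/lib/library1/1.4.3 library1
# ^/tags/projects/include/std/1.2 std
```

The resulting text could be saved to a file and applied with `svn propset svn:externals -F <file> trunk/projects/lib/library2/dependencies`.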
Remaining questions:
I suggest turning the taxonomy inside out. In Subversion, that would look like this:
companyname/
  include/
    trunk/
    tags/
    branches/
  libraryA/
    trunk/
    tags/
    branches/
  libraryB/
    trunk/
    tags/
    branches/
  libraryC/
    trunk/
    tags/
    branches/
  libraryD/
    trunk/
    tags/
    branches/
  application1/
    trunk/
    tags/
    branches/
  application2/
    trunk/
    tags/
    branches/
In Git, I suggest creating a separate repository for each of include, libraryA, libraryB, application1, etc.
This structure will allow you to create any kind of dependencies between the different parts (for example, a branch in application1 could depend on an unstable version of the libraryA project, while the HEAD in application1 could depend on a stable version of the libraryA project).
This structure also works well with most build tools like maven, rake, buildr, ant, etc.
The taxonomy you presented looks like it's a good structure for a deployed version of your application, but not a good structure for version control. From experience, I think you will be better off if you use a structure like the one I suggested for version control, and then use a build script (or build tool) to create the structure you listed when it comes time to package/deploy/ship your app.
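As a rough illustration of such a build script, here is a minimal Python sketch that only prints the `svn export` commands needed to assemble the packaged layout from per-component tags. The server URL, the version numbers, the `release-1.0` directory, and the names `RELEASE_LAYOUT` and `export_commands` are all assumptions made for the example.

```python
# Where each component lands in the packaged tree (layout taken from the question).
RELEASE_LAYOUT = {
    "include": "include",
    "libraryA": "lib/libraryA",
    "libraryB": "lib/libraryB",
    "libraryC": "lib/libraryC",
    "libraryD": "lib/libraryD",
    "application1": "app/application1",
    "application2": "app/application2",
}

def export_commands(base_url, versions, dest_root):
    """Return one `svn export` command per component, pinned to its tag."""
    commands = []
    for component, version in versions.items():
        source = f"{base_url}/{component}/tags/{version}"
        target = f"{dest_root}/{RELEASE_LAYOUT[component]}"
        commands.append(["svn", "export", source, target])
    return commands

# Assumed values: the server URL and every version number are made up.
plan = export_commands(
    "https://svn.example.com/companyname",
    {"include": "1.2", "libraryA": "1.4.3", "application1": "1.0.0"},
    "release-1.0",
)
for command in plan:
    print(" ".join(command))
```

A real script would also have to create the target's parent directories, run the commands, and handle failures; the sketch only shows how the version-control layout maps onto the shipped one.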
UPDATE: To elaborate a bit on how a work cycle might go:
So, for example, let's say we finished implementing bug fixes for Application1 (let's call this version 1.0.0). The latest and greatest changes are checked into application1/trunk. This version of Application1 depends on libraryD v0.5.0. You update application1/trunk/README.txt with a note that this version depends on libraryD v0.5.0. Perhaps more importantly, the build script inside application1/trunk knows how to check out libraryD/tags/0.5.0. Next, create a tag (which is simply a copy of trunk in its current state) in application1/tags/1.0.0.
Now, let's say a week goes by and another developer updates libraryD to version 1.3.0. You need to enhance application1, so you make changes in application1/trunk. Then update application1/trunk/README.txt to say that you now depend on libraryD v1.3.0 (and, similarly, the new build script for application1 v1.1.0 will check out libraryD/tags/1.3.0). Copy application1/trunk to application1/tags/1.1.0.
Now you can always revert to application1/tags/1.0.0 if needed (and be confident that it will pull code from libraryD/tags/0.5.0). application1/tags/1.1.0 will use libraryD version 1.3.0.
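The cycle above could be sketched in Python roughly as follows. This is only an illustration: the server URL, the `deps/` checkout directory, and the helper names are assumptions, and the functions merely build the `svn` command lines rather than run them.

```python
BASE_URL = "https://svn.example.com/companyname"  # assumed server URL

# Which libraryD tag each application1 version depends on (from the example above).
APPLICATION1_DEPENDENCIES = {
    "1.0.0": {"libraryD": "0.5.0"},
    "1.1.0": {"libraryD": "1.3.0"},
}

def tag_command(project, version):
    """Command that copies a project's trunk to tags/<version>."""
    return [
        "svn", "copy",
        f"{BASE_URL}/{project}/trunk",
        f"{BASE_URL}/{project}/tags/{version}",
        "-m", f"Tag {project} {version}",
    ]

def dependency_checkouts(app_version):
    """Commands a build script could run to fetch the pinned library tags."""
    return [
        ["svn", "checkout", f"{BASE_URL}/{lib}/tags/{lib_version}", f"deps/{lib}"]
        for lib, lib_version in APPLICATION1_DEPENDENCIES[app_version].items()
    ]

print(" ".join(tag_command("application1", "1.0.0")))
for command in dependency_checkouts("1.0.0"):
    print(" ".join(command))
```

Because the dependency table lives next to the code and is tagged with it, building application1 1.0.0 later still resolves libraryD to 0.5.0 even after libraryD has moved on.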
In both subversion and git, a tag is a reference back to a set of files at a given point in time. This means that tags don't take up much space, so I say tag early and often ;-)
An important question is whether all libraries and applications are tightly coupled into a single product, or whether they can live independently of each other. IMO, putting them into a single Git repository only makes sense if all libs and apps form one single product, since, unlike svn, Git doesn't let you simply check out only part of a tree. When the libs and apps are independent, you can create a repo for each lib/app and glue them together via submodules.
Git submodules are similar to svn externals, but they always reference a specific revision of the target, never just a plain URL. So when you run git checkout deadbeef, you always get the submodule state recorded when commit deadbeef was created (once git submodule update has run), regardless of where the head of the referenced repo is. A Git tag therefore also captures the state of each submodule, unlike svn with a URL that is not pinned to a revision.
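To make the pinning concrete, here is a minimal Python sketch of the Git commands involved. The repository URL, the `libs/libraryD` path, and the tag name `v0.5.0` are assumptions, and the helpers only assemble the command lines rather than run them.

```python
def pin_submodule_commands(repo_url, path, ref):
    """Commands that add a submodule and record it at a specific tag or commit."""
    return [
        ["git", "submodule", "add", repo_url, path],
        ["git", "-C", path, "checkout", ref],  # move the submodule to the wanted ref
        ["git", "add", path],                  # stage the commit the submodule now points at
        ["git", "commit", "-m", f"Pin {path} at {ref}"],
    ]

def restore_commands(commit):
    """Commands that bring back a past state, including its submodules."""
    return [
        ["git", "checkout", commit],
        ["git", "submodule", "update", "--init", "--recursive"],
    ]

# Assumed values for illustration only.
for command in pin_submodule_commands(
    "https://git.example.com/libraryD.git", "libs/libraryD", "v0.5.0"
):
    print(" ".join(command))
for command in restore_commands("deadbeef"):
    print(" ".join(command))
```

The superproject commit stores the exact submodule commit, which is why checking out an old commit or tag and updating submodules reproduces the libraries as they were at that point.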