Mike MacDonagh's Blog

Somewhere in the overlap between software development, process improvement and psychology

Tag Archives: ivy

Working with Maven multi-module projects in RTC Jazz SCM with m2

I’ve been working with Maven projects with m2 for a while in RTC Jazz SCM so thought I’d post some notes. For reference this post covers using RTC v3.x things may change in the future. If you’re not using m2 then I’d suggest you read this blog by Philippe Krief.

First of all it’s important to understand that RTC v3 has limited support for Maven. You can setup a build definition which can call a maven goal (such as a local install) easily enough but that’s about it. The RTC build mechanism is basically a cron job with a nice web (like Hudson/Jenkins without the plugins) and Eclipse front end, it doesn’t have a deep understanding of Maven.

The whole point of Maven over Ant is that it does a lot by convention rather than specification, however in most advanced Maven project I’ve seen pom files are as big as equivalent Ant + Ivy files. Regardless I still think Maven with the excellent Sonatype Nexus is awesome (although I also like the structured bash/batch like nature of Ant as well).

Most serious Maven projects are multi-module. This is because component based development is a good thing (shameless employer plug, but it’s true, it is a good thing). Most (tending to all) multi-module projects have the following structure where “O” is a folder and “-” is a file, considering a hypothetical ProductX:

- pom.xml (the parent pom for the master/uber/product build, lists child modules)

- various files at the root of ProductX

O ModuleA

- pom.xml (for ModuleA)

- various ModuleA root files

O various ModuleA folders of stuff (such as src)

O Module B

- pom.xml (for ModuleB)

- various ModuleB root files

O various ModuleB folders of stuff

If you want to work with this stuff in RTC Jazz SCM then you need to balance the constraints of Maven m2, RTC and Eclipse. If you’re considering doing this then you’ll probably be considering moving existing Maven projects to RTC, in which case you may find a previous blog on a manual migration script useful. The only edit to this I make for Maven is to separately import the parent from the children if flattening the structure (although in most cases I wouldn’t recommend flattening).

The constraints playing against each other are:

  • Maven and plugins expect sub-modules to be in child directories as in the tree above
  • RTC doesn’t allow both an Eclipse project and root files to be at the root of a component
  • The simple case of Eclipse and RTC SCM interacting during a load from the server restricts Eclipse to only seeing projects which are represented as folders at the root of the local Eclipse workspace containing .project files.
  • Eclipse doesn’t really support parent and child projects.

I’ve found there are a number of approaches, none are perfect:

1. Flatten the structure (my opinion the worst option)

You can edit the parent pom.xml to reference the sub-modules as ../ModuleA etc. although the Maven release plugin (amongst others) don’t play well with this. Also, and somewhat fundamentally, you shouldn’t have to restructure your code based on your SCM tool.

2. Only load the relevant bits (my opinion is it’s sometimes ok)

You could just load the child modules you’re currently working on and then unload (remember to delete local files or you hit a bug) and then reload the parent when you want to do a “big” parent build. Frankly this is unacceptable from a developers perspective and will fail anyway if there’s a dependency on the parent from the children. Although in theory this may be frowned upon it’s quite common and so needs supporting by the scm tool.

3. Load then import, with auto-refresh (my preferred option)

This one is going to be more than one paragraph….

Prediction fulfilled ;p Anyway… Currently your developers may be used to a scm system where they load/copy/access files from the scm system on the local filesystem and then do an Eclipse -> Import Existing Maven projects to get all of the various goodies.

If you import the multi-module project as above but with a root folder (as an eclipse project) so it has the following structure:

O Product X

.project

- pom.xml modules)

- various files at the root of ProductX

O ModuleA

- pom.xml (for ModuleA)

- various ModuleA root files

O various ModuleA folders of stuff

O Module B

- pom.xml (for ModuleB)

- various ModuleB root files

O various ModuleB folders of stuff

Then when you do a load from Jazz SCM you’ll get in the Eclipse Project Explorer (or equivalent) ProductX correctly detected as a Maven project (or an Eclipse project with a Maven nature for the pedantic amongst you).

You can then do a normal Eclipse -> Import Existing Maven Projects based on your workspace/ProductX/… (taking care not to select “Copy projects into workspace”) You’ll then get in Project Explorer (or whatever):

ProductX (info about rtc scm links)

ModuleA

ModuleB

In this view ModuleA and ModuleB exist in two places: as projects referenced in the workspace but apparently outside of RTC SCM control; as a folder structure under ProductX/ModuleA and ProductX/ModuleB.

This has the following advantages:

  • Cross-project dependencies resolve
  • You can run the parent/uber pom goals and the child pom goals without reloading anything
  • You can shift focus between the child-modules and parent structure without reloading anything

If you edit files in ModuleA and ModuleB under the folder hierarchy of ProductX then Pending Changes will keep track in the normal way but if you edit a file under the apparent “root level” ModuleA and ModuleB then Pending Changes won’t unless you do a “deep refresh”.

Alternatively just go to Eclipse -> Window -> Preferences ->  General -> Workspaces -> Refresh automatically (er… I think, I’m doing this from memory, hence the lack of screenshots). You can then add “edit files tracked by Pending Changes at either Product or child module level” to the advantages above.

So this option sounds great so far, what are the downsides?

This solution maintains the hierarchical structure between ProductX and it’s child modules and as such constrains everything from ProductX downwards to be in the same Jazz SCM component as eclipse projects can only be seen (without far too much fiddling) if they’re root folders in components. This may not seem like such a big deal but components are the lowest level of baseline (or label) in Jazz SCM. In a tightly coupled cohesive single architecture this might not be a problem but if you want to reuse any of these modules (at a code level) or have a separate ownership it will be a problem. It’s also very common to have dependencies from child modules to parent projects (which may initially seem to resolve if they’re in the local ~/.m2 repository).

Having said that source code dependencies between components lead to brittle architectures, binary dependencies are far superior, and made much easier by the use of things like Ivy/Maven and Nexus.

Other considerations

The point above above binary vs. code dependencies can be argued until the cows come home and go out again (an extended metaphor), especially if you consider project/product/team/release boundaries. I’m not going to pretend to answer those with an academic silver bullet here, the important thing I hope you take away from this blog is that there are pros and cons to all of these approaches, balancing those with the needs of your team/architecture/project/organisation are not easy. Shameless plug: Give me a call if you need help

Eclipse refactoring isn’t dealt with well by Jazz SCM, especially between Eclipse projects in and “out” of source control (advocated by option 3 above). Expect a lot of “remove/add” combinations if you are doing anything more than a trivial set of file changes. This is also a problem for the ClearCase <-> RTC automatic synchroniser developed by the excellent Samecs however I know that I, others and Samecs are pushing for improvements on this from IBM.

This post has been very scm focussed but what about build and release? The RTC build engine understands maven to a limited degree and can easily invoke maven builds for information on integating the maven release process see How to do maven releases with Jazz SCM

Automatic bidirectional synchronisation between RTC and ClearCase

I’ve previously blogged on a manual ClearCase to RTC migration script I use during migration from IBM Rational ClearCase to IBM RTC. However sometimes it’s necessary to consider automatic synchronisation for the following reasons:

  • Partial team adoption in large teams/organisations
  • There are code dependencies between teams (you’d be better off breaking them to binary dependencies and using something like maven or ivy though…)
  • Support teams on RTC SCM while interacting with content from ClearCase multisite locations
  • Maintaining complex builds that are dependant on ClearCase (i.e. using ClearMake).
  • Allowing teams to pilot RTC SCM without working about branching between tools
  • …. and many more…

The off the shelf offerings to bridge between RTC/ClearCase are a bit limited due to the relationship between eclipse projects, rtc components and streams. It’s necessary to have eclipse projects at the root of a component for them to load smoothly as proper projects in eclipse with their correct natures, so any serious synchronisation solution needs to allow you to pick and choose parts of a folder structure in ClearCase and sync them to projects in components in RTC. The excellent open source RTC ClearCase workspace synchroniser from samecs does exactly this.

I personally use a combination of the manual migration script (mentioned at the top) and the RTC ClearCase workspace synchroniser to deal with team migrations from ClearCase to RTC. Generally my preference is to do it manually as synchronisation is a tactical solution during rollout and introduces a fair amount of technical complexity, but in cases with the features mentioned above sometimes you need a powerful and flexible synchronisation solution.

How to migrate source to Jazz RTC SCM

First of all, although it might seem tempting to seperate scm and build, especially from a business change and planning perspective you just can’t (unless you’re writing scripting languages!). Software is just useless text if it can’t be built/run and a software team can’t be expected to move where they store their source without ensuring that build doesn’t get broken. In a perfect world build scripts would be generic and work regardless of the scm system they’re coming from but that generally isn’t the case due to this not being a perfect world:

  • different scm systems/ides imposing different structural constraints
  • absolute paths to files
  • dodgy use of symlinks
  • lack of parameterization of build scripts
  • complexity of numerous build technologies

This means that when migrating a team into Jazz SCM you need to at least ensure that their current builds work in their current normal way. Ideally it would be great to (re)engineer them to word with RTC Build but that isn’t always practical. This build refactoring shouldn’t be underestimated as it can take a while because of general build complexity, project complexity and technology complexity. It’s not unusual to find in just one organisation the use of: ant, maven, junit, ivy, make, clearmake, msbuild sometimes all on the same “project”. Add unit testing, style checking, static analysis and other things that people tend to add into the build process and you’re looking at a large amount of technical skill needed to even begin to make sense of what’s currently going on.

For this reason when moving a team from cm system X to RTC Jazz SCM I find it necessary to build the stream in 2 parts, which means you need a repeatable migration script that doesn’t get tripped up by things like permission or line end delimiter platform differences.

My general migration script (assuming not synchronizing or bridging):

  1.     Label/Baseline/Shapshot source files in old scm system
  2.     Using linux copy source files from existing scm/view/folder to a holding area
  3.     chown if necessary and chmod files to u:rw(x),o—,g—
  4.     I tend to do a dos2unix conversion on appropriate files to normalise mixed mode files
  5.     Share to RTC
  6.     Baseline/Snapshot in RTC
  7.     Make builds work (generally a process of changing absolute path references to local relative paths – often “/view/blah/…/proj” to “../” (here be technical complexity)
  8.     Baseline/Snapshot in RTC (“Initial scm migration with working build”)

There is a great advantage in following this process (which is why I’ve scripted steps 1-4 although unfortunately I can’t share that) somewhat slavishly in that it is repeatable to update a file set as a change set. If you do this to create an example stream for a team to experiment and gain confidence in while they’re still developing in their existing system, then when they choose to adopt you can simply update their stream(s) by:

  1.     [Optionally] revert to working build snapshot “Initial scm migration with working build” from step 8 above
  2.     Repeat steps 1-4 above
  3.     Copy files in holding area to RTC workspace
  4.     In RTC Pending Changes view drop-down the refresh button to select “Refresh Sandbox and Remote Changes”

You then get a change set showing the file changes since the experimental migration and the present time, simply undo or merge changes to build files to retain local workspace references and then deliver to update RTC using an atomic change set(s) :)

Follow

Get every new post delivered to your Inbox.

Join 345 other followers

%d bloggers like this: