Saturday, May 17, 2014

Keeping Logical Commits together with Git

Overview

I have developed a pet peeve about having the code for one feature (code that belongs together) scattered through the git history. I have tried to look at the code for a single feature (one that accumulated many commits as coding proceeded) and found it close to impossible to determine which commits belonged to that feature when more than one feature was being worked on at the same time. The history is in commit sequence, so if several features have been worked on simultaneously, their commits are intermixed in the listing.

Example diagram (before merge):
... --A--B-*-D--F (master or development branch)
            \
             C--E  (feature branch)

After regular merge:
... --A--B--C--D--E--F (master or development branch after merge)

Notice that our feature commits C and E are intermixed with commits from other features.
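
To see the intermixing on a small scale, the sketch below builds a throwaway repository in a temporary directory and replays the diagram above. The `commit` helper, the file names, and the fixed timestamps are assumptions made only so the example is repeatable; they are not part of the procedure.

```shell
set -e
repo=$(mktemp -d)   # throwaway sandbox, not a real project
cd "$repo"
git init -q
git checkout -q -b dev
git config user.name "Demo User"
git config user.email "demo@example.com"

n=0
# Hypothetical helper: create a file named after the commit and commit it
# with a fixed, increasing timestamp so the log order is repeatable.
commit() {
  n=$((n + 1))
  echo "$1" > "$1.txt"
  git add "$1.txt"
  GIT_AUTHOR_DATE="$((1400000000 + n)) +0000" \
  GIT_COMMITTER_DATE="$((1400000000 + n)) +0000" \
    git commit -q -m "$1"
}

commit A
commit B
git checkout -q -b feature   # branch point (the * in the diagram)
commit C
git checkout -q dev
commit D
git checkout -q feature
commit E
git checkout -q dev
commit F

# Regular merge of the diverged feature branch into dev.
git merge -q -m "merge feature" feature

# Newest first, merge commit hidden.
history=$(git log --no-merges --format=%s | paste -s -d ' ' -)
echo "$history"
```

The log lists the commits by date, newest first, so the feature commits C and E end up sandwiched between D and F instead of sitting next to each other.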

This is a set of procedures for handling a git repository that allows developers to keep all of a feature's code together when the feature branch is merged into the develop (and hence also the master) branch.

I keep the master branch in step with the code that is in production. I keep a branch parallel to master named 'dev' or 'develop', which contains all features (and fixes) that have been reviewed and are ready for final testing and the move to production on the master branch. When starting a new feature, I create a new feature branch off of the development branch. When working on a feature, I don't have to worry about code in other branches or code that has already been placed in the development or master branch. This is all pretty standard when working with git.

Once a feature branch is ready to be merged into the development branch, the developer rebases the feature branch commits onto the head of the development branch. This is a simple and safe process, and it is the crux of keeping the feature's commits together. In other words, the feature branch, which may have its base anywhere in the development branch history, is moved to the top of the development branch so it appears that all of its commits happened just now.

Commits Kept Together (initial):
... --A--B-*-D--F (dev branch)
            \
             C--E  (feature branch)

after rebase:
... --A--B-*-D--F (dev branch)
            \    \
             \    G--H  (rebased feature branch)
              \
               C--E  (original feature branch)

After merge of rebased branch:
... --A--B--D--F--G--H (dev branch after merge of rebased branch)
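
The diagrams above can be reproduced in a scratch repository. This is a minimal sketch, assuming a temporary directory, a hypothetical `commit` helper, and the branch names `feature` and `featureR1`; the rebased commits keep their original subjects (C and E), which correspond to G and H in the diagram.

```shell
set -e
repo=$(mktemp -d)   # throwaway sandbox, not a real project
cd "$repo"
git init -q
git checkout -q -b dev
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: create a file named after the commit and commit it.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit A
commit B
git checkout -q -b feature   # branch point (the * in the diagram)
commit C
git checkout -q dev
commit D
git checkout -q feature
commit E
git checkout -q dev
commit F

# Rebase a copy of the feature branch so the original stays untouched.
git checkout -q -b featureR1 feature
git rebase -q dev

# Fast-forward dev onto the rebased commits.
git checkout -q dev
git merge -q --ff-only featureR1

history=$(git log --format=%s | paste -s -d ' ' -)
echo "$history"
```

Reading the output from newest to oldest, the two feature commits now sit together on top of F rather than being scattered through the dev history.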

I used to advocate also merging all of the commits into one big commit (G and H would be 'squashed' together into one commit I). I found that procedure tricky and not worth the risk; seeing the commits separately but sequentially is normally fine if commits are made regularly.

The following procedure assumes that there is a repository on a remote server (such as GitHub) identified with the remote name 'origin', with a master branch and a dev (reviewed development features) branch. All feature branches should be pushed up to the remote server for safekeeping and for the review process. It is also assumed that the developer will be working on a local copy of the repository, cloned from the remote server, and that only one developer will be working on a feature branch at a time (though more than one can be handled with some adjustments).

1) First set up your feature branch locally:

  • git fetch origin (download all updates from public repository - no merging)
  • git checkout dev (go to dev branch)
  • git pull (update the dev branch with all of the latest changes)
  • git checkout -b <FeatureBranch> (create a new branch)
  • git push -u origin <FeatureBranch> (put branch on origin and start tracking it)

2) Regularly commit your code and save to origin (try to code in small related aspects of the feature):

  • git checkout <FeatureBranch>
  • (do your coding here)
  • git add --all . (stage all your code changes for commit - modify this as necessary)
  • git commit -m "commit message describing what has been done this commit"
  • git push origin <FeatureBranch> (push your commit to origin)
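
Steps 1 and 2 can be tried end to end without touching a real project. The sketch below is only an illustration: it assumes a local bare repository standing in for the GitHub remote, a hypothetical branch name `my-feature`, and made-up file names and commit messages.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox

# Build a stand-in for the remote: a bare repository with a dev branch.
git init -q "$work/seed"
cd "$work/seed"
git checkout -q -b dev
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "project" > README
git add README
git commit -q -m "initial commit"
git clone -q --bare "$work/seed" "$work/origin.git"

# The developer's local copy; 'origin' now points at the stand-in remote.
git clone -q "$work/origin.git" "$work/local"
cd "$work/local"
git config user.name "Demo User"
git config user.email "demo@example.com"

# Step 1: set up the feature branch.
git fetch -q origin
git checkout -q dev
git pull -q
git checkout -q -b my-feature
git push -q -u origin my-feature

# Step 2: commit a small piece of work and save it to origin.
echo "first part of the feature" > feature.txt
git add --all .
git commit -q -m "add first part of the feature"
git push -q origin my-feature

upstream=$(git rev-parse --abbrev-ref 'my-feature@{upstream}')
local_tip=$(git rev-parse my-feature)
remote_tip=$(git rev-parse origin/my-feature)
echo "tracking $upstream"
```

After the last push, the local branch and its copy on origin point at the same commit, which is the state step 2 should leave you in after every commit.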

3) Get ready for code review.

  • git checkout <FeatureBranch>
  • git checkout -b <FeatureBranch>R1 (create and go to review # 1 branch)
  • git rebase dev (put these changes at the head of the dev branch)
  • (if any conflicts occur, fix them, stage the fixed files with git add, then run git rebase --continue; repeat until there are no more conflicts)
  • (run all automated tests now - Do any fixes as necessary - follow step 2 using <FeatureBranch>R1)
  • git push -u origin <FeatureBranch>R1
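
The conflict handling in step 3 is the part most likely to be unfamiliar, so here is a minimal sketch of it in a scratch repository. The file `notes.txt`, the branch names, and the way the conflict is resolved (keeping both lines) are assumptions for illustration; a real rebase may stop several times, once per conflicting commit.

```shell
set -e
repo=$(mktemp -d)   # throwaway sandbox, not a real project
cd "$repo"
git init -q
git checkout -q -b dev
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "base"

# The feature branch and dev both append to the same file.
git checkout -q -b feature
echo "feature line" >> notes.txt
git commit -q -am "feature edit"

git checkout -q dev
echo "dev line" >> notes.txt
git commit -q -am "dev edit"

# Step 3: create the review branch and rebase it onto dev.
git checkout -q -b featureR1 feature
if ! git rebase dev >/dev/null 2>&1; then
  # The rebase stopped on a conflict: fix the file by hand
  # (here, keep both lines), stage it, and continue.
  printf 'base\ndev line\nfeature line\n' > notes.txt
  git add notes.txt
  # GIT_EDITOR=true accepts the existing commit message without opening an editor.
  GIT_EDITOR=true git rebase --continue >/dev/null
fi

history=$(git log --format=%s | paste -s -d ' ' -)
content=$(paste -s -d ',' notes.txt)
echo "$history"
```

Once the rebase finishes, the feature commit sits directly on top of the latest dev commit, and the original `feature` branch is still available unchanged if something went wrong.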

4) Do Code Review.

  • Create a pull request between the dev branch and the <FeatureBranch>R1. Note it is possible to attach an issue to a pull request.
  • Let the reviewer(s) review the pull request
  • If all is good, go to step 5, else go back to step 2 and do your changes.

5) Merge (fast forward only) the <FeatureBranch>Rn into dev branch.

  • git checkout dev
  • git merge --ff-only <FeatureBranch>Rn
  • (If a fast forward merge cannot be done, then someone else has put a commit in the dev branch. If this happens, the review process should be looked at to prevent two reviews at once. Go back to Step 3 and increment the feature branch number.)
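
The fast-forward-only rule in step 5 can be seen in a scratch repository. This is a sketch under assumptions: the branch names `featureR1` and `featureR2`, the hypothetical `commit` helper, and the commit named X that stands in for someone else's change landing on dev during the review.

```shell
set -e
repo=$(mktemp -d)   # throwaway sandbox, not a real project
cd "$repo"
git init -q
git checkout -q -b dev
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: create a file named after the commit and commit it.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit A
git checkout -q -b featureR1
commit G                      # the reviewed feature commit
git checkout -q dev
commit X                      # someone else's commit lands on dev

# Step 5: dev has moved, so a fast-forward is no longer possible.
if git merge --ff-only featureR1 >/dev/null 2>&1; then
  result=merged
else
  result=refused
fi

# Back to step 3 with the next review number.
git checkout -q -b featureR2 featureR1
git rebase -q dev
git checkout -q dev
git merge -q --ff-only featureR2

history=$(git log --format=%s | paste -s -d ' ' -)
echo "$result: $history"
```

The first merge attempt is refused and leaves dev untouched; after rebasing a new review branch onto the updated dev, the fast-forward goes through and the feature commit ends up on top.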

6) Run tests on the dev branch.

7) Merge (fast forward only) the dev branch into the master branch.

  • git checkout master
  • git merge --ff-only dev

Issues

If there are any issues with this procedure, please let me know.

thanks,
Dave Taylor