Using git

Branch off from older commit

Ref:
http://www.gelato.unsw.edu.au/archives/git/0508/7636.html

Rebasing branches

Ref:
http://www.gelato.unsw.edu.au/archives/git/0508/7642.html

I had a handful of commits that were ahead of master in pu, and I wanted to add some documentation, bypassing my usual habit of placing new things in pu first. At the beginning, the commit ancestry graph looked like this:

                             *"pu" head
    master --> #1 --> #2 --> #3
So I started from master, made a bunch of edits, and committed:
    $ git checkout master
    $ cd Documentation; ed git.txt git-apply-patch-script.txt ...
    $ cd ..; git add Documentation/*.txt
    $ git commit -s -v
NOTE. The -v flag to commit is a handy way to make sure that your additions are not introducing bogusly formatted lines.

After the commit, the ancestry graph would look like this:

                              *"pu" head
    master^ --> #1 --> #2 --> #3
          \
            \---> master
The old master is now master^ (the first parent of the master). The new master commit holds my documentation updates.

Now I have to deal with the "pu" branch.

This is the kind of situation I used to have all the time when Linus was the maintainer and I was a contributor, if you think of the "master" branch as the "maintainer" branch and the "pu" branch as the "contributor" branch. Your work started at the tip of the "maintainer" branch some time ago, you made a lot of progress in the meantime, and now the maintainer branch has some other commits you do not have yet. "git rebase" was written with the explicit purpose of helping to maintain branches like "pu". You could merge master into pu and keep going, but if you eventually want to cherry-pick and merge some, but not necessarily all, changes back to the master branch, rebasing "pu" (i.e. carrying your changes forward) rather than merging often makes later operations easier. So I ran "git rebase":

    $ git checkout pu
    $ git rebase master pu
What this does is to pick all the commits since the current branch (note that I now am on "pu" branch) forked from the master branch, and forward port these changes.
    master^ --> #1 --> #2 --> #3
          \                                  *"pu" head
            \---> master --> #1' --> #2' --> #3'
The diff between master^ and #1 is applied to master and committed to create #1' commit with the commit information (log, author and date) taken from commit #1. On top of that #2' and #3' commits are made similarly out of #2 and #3 commits.

Old #3 is no longer recorded in any of the .git/refs/heads/ files, so after doing this you will have a dangling commit if you run fsck-cache, which is normal. After testing "pu", you can run "git prune" to get rid of those original three commits.
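
The rebase above can be tried out in a throwaway repository. This is only a sketch with a current git, not the 2005 tool set the text was written against; the temporary directory, the file names, the commit subjects and the commit() helper are all made up for the illustration.

```shell
#!/bin/sh
# Sketch: rebuild the master/pu graph in a scratch repository, then rebase pu.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is "master"
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"

# Hypothetical helper: write a file and commit it with the given subject.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "base"                      # the old master (master^ in the graph)
git checkout -q -b pu
commit one.txt "change 1"                   # stands in for #1
commit two.txt "change 2"                   # stands in for #2
commit three.txt "change 3"                 # stands in for #3
git checkout -q master
commit doc.txt "documentation update"       # the new master

old_tip=$(git rev-parse pu)                 # remember old #3 before it is abandoned
git rebase -q master pu                     # switch to pu and replay it onto master

# The replayed commits (#1', #2', #3') are the only ones on pu that master lacks.
git log --format=%s master..pu
```

After the rebase, "git merge-base master pu" is the master head itself, which is another way of saying that pu now sits directly on top of master.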

Cherrypicking using only the core GIT tools

Ref:
http://www.gelato.unsw.edu.au/archives/git/0508/7642.html

You cloned upstream repository and made a couple of commits on top of it.

                              *your "master" head
   upstream --> #1 --> #2 --> #3
You would want changes #2 and #3 incorporated in the upstream, while you feel that #1 may need further improvements. So you prepare #2 and #3 for e-mail submission.
    $ git format-patch master^^ master
This creates two files, 0001-XXXX.txt and 0002-XXXX.txt. Send them out "To: " your project maintainer and "Cc: " your mailing list. You could use the contributed script git-send-email-script if your host has the necessary perl modules for this, but your usual MUA would do as long as it does not corrupt whitespace in the patch.
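
A minimal sketch of the same step with a current git, assuming a scratch repository with three made-up commits on top of an upstream base. Current versions name the files with a .patch suffix by default rather than .txt; the directory names and the commit() helper are hypothetical.

```shell
#!/bin/sh
# Sketch: export the last two commits on master as mailable patch files.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"

# Hypothetical helper: write a file and commit it with the given subject.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "upstream base"
commit one.txt "change 1"                   # not ready to send yet
commit two.txt "change 2"
commit three.txt "change 3"

out=$(mktemp -d)
# master~2..master selects the two newest commits, like "master^^ master" above.
git format-patch -q -o "$out" master~2..master
ls "$out"
```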

Then you would wait, and you find out that the upstream picked up your changes, along with other changes.

   where                      *your "master" head
  upstream --> #1 --> #2 --> #3
    used   \ 
   to be     \--> #A --> #2' --> #3' --> #B --> #C
                                                *upstream head
The two commits #2' and #3' in the above picture record the same changes your e-mail submission for #2 and #3 contained. However, they probably carry a new sign-off line added by the upstream maintainer and definitely have different committer and ancestry information, so they are different objects from the #2 and #3 commits.

You fetch from upstream, but do not merge.

    $ git fetch upstream
This leaves the updated upstream head in .git/FETCH_HEAD but does not touch your .git/HEAD nor .git/refs/heads/master. You run "git rebase" now.
    $ git rebase FETCH_HEAD master
Earlier, I said that rebase applies all the commits from your branch on top of the upstream head. Well, I lied. "git rebase" is a bit smarter than that and notices that #2 and #3 need not be applied, so it only applies #1. The commit ancestry graph becomes something like this:
   where                     *your old "master" head
  upstream --> #1 --> #2 --> #3
    used   \                      your new "master" head*
   to be     \--> #A --> #2' --> #3' --> #B --> #C --> #1'
                                                *upstream
                                                head
Again, "git prune" would discard the disused commits #1-#3 and you continue on starting from the new "master" head, which is the #1' commit.
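
The skipping behaviour can be reproduced locally. In this sketch a local branch named "upstream" stands in for FETCH_HEAD, and "git cherry-pick" stands in for the maintainer applying the e-mailed patches; the file names, commit subjects and the commit() helper are invented for the illustration, and the commands are those of a current git.

```shell
#!/bin/sh
# Sketch: rebase drops commits whose changes are already present upstream.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/upstream
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"

# Hypothetical helper: write a file and commit it with the given subject.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "upstream base"
git checkout -q -b master
commit one.txt "change 1"                   # #1, kept back for more work
commit two.txt "change 2"                   # #2, submitted
commit three.txt "change 3"                 # #3, submitted

git checkout -q upstream
commit a.txt "change A"                     # #A
git cherry-pick master~1 master >/dev/null  # #2' and #3' as applied upstream
commit b.txt "change B"                     # #B

# "-" marks commits that already have an equivalent upstream, "+" the rest.
git cherry upstream master

git rebase -q upstream master 2>/dev/null   # only change 1 needs replaying
git log --format=%s upstream..master
```

The comparison is done on the patch content, not on the commit object names, which is why #2' and #3' are recognised even though they are different objects.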

Linux subsystem maintenance using GIT

Ref:
http://www.gelato.unsw.edu.au/archives/git/0508/7697.html

My requirements here are to be able to create two public trees:

  1. A "test" tree into which patches are initially placed so that they can get some exposure when integrated with other ongoing development. This tree is available to Andrew for pulling into -mm whenever he wants.

  2. A "release" tree into which tested patches are moved for final sanity checking, and as a vehicle to send them upstream to Linus (by sending him a "please pull" request.)

Note that the period of time that each patch spends in the "test" tree is dependent on the complexity of the change. Since GIT does not support cherry picking, it is not practical to simply apply all patches to the test tree and then pull to the release tree as that would leave trivial patches blocked in the test tree waiting for complex changes to accumulate enough test time to graduate.

Back in the BitKeeper days I achieved this by creating small forests of temporary trees, one tree for each logical grouping of patches, and then pulling changes from these trees first to the test tree, and then to the release tree. At first I replicated this in GIT, but then I realised that I could do this far more efficiently using branches inside a single GIT repository.

So here is the step-by-step guide to how this all works for me.

First create your work tree by cloning Linus's public tree:

 $ git clone rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git work
Change directory into the cloned tree you just created
 $ cd work
Make a GIT branch named "linus", and rename the "origin" branch as linus too:
 $ git checkout -b linus
 $ mv .git/branches/origin .git/branches/linus
The "linus" branch will be used to track the upstream kernel. To update it, you simply run:
 $ git checkout linus && git pull linus
You can do this frequently (as long as you don't have any uncommitted work in your tree).

If you need to keep track of other public trees, you can add branches for them too:

 $ git checkout -b another linus
 $ echo URL-for-another-public-tree > .git/branches/another
Now create the branches in which you are going to work. These start out at the current tip of the linus branch.
 $ git checkout -b test linus
 $ git checkout -b release linus
These can be easily kept up to date by merging from the "linus" branch:
 $ git checkout test && git resolve test linus "Auto-update from upstream"
 $ git checkout release && git resolve release linus "Auto-update from upstream"
Set up so that you can push upstream to your public tree:
 $ echo master.kernel.org:/ftp/pub/scm/linux/kernel/git/aegl/linux-2.6.git > .git/branches/origin
and then push each of the test and release branches using:
 $ git push origin test
and
 $ git push origin release
Now to apply some patches from the community. Think of a short snappy name for a branch to hold this patch (or related group of patches), and create a new branch from the current tip of the linus branch:
 $ git checkout -b speed-up-spinlocks linus
Now you apply the patch(es), run some tests, and commit the change(s). If the patch is a multi-part series, then you should apply each as a separate commit to this branch.
 $ ... patch ... test  ... commit [ ... patch ... test ... commit ]*
When you are happy with the state of this change, you can pull it into the "test" branch in preparation to make it public:
 $ git checkout test && git resolve test speed-up-spinlocks "Pull speed-up-spinlock changes"
It is unlikely that you would have any conflicts here ... but you might if you spent a while on this step and had also pulled new versions from upstream.

Some time later when enough time has passed and testing done, you can pull the same branch into the "release" tree ready to go upstream. This is where you see the value of keeping each patch (or patch series) in its own branch. It means that the patches can be moved into the "release" tree in any order.

 $ git checkout release && git resolve release speed-up-spinlocks "Pull speed-up-spinlock changes"
After a while, you will have a number of branches, and despite the well chosen names you picked for each of them, you may forget what they are for, or what status they are in. To get a reminder of what changes are in a specific branch, use:
 $ git-whatchanged branchname ^linus | git-shortlog
To see whether it has already been merged into the test or release branches use:
 $ git-rev-list branchname ^test
or
 $ git-rev-list branchname ^release
[If this branch has not yet been merged, you will see a set of SHA1 values for the commits; if it has been merged, there will be no output.]
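
As a rough sketch of the topic-branch flow with a current git: "git resolve" is no longer available, so "git merge" is used here in its place, which is an assumption about the closest modern equivalent rather than the command the text describes. The scratch repository, the file name and the commit() helper are hypothetical.

```shell
#!/bin/sh
# Sketch: merge a topic branch into "test" only, then ask where it has landed.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus      # "linus" plays the upstream branch
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"

# Hypothetical helper: write a file and commit it with the given subject.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "upstream base"
git branch test
git branch release

git checkout -q -b speed-up-spinlocks linus
commit spinlock.txt "speed up spinlocks"

git checkout -q test
git merge -q --no-ff -m "Pull speed-up-spinlock changes" speed-up-spinlocks

# Count the topic commits that each integration branch is still missing.
not_in_test=$(git rev-list --count speed-up-spinlocks ^test)
not_in_release=$(git rev-list --count speed-up-spinlocks ^release)
echo "missing from test: $not_in_test, missing from release: $not_in_release"
```

A count of zero corresponds to the "no output" case described above; the topic can be merged into "release" later, independently of any other topic branch.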

Once a patch completes the great cycle (moving from test to release, then pulled by Linus, and finally coming back into your local "linus" branch) the branch for this change is no longer needed. You detect this when the output from:

 $ git-rev-list branchname ^linus
is empty. At this point the branch can be deleted:
 $ rm .git/refs/heads/branchname
To create diffstat and shortlog summaries of changes to include in a "please pull" request to Linus you can use:
 $ git-whatchanged -p release ^linus | diffstat -p1
and
 $ git-whatchanged release ^linus | git-shortlog
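
With a current git the same two summaries can be produced without the external diffstat tool. The following is a sketch under that assumption; the branch contents, file names and the commit() helper are invented for the example.

```shell
#!/bin/sh
# Sketch: diffstat and shortlog of what "release" has that "linus" lacks.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"

# Hypothetical helper: write a file and commit it with the given subject.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "upstream base"
git checkout -q -b release
commit timer.txt "fix timer drift"
commit cache.txt "flush cache on resume"

stat=$(git diff --stat linus...release)     # changes since release forked from linus
summary=$(git shortlog linus..release)      # commit subjects grouped by author
printf '%s\n\n%s\n' "$stat" "$summary"
```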

My GIT Day (Junio C Hamano)

Ref:
http://www.gelato.unsw.edu.au/archives/git/0508/7973.html

Note that the version of git on my $PATH is usually the one from the proposed updates branch, so some of the commands I use in the following text may not work for you unless you have also built the "pu" version yourself.

I am planning to finish updating, testing and documenting what's in the current proposed updates branch, and have most of it graduate to the master branch over the weekend. I am aiming to do 0.99.5 on Wednesday next week.


I have the following heads all the time in my private repository:

    master    - the one to be pushed to public "master" branch
    pu        - master plus various proposed changes
    rc        - master plus minimum release engineering

    ko-master - a copy of public "master" branch head
    ko-rc     - a copy of public "rc" branch head
My GIT day always starts with this command:
    $ git fetch ko
I have this in .git/remotes/ko:
    $ cat .git/remotes/ko
    URL: master.kernel.org:/pub/scm/git/git.git/
    Pull: master:ko-master rc:ko-rc
    Push: master pu rc
The Pull: line gives me the default set of refspecs to give to the "git fetch" command. I slurp "master" and "rc" heads from the public repository at master.kernel.org, and fast-forward my ko-master and ko-rc branches with them. I do not touch these two branches in any way other than this "git fetch" fast forwarding.
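
Current versions of git keep this information in .git/config instead of a .git/remotes/ file. The sketch below sets up equivalent fetch refspecs with "git config"; a local scratch repository stands in for the public one at master.kernel.org, and all paths and the commit content are made up.

```shell
#!/bin/sh
# Sketch: default fetch refspecs that map public heads onto local ko-* branches.
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the public repository, with "master" and "rc" heads.
git init -q public
cd public
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"
echo hello > README
git add README
git commit -q -m "initial"
git branch rc
cd ..

# The private repository that tracks it.
git init -q private
cd private
git remote add ko "$work/public"
git config --unset-all remote.ko.fetch      # drop the default refs/remotes/ko/* mapping
git config --add remote.ko.fetch master:ko-master
git config --add remote.ko.fetch rc:ko-rc

git fetch -q ko                             # uses the two refspecs configured above
git rev-parse ko-master ko-rc
```

Because the refspecs have no leading "+", the fetch only ever fast-forwards ko-master and ko-rc, which matches the way the two branches are used above.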

I have a few "topic branches" in addition to the above; they change from time to time. Recently I've been looking at multi-head fetches, and that work is done in "mhf" branch. There also is a catch-all topic called "misc". I started them like this:

    $ git branch mhf master
    $ git branch misc master
The first thing I do during my GIT day is to process the patches I received via e-mail. I store them one topic per file in my working tree, like this:
    $ ls +*.txt
    +js-glossary.txt  +mc-mailinfo.txt
Depending on the quality of the patches, the seriousness of the bugs they fix, and the area of the code they touch, they either go directly to "master", go to "misc", or are sent back to the sender. The last one, luckily for me, rarely happens:
    $ git checkout master
    $ git applymbox -q ./+js-glossary.txt .git/info/signoff

    $ git checkout misc
    $ git applymbox -q ./+mc-mailinfo.txt please
The last parameter to the applymbox command is the name of a file that has my signoff message. The latest applymbox in the "pu" branch has a bit more useful extension to do the same thing as what "git commit" does.

At this point, I may push out the "master" branch (and nothing else), like this:

    $ git push ko master
This pushes only "master", ignoring the default refspecs defined in the .git/remotes/ko file you saw earlier.

Once I am done with the outside patches, I go back to where I left off the previous day:

    $ git checkout mhf
And I check where my head is relative to the master:
    $ git show-branches master mhf
    ! [master] Yet another tweak
     * [mhf] Make git-fetch-script a bit more chatty.
    +  Yet another tweak
    +  Another tweak in Makefile
     + Make git-fetch-script a bit more chatty.
     + Update git-ls-remote-script
     + ...
     + Start adding the $GIT_DIR/remotes/ support.
    ++ [PATCH] Allow file removal when "git commit --all" is used.
The output from show-branches is a poor-man's gitk. The named branches are shown, and '+' sign in each column shows whether the commit is contained in each branch, and the output stops where all branches converge, or you hit ^C ;-).

If the mhf branch is way behind, I may choose to first rebase it, to clean up my history:

    $ git rebase master
    $ git show-branches master mhf
    ! [master] Yet another tweak
     * [mhf] Make git-fetch-script a bit more chatty.
     + Make git-fetch-script a bit more chatty.
     + Update git-ls-remote-script
     + ...
     + Start adding the $GIT_DIR/remotes/ support.
    ++ Yet another tweak
I keep working in my topic branches. I may make some other changes in "misc" topic branch. It's a simple cycle of:
    $ edit-and-test
    $ git commit -s -a -v
Eventually I get to a good point where it makes sense to push things to the public repository.
    $ git show-branches master mhf misc
    ! [master] Yet another tweak
     * [mhf] Make git-fetch-script a bit more chatty.
      ! [misc] Add hooks to tools/git-applypatch.
     +  Make git-fetch-script a bit more chatty.
     +  Update git-ls-remote-script
     +  ...
     +  Start adding the $GIT_DIR/remotes/ support.
    ++  Yet another tweak
    ++  Another tweak in Makefile
      + Add hooks to tools/git-applypatch.
      + Add commit hook and make the verification customizable.
    +++ [PATCH] Allow file removal when "git commit --all" is used.
As you may have noticed, my topic branches are private and not pushed to the public repository. Instead, I make a grand total merge of them into "pu". The proposed update branch is always rewound and made from the head of the master:
    $ git checkout pu
    $ git reset master
This checks out the head of "pu" branch, and then resets the index file to match "master" and updates .git/refs/heads/pu.

What it does not do is update my working tree to match the index file. Linus recommends doing "git checkout -f" at this point, but I typically do this instead:

    $ git diff -R -p | git apply
Note. This is an embarrassingly expensive way; the only thing it buys me over "git checkout -f" is that it removes the files that were introduced in "pu" branch that did not exist in the "master" head. I have to come up with a not so expensive way to do this.
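
In a current git, "git reset --hard" rewinds the branch, the index and the working tree in one step, and it also removes tracked files that existed only on the rewound branch. The sketch below shows this in a scratch repository; it describes today's command, not the tool available when the note above was written, and the file names and the commit() helper are hypothetical.

```shell
#!/bin/sh
# Sketch: rewind "pu" to "master", dropping a file that only pu had.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"

# Hypothetical helper: write a file and commit it with the given subject.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "base"
git checkout -q -b pu
commit extra.txt "change that exists only in pu"

git reset -q --hard master                  # pu, index and working tree now match master
ls                                          # extra.txt is no longer in the working tree
```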

Then before doing the grand total merge, check again where those heads are:

    $ git show-branches master mhf misc pu 
    ! [master] Yet another tweak
     ! [mhf] Make git-fetch-script a bit more chatty.
      ! [misc] Add hooks to tools/git-applypatch.
       * [pu] Yet another tweak
     +   Make git-fetch-script a bit more chatty.
     +   Update git-ls-remote-script
     +   ...
     +   Start adding the $GIT_DIR/remotes/ support.
    ++ + Yet another tweak
    ++ + Another tweak in Makefile
      +  Add hooks to tools/git-applypatch.
      +  Add commit hook and make the verification customizable.
    ++++ [PATCH] Allow file removal when "git commit --all" is used.
Notice that I did not rebase "misc" above, but that is OK. What I want to do here is to make new "pu" an Octopus over "master", merging in all my topic branches (currently "mhf" and "misc").
    $ git fetch . mhf misc
    Packing 0 objects
    Unpacking 0 objects
    * committish: a101f32...e5580   refs/heads/mhf from .
    * committish: 4426ac7...0c5bc   refs/heads/misc from .
This fetches two heads from the current repository (!). The only reason I do it is that the tentative implementation of "git octopus" always reads from $GIT_DIR/FETCH_HEAD, and "git fetch" is the way to populate that file.
    $ git octopus
    Removing git-parse-remote
    Removing git-parse-remote
    Committed octopus merge fe1899156bffa4be6722b2ca0b74ff17b76523da
     Makefile                        |    3 -
     git-commit-script               |   75 +++++-------------
     ...
     tools/git-applypatch            |   87 ++++++++++++++++----
     15 files changed, 551 insertions(+), 224 deletions(-)
This makes an Octopus out of the master and other two topic branches. I can make sure that resulting "pu" contains all the necessary commits from the branches involved:
    $ git show-branches master mhf misc pu
    ! [master] Yet another tweak
     ! [mhf] Make git-fetch-script a bit more chatty.
      ! [misc] Add hooks to tools/git-applypatch.
       * [pu] Octopus merge of the following: 
       + Octopus merge of the following:
     + + Make git-fetch-script a bit more chatty.
     + + Update git-ls-remote-script
     + + ...
     + + Start adding the $GIT_DIR/remotes/ support.
    ++ + Yet another tweak
    ++ + Another tweak in Makefile
      ++ Add hooks to tools/git-applypatch.
      ++ Add commit hook and make the verification customizable.
    ++++ [PATCH] Allow file removal when "git commit --all" is used.
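
For comparison, a current git builds an Octopus directly with "git merge" when given more than one branch, without going through FETCH_HEAD. This is a sketch of that modern route, not of the tentative "git octopus" command shown above; the file names, commit subjects and the commit() helper are made up.

```shell
#!/bin/sh
# Sketch: rebuild "pu" from master and merge two topic branches in one commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"

# Hypothetical helper: write a file and commit it with the given subject.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "base"
git branch mhf
git branch misc
git checkout -q mhf
commit fetch.txt "make fetch a bit more chatty"
git checkout -q misc
commit hooks.txt "add hooks to applypatch"

git checkout -q -B pu master                # create pu, or rewind it, at master
git merge -q -m "Octopus merge of mhf and misc" mhf misc >/dev/null

git log -1 --format='%s (parents: %p)' pu   # the merge commit and its parents
```

Checking "git merge-base --is-ancestor mhf pu" (and likewise for misc and master) afterwards confirms that the new "pu" contains every branch involved.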
And things are now ready to be pushed out. First I sanity check the differences between ko-master and master (the earlier "git fetch ko" was done only for this step):
    $ git show-branches master ko-master
... and then run "git push":
    $ git push ko
This will push "master" and "rc" but would fail to push "pu", because that is rebased and not based on what is on the public repository. So I push once more, this time with --force, like this:
    $ git push --force ko pu
This pushes only "pu", ignoring the default refspecs defined in the .git/remotes/ko file you saw earlier. After that, I go back to reading the mail and wait until the kernel.org mirror network catches up.
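
The reason the plain push fails for "pu" can be seen in a small local experiment. In this sketch a bare scratch repository stands in for the public one, and "pu" is rewound and rebuilt between two pushes; the paths, file names and the commit() helper are hypothetical, and the commands are those of a current git.

```shell
#!/bin/sh
# Sketch: a rewound branch is rejected by a plain push and accepted with --force.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare public.git               # stand-in for the public repository
git init -q private
cd private
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"       # placeholder identity for the demo
git config user.email "author@example.invalid"
git remote add ko "$work/public.git"

# Hypothetical helper: write a file and commit it with the given subject.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "base"
git checkout -q -b pu
commit topic.txt "first version of pu"
git push -q ko master pu

git reset -q --hard master                  # rewind pu ...
commit topic.txt "second version of pu"     # ... and rebuild it differently

if git push -q ko pu 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected                     # not a fast-forward of the public pu
fi
echo "plain push: $plain_push"

git push -q --force ko pu                   # overwrite the public pu on purpose
```

Anyone who has based work on the old public "pu" has to cope with the rewind, which is why only a branch that is understood to be unstable should be pushed this way.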