Rebasing Merge Commits in Git

! Please note that this post has not been updated in over 2 years and some content may no longer be relevant.

I’m one of the devs here at Envato, and this is my first post to the Notes blog. Having found myself in a role I would cautiously describe as ‘resident Git expert’, it’s only fitting that my first post would be about a fairly technical aspect of working with Git in a team environment.

The TL;DR version is this: When rebasing, always use the -p flag.

First, though, a small diversion – why rebase is part of my normal git workflow.

Why I started using pull --rebase

Using git pull --rebase is becoming more popular to avoid unnecessary merge commits when fetching the latest code from master. There are a few blog posts on the matter, such as [1] and [2].

I’ll give a brief summary of why this convinced me. For me, it boils down to two simple cases:

1. You haven’t made any changes to your branch

In this case, pull and pull --rebase will simply fast-forward. No problems.

2. You’ve got one or two small changes you forgot to push

In this case, the default pull will actually merge the remote changes into your branch, making a merge commit. This is bad for a couple of reasons. Messiness is one, but I actually consider the problems it causes for git bisect more compelling (I must remember to write about that one day).

With git pull --rebase, you simply replay those commits on top of the new head. Now, if you push, you have linear history, rather than a divergence/merge. I think this is a better result. Usually, I follow the ‘always work on a branch’ and ‘merges are meaningful and good’ practice (partly inspired by [3]), but there’s no semantic difference between master and origin/master, so linear history makes sense.

So in general, git pull --rebase is better than a git pull. To make it the default, see [4].
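
The post behind [4] is not reproduced here, so what follows is only one plausible way to do it, not necessarily what that link recommends. Git has a pull.rebase setting (Git 1.7.9 and later) and a branch.autosetuprebase setting. The function name enable_pull_rebase is made up for illustration; wrapping the commands in a function just means nothing changes until you call it.

```shell
# Hypothetical helper: make `git pull` rebase instead of merge by default.
enable_pull_rebase() {
    # Rebase local commits on top of the fetched branch for every pull.
    git config --global pull.rebase true || return 1
    # Branches created from now on that track another branch
    # get branch.<name>.rebase=true set automatically.
    git config --global branch.autosetuprebase always
}
```

Call enable_pull_rebase once per machine. Dropping --global from both commands limits the change to the current repository.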

There is one major problem with it, though – merge commits.

Rebasing deletes merge commits

This is best explained with an example.

First, one that doesn’t fail:

Given this simple repo:

Initial state

[master] git pull --rebase
First, rewinding head to replay your work on top of it...
Applying: little fix
Applying: forgot to push this

After pull

[master] git push
Counting objects: 6, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (5/5), 526 bytes, done.
Total 5 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (5/5), done.
To /Users/glen/envato/demo-origin
   f9c3cb8..e4a2e92  master -> master

After push

Linear history, just as we wanted!

All aboard the failboat

Say you’ve been working on your little feature for a while, like this:

Initial state

Then you merge to master (using --no-ff, of course [3])

[master] git merge --no-ff feature
Merge made by recursive.
 b |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)
 create mode 100644 b

Merged

Then you go to push, but somebody got in there first (origin/master has moved on):

[master] git push
To /Users/glen/envato/demo-origin
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to '/Users/glen/envato/demo-origin'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes (e.g. 'git pull') before pushing again.  See the
'Note about fast-forwards' section of 'git push --help' for details.

Of course, trying to push hasn’t updated our reference to origin/master. We need to git fetch to see the full picture:

[master] git fetch
remote: Counting objects: 5, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /Users/glen/envato/demo-origin
   49ab1cf..9f3e34d  master     -> origin/master

Fetched

Ah, yes. Someone has pushed a commit, ‘sneaky extra commit’, before we were able to push ours (the merge of the feature branch). Normally we would just git pull --rebase to get ready to push, but if we do that, the merge commit gets deleted!

[master] git pull --rebase
First, rewinding head to replay your work on top of it...
Applying: my work
Applying: my work
Applying: my work

Doom

Our merge commit has disappeared!

This is bad for a whole lot of reasons. For one, the feature commits are actually duplicated, when really I only wanted to rebase the merge. If you later merge the feature branch in again, both the original commits and their rebased copies will be in the history of master. And origin/feature, which is supposed to be finished and in master, is left dangling. Unlike the awesome history that you get from following a good branching/merging model, you’ve actually got misleading history.

For example, if someone looks at the branches on origin, it’ll appear that origin/feature hasn’t been merged into master, even though it has! Which can cause all kinds of problems if that person then does a deploy. It’s just bad news all round.

Worst of all, you did everything ‘right’. You used merge --no-ff and git pull --rebase. Sad face.

In case it’s not obvious, this is what we wanted to happen:

Ideal outcome

You can recover from this situation (if you discover it before you push) by resetting and redoing the merge:

[master] git reset --hard origin/master
HEAD is now at 9f3e34d sneaky extra commit
[master] git merge --no-ff feature
Merge made by recursive.
 b |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)
 create mode 100644 b

The solution!

In the manpage for git-rebase:

-p
--preserve-merges
Instead of ignoring merges, try to recreate them.

This uses the --interactive machinery internally, but combining it with the --interactive option explicitly
is generally not a good idea unless you know what you are doing (see BUGS below).

Or, to put it another way:

AWESOME

So, instead of using git pull --rebase, use a git fetch origin and git rebase -p origin/master:

[master] git fetch
remote: Counting objects: 5, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /Users/glen/envato/demo-origin
   49ab1cf..9f3e34d  master     -> origin/master

Fetched

[master] git rebase -p origin/master
Successfully rebased and updated refs/heads/master.

Ideal outcome

Win!

Downsides

Git pull is dead

The -p flag doesn’t apply to git pull --rebase, so you have to start explicitly fetching and rebasing. To be honest, I think this is more of an upside. Fetching explicitly is good, since it refreshes your entire copy of the remote, and lists what branches have moved on (handy on a fast-moving project). But for those used to a single-step pull, this is slightly more work.
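
Newer Git releases narrow this gap, which matters when reading this section today. git pull accepts --rebase=merges from Git 2.18 on, and git rebase --preserve-merges itself was deprecated in Git 2.22 and removed in Git 2.34 in favour of --rebase-merges. The two options are close but not identical in behaviour, so treat the following as a sketch of the nearest modern equivalent rather than a drop-in for the workflow above. The function name pull_keep_merges is hypothetical.

```shell
# Hypothetical one-step replacement for `git pull --rebase` that recreates
# local merge commits instead of flattening them. Needs Git 2.18 or newer.
# Any extra arguments (remote, branch) are passed straight to git pull.
pull_keep_merges() {
    git pull --rebase=merges "$@"
}
```

For example, pull_keep_merges origin master fetches origin and rebases the current branch onto origin/master while keeping a local --no-ff merge intact.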

ORIG_HEAD is no longer preserved

ORIG_HEAD, once you get used to using it, is really handy to undo a destructive operation. Sadly, git rebase -p sets ORIG_HEAD for each commit being rebased, so you can’t use it to quickly return to the start of a rebase, something I ran into while working on this post.
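
One way around this, offered as a sketch rather than anything from the original workflow, is to mark the starting commit yourself before rebasing. The marker branch name pre-rebase and the function names safe_rebase and undo_rebase are all hypothetical.

```shell
# safe_rebase ARGS...: remember where the branch was, then run git rebase ARGS.
safe_rebase() {
    # -f moves the marker if it is left over from an earlier run.
    git branch -f pre-rebase HEAD || return 1
    git rebase "$@"
}

# undo_rebase: put the current branch back on the remembered commit.
# Run it on the branch that was rebased, once the rebase has finished.
undo_rebase() {
    git reset --hard pre-rebase
}
```

Usage would look like safe_rebase -p origin/master, followed by undo_rebase if the result is not what you wanted. Because the marker is an ordinary branch, it does not depend on what the rebase does to ORIG_HEAD.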

Branch tracking is not used

Unlike git pull --rebase, which will fetch changes from the branch your current branch is tracking, git rebase -p doesn’t have a sensible default to work from. You have to give it a branch to rebase onto. With a good alias, however, that can be made painless.

Aliases

So, how about some aliases to make this all idiot-proof? I’ve decided to call mine gup (I’ve taken to calling it gee-up), and I’ve got it in a gist for bash, fish and zsh here. I’ve also included my gpthis alias for pushing without branch tracking.

gup will do a fetch of origin and rebase -p of the branch on origin with the same name as the current branch. For 99% of cases, this is exactly what I want.
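
The gist itself is not reproduced here, so the following is only a guess at the shape of such a helper, written for a POSIX shell, and not the author's actual code. It assumes the remote is called origin and uses --rebase-merges, the replacement for -p in current Git (-p was removed in Git 2.34); on an old Git, swap the flag back to -p.

```shell
# Sketch of a gup-style helper: fetch origin, then rebase the current branch
# onto the branch of the same name on origin, keeping merge commits.
gup() {
    # Short name of the checked-out branch, e.g. "master".
    branch="$(git rev-parse --abbrev-ref HEAD)" || return 1
    # rev-parse prints the literal string "HEAD" when no branch is checked out.
    if [ "$branch" = "HEAD" ]; then
        echo "gup: detached HEAD, no branch to rebase" >&2
        return 1
    fi
    git fetch origin || return 1
    git rebase --rebase-merges "origin/$branch"
}
```

Keeping the logic behind one name is what makes the habit stable: the flag inside the function can change without changing what you type.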

Conclusion

This morning, I had no idea about the --preserve-merges flag on rebase, and was about ready to cry foul on using rebase at all, considering how bad this problem can get on a big project. But, as with everything Git, once you understand it a bit better, there’s usually a more complex way that sucks a whole lot less. Which is why aliases like gup are really handy – you can keep changing what your aliases mean without having to learn a new habit.

I’d welcome comments and suggestions. You can either reply here or hit me up on Twitter: @glenmaddern

  • http://darwinweb.net/ Gabe da Silveira

    You mentioned you need to write about the problems with merge commits and git-bisect. That was my primary point at http://darwinweb.net/articles/the-case-for-git-rebase

    FWIW, I’ve come to the conclusion that I prefer rebasing whenever convenient (ie. for topic branches too) because if a bug shows up later, I’d rather have git-bisect show me exactly where it is introduced. The fact that some series of commits was once a topic branch is obfuscated by rebasing, but there are still clues in the dates of the commits. It’s true that this may make it harder to piece together developer intent after the fact, but in practice I haven’t found it worth it to perform the mental exercise of reconciling a bunch of branches to figure out what someone was thinking.

    • http://www.marnen.org Marnen Laibow-Koser

      Git bisect shows you exactly where a bug was introduced even with merge commits — at least, it’s always done so for me.

    • http://darwinweb.net Gabe da Silveira

      @Marnen, no it doesn’t. If you have a bug caused by the combination of changes in two branches, git-bisect will indicate the merge commit as having caused the bug which is not as fine-grained as indicating one or the other commits. Rebasing is basically finer-grained conflict resolution. Go read my article.

    • http://www.marnen.org Marnen Laibow-Koser

      I’ll run some tests, but I think that either you’re wrong or there’s something wrong with your Git setup. If git bisect sees a merge commit, it looks for a common merge ancestor and tests on all branches. I don’t think I’ve ever seen it mark a merge commit as bad. And that’s as it should be.

    • http://sfskaran.github.com/ Sathya Sekaran

      Merge commits can indeed be marked as bad by git bisect when they contain changes within them (a.k.a. conflict merge commits).

    • http://www.marnen.org Marnen Laibow-Koser

      Sathya: Good point. I’ve never had that happen, but that seems like the correct behavior. Note that in this case, I think rebasing would simply sweep the problem under the rug, not actually help: the problem is really that someone did a bad merge resolution, and if you’re rebasing, you don’t even know that that happened.

    • http://darwinweb.net Gabe da Silveira

      @Marnen – No, it doesn’t sweep the problem under the rug, what it does is give you finer grained visibility into the cause of the bug. The case we are talking about, both branches are independently correct, but together they introduce a bug. That means the bug is introduced by whoever is doing the merge/rebase. If a merge is done then git bisect will correctly point to the merge commit as being the first commit where the bug occurred, but the merge commit may have hundreds or even thousands of changes in the case of large topic branches. If a rebase is done from one branch onto the other (it doesn’t matter which way), then you will see precisely in one of the original commits where it was incompatible with the other branch. This is far more useful than preserving the original branch/merge structure because you still have all the original commits and their timestamps which allows you to infer most anything you could want to know about the state of the tree and what the developer was thinking at the time. I’ve spent several years using both styles, and I can tell you that I’ve never been at a loss for information in rebased history; the main place to avoid rebasing is on published commits because it creates hassles for other developers.

    • http://www.marnen.org Marnen Laibow-Koser

      Gabe:

      > If a rebase is done from one branch onto the other (it doesn’t matter which way), then you will see precisely in one of the original commits where it was incompatible with the other branch.

      True, but at such a high cost that I can’t recommend it as a good practice. The reason: you can’t back out a rebase easily, whereas you can back out a merge, and if the two branches are incompatible, you will need to back out one or the other. Also, rebasing camouflages the fact that these were two independent branches, and IMHO inappropriately serializes the history. This is important because if the two branches are incompatible, then to resolve the problem, it’s absolutely crucial to know where one branch ends and the other begins. By rebasing, you’ve lost that information, probably forever. (One of Git’s strengths is its ability to correctly represent and analyze nonlinear history, and rebasing largely breaks that.)

      In your scenario, each branch is correct as it stands. The problem comes from the interaction of the two branches…

      > This is far more useful than preserving the original branch/merge structure because you still have all the original commits and their timestamps which allows you to infer most anything you could want to know about the state of the tree and what the developer was thinking at the time.

      No, I think you’ve got it backwards. Why “infer” when you could have Git do the work for you? I want the branch structure to show me the state of the tree — I don’t want to have to go looking at timestamps.

      One of my cardinal rules in dealing with Git: Git is smart. Respect its intelligence. Don’t make it stupid. If you have to flatten and then mentally reconstruct the branch structure, then I think you’re making Git stupid — and working too hard yourself.

    • http://darwinweb.net Gabe da Silveira

      @Marnen – It’s clear that you did not read my article or you did not grok it if you did. I’ve been using git extensively for hundreds of projects over the past 6 years, including a 5-year, 100,000 LOC project with both merge and rebase workflows. I’ve put in the time and practice to experience what the real world tradeoffs are, something that based on the ignorance displayed in some of your comments you have not done. I’ll reply point by point to try to give a clear picture of my position, but I’m dubious whether you will really hear what I’m saying…

      > you can’t back out a rebase easily, whereas you can back out a merge

      This is nonsense. If you need to back out a merge or a rebase it is merely a different set of commands, and if you’ve pushed in either case you are rewriting history. You can select the original branch commits for rebasing just fine by referring directly to their sha1 range. Having an explicit branch to look at is slightly more visually obvious, but in practice you can always tell topic branches from the timestamp discontinuities. If you’re rewriting history you have to be looking carefully at what you’re doing anyway because there’s no guarantee that a critical bug fix wasn’t included in the topic branch.

      > Also, rebasing camouflages the fact that these were two independent branches, and IMHO inappropriately serializes the history.

      Branches in git are so lightweight that they function perfectly well as ad-hoc organization for a single developer. Once you embrace the ease of moving commits around in git, the sanctity of a branch starts to evaporate. For instance, I often begin development of a feature on master locally, then get interrupted for a bug fix or something, at which point I checkout a new branch, reset --hard master to origin/master, commit my fix, and then rebase the topic branch. None of the rest of the team even knows I did this, and if there hadn’t been a bug to fix, this all would have occurred on master. The fact that I created a branch at some point does not matter.

      > This is important because if the two branches are incompatible, then to resolve the problem, it’s absolutely crucial to know where one branch ends and the other begins.

      No it’s not, you’re begging the question. It’s perfectly adequate to see the sequence of changes as applied to the working software and the times at which those commits were authored. That is plenty of information to reconstruct the mindset of the developer at the time. Having the exact state of the repository is of dubious value because neither you the debugger nor the developer at the time have a perfect mental model of the entire project anyway. Analyzing the effect of changes is far more effective than recreating prior states perfectly.

      > By rebasing, you’ve lost that information, probably forever. (One of Git’s strengths is its ability to correctly represent and analyze nonlinear history, and rebasing largely breaks that.)

      There’s nothing wrong with losing some amount of history. Rebasing does not throw away any actual developer commits, and you don’t need to save every bump and wart along the way. Another, even more clear cut case where rebasing is beneficial is where you make a commit but make a typo that breaks the build or have a failing test that you missed from some unexpected area. These types of things can break git-bisect and they are just noise. Cleaning up small errors is a no-brainer for increasing the signal to noise ratio in the repo history.

      The purpose of version control is a well-organized and navigable history of changes to a project. Ideally you want small atomic commits with very clear and apt commit messages describing the purpose and content of each commit. Git’s power to rewrite history, when used judiciously can allow you to approach this ideal without fretting over each and every commit as you make it. When you truly grok git it lays bare the arbitrary nature of the decision to commit or to branch, and it opens up possibilities for optimizing the final recorded history to be as clear as possible.

      > No, I think you’ve got it backwards. Why “infer” when you could have Git do the work for you?

      Because you lose bisectability. Again, *go read my blog post*.

      > I want the branch structure to show me the state of the tree — I don’t want to have to go looking at timestamps.

      Well I hate to break it to you, but you might have to read timestamps anyway, because the repo is not the be-all end-all of context about some line of development. You very well may need to go back and investigate the ticketing system and email archives, or even talk to the developers in question to piece together what caused the bug and the best way to fix it. The branch history is of marginal value to begin with, but if there are a lot of branches then it can quickly become totally unwieldy (eg. http://img.skitch.com/20100424-fujrjnxfh23akyb41kd45cgwpm.jpg). And as I’ve illustrated above, the branch may or may not have existed depending on the team’s chosen workflow anyway. In practice I’ve found time after time that a linear history is a perfectly reasonable way of visualizing and inspecting the development history of a project. I’ve never looked at a sequence and wondered “was this a branch or not?” or “what was the parent of this commit when it was originally written?”. These things simply don’t matter.

      > One of my cardinal rules in dealing with Git: Git is smart. Respect its intelligence. Don’t make it stupid.

      Okay this one is pretty hilarious. You do know that git means stupid right? I mean the name was chosen specifically because of git’s simplicity and attempts not to be too smart. For instance, git flags conflicts very conservatively instead of trying fancy algorithms because Linus believes that less is more. The simplicity of git internals is what makes it possible to take branches so lightly, and construct an optimal history via interactive rebasing and other techniques.

      > If you have to flatten and then mentally reconstruct the branch structure, then I think you’re making Git stupid — and working too hard yourself.

      I hate to say this, but you’re doing it wrong. You don’t need to reconstruct anything. A linear history works fine as a model of project history. Think of it like the Linux kernel itself: when you look at that source, you want to know what order patches were applied to the master branch; this represents the de facto development of the project. You don’t care that there were 1603 topic branches alive around the world at this time last month. Being able to see this structure is a curiosity, but it is not very useful, let alone necessary.

      I hope you can read this reply carefully and open your mind a little bit, but in any case it’s not my job to combat willful ignorance, so this will be my last word on the matter.

    • http://www.marnen.org Marnen Laibow-Koser

      Gabe:
      > It’s clear that you did not read my article or you did not grok it if you did.

      I have read your article several times. Whether I fully understood it is an open question. That’s why I’m having this discussion.

      > I’ve put in the time and practice to experience what the real world tradeoffs are, something that based on the ignorance displayed in some of your comments you have not done.

      In fact, I have. This is why I have come to the conclusions that I currently hold (I used to use rebase quite a lot).

      Anyway, argument from authority (which is essentially what it looks like you’re doing here) doesn’t impress me at all. Argument from facts and logic does.

      > > you can’t back out a rebase easily, whereas you can back out a merge
      > This is nonsense.

      Really? How do you back out a bad rebase? I’m always willing to learn new Git tricks.

      With a merge, you just do git reset --hard HEAD^ (assuming you remembered to do --no-ff). I can’t think of anything comparable for rebase.

      > If you’re rewriting history you have to be looking carefully at what you’re doing anyway because there’s no guarantee that a critical bug fix wasn’t included in the topic branch.

      But why are you rewriting history in the first place? To me, that’s the hallmark of a broken Git workflow.

      > For instance, I often begin development of a feature on master locally, then get interrupted for a bug fix or something, at which point I checkout a new branch, reset --hard master to origin/master, commit my fix, and then rebase the topic branch.
      [...]
      > None of the rest of the team even knows I did this, and if there hadn’t been a bug to fix, this all would have occurred on master.

      Wait a minute…why would it all have occurred on master? At least in my practice, master is only for integration. Except for quick bug fixes, there’s nothing on master but merge commits. The first thing I do when I start a new feature is make a branch — precisely so I can do it without disturbing master. IMHO master should be deployable at all times.

      The idea is that each feature’s code is independent. We can merge it into master when we’re ready, not before. We can pick and choose which features make it into master once those features are in a state where they’re ready to merge.

      However, if for some reason I had to do what you were describing, I think I’d merge master into the topic branch instead of your last step of rebasing.

      > There’s nothing wrong with losing some amount of history.

      Really? I’d say that a VCS that loses history isn’t really doing its job.

      > Rebasing does not throw away any actual developer commits, and you don’t need to save every bump and wart along the way.

      The more I use Git, the less I agree with that last statement. You don’t need to save every bump and wart, perhaps, but you don’t know which ones are going to be important later as atomic commits, and so you shouldn’t throw the history away prematurely.

      > Another, even more clear cut case where rebasing is beneficial is where you make a commit but make a typo that breaks the build or have a failing test that you missed from some unexpected area. These types of things can break git-bisect and they are just noise.

      If I understand your scenario correctly, to me those are not just noise. They are substantive changes and deserve to be preserved with their commit structure.

      > Well I hate to break it to you, but you might have to read timestamps anyway, because the repo is not the be-all end-all of context about some line of development. You very well may need to go back and investigate the ticketing system and email archives, or even talk to the developers in question to piece together what caused the bug and the best way to fix it.

      But the repo is the be-all and end-all of context about *the code* in some line of development. It is appropriate to keep lines of development navigable and well organized. That means merge commits.

      > You do know that git means stupid right? I mean the name was chosen specifically because of git’s simplicity and attempts not to be too smart.

      Of course I know that. The very stupidity of Git’s internals, however, has made it possible for extremely smart history analysis tools to exist — if you structure your repository in a way that doesn’t break them. I’m lazy. I don’t want to do manually what Git will already do for me.

      > The simplicity of git internals is what makes it possible to take branches so lightly, and construct an optimal history via interactive rebasing and other techniques.

      IMHO the optimal history is the one that actually occurred. Doing otherwise makes it harder to figure out the logical progression in the line of development.

      > I hate to say this, but you’re doing it wrong. You don’t need to reconstruct anything. A linear history works fine as a model of project history.

      I think that if you believe that, you don’t grok the amazing understanding of your project that a nonlinear history can give you. It’s useful to see the various lines of development merging and branching. That’s information that IMHO should not be thrown away.

      > you want to know what order patches were applied to the master branch; this represents the de facto development of the project. You don’t care that there were 1603 topic branches alive around the world at this time last month.

      No, but if I’m tracking down a bug in a topic branch, I want to be able to see that branch as its own line of development, not as a series of undifferentiated commits from a rebase that I have to wade through timestamps to track down. It really is useful to maintain the separate branches. It is not merely a curiosity.

      > I hope you can read this reply carefully and open your mind a little bit, but in any case it’s not my job to combat willful ignorance

      I’m not being wilfully ignorant, and my mind is open to better methods of doing things. I just don’t think that your method is better, and I hope I have logically explained why I think this.

      I think part of this is a different definition of “better”: you appear to be optimizing your Git workflow with very different values than I am, and throwing away as “noise” many things I consider of paramount importance.

    • http://darwinweb.net Gabe da Silveira

      Thought experiment: After every character you type a commit is made to the repo. Is that better than atomic commits of functional changes?

      When commits happen is a human decision just as using rebase to reorganize history. This idea of “true history” is a red herring. All version control history is a human fabrication. Git gives you more ability to reorganize commits to perfect the published history for future reflection. You can have a policy that no one makes a commit without starting a branch, and master is only merge commits, and that workflow may give you a certain comfort when examining history, but what I’m telling you is that it’s not necessary—a linear composition of commits creates a perfectly serviceable history.

    • http://www.marnen.org Marnen Laibow-Koser

      Gabe:
      > When commits happen is a human decision just as using rebase to reorganize history. This idea of “true history” is a red herring. All version control history is a human fabrication.

      I don’t see how you can say this with a straight face. True, we sometimes break up a big change into several smaller commits for organizational reasons, but generally the commit history reflects the order that things actually happened — at least it does so with my workflow. And that’s the way it should be. I want to be able to look at a branch and know pretty much exactly what the developer did, and in what order.

      > Git gives you more ability to reorganize commits to perfect the published history for future reflection.

      Just because it gives you that power doesn’t mean you should use it as you’re describing. I’d get very cranky if someone tried to “perfect the published history” on a project I was working on by reorganizing commits.

      > You can have a policy that no one makes a commit without starting a branch, and master is only merge commits, and that workflow may give you a certain comfort when examining history, but what I’m telling you is that it’s not necessary—a linear composition of commits creates a perfectly serviceable history.

      And what I’m telling you is that this is *not* a “perfectly serviceable” history. IMHO Git is at its best when it’s not restricted to a linear history, because it can reason about the branching and merging. If you serialize the history, you lose these features.

  • http://jasoncodes.com/ Jason Weathered

    Great post. The sooner we can get everybody to rebase their commits properly before pushing, the better.

    I posted my own version of `gup` a little while ago at http://jasoncodes.com/posts/gup-git-rebase which also handles stashing any local unstaged changes before rebasing. Prompted by your post, I have just added a note about merge commits.

    Cheers.

  • http://www.coderintherye.com K

    Nice work, gives me more confidence towards moving off svn and onto Git soon.

    • S

      Sounds like you don’t trust Git too much. Keep in mind that even though this annoyance that he’s describing is a legitimate concern, it’s one of those problems that come with a system that satisfies much more than SVN could ever. Git’s much more reliable in a million ways, and I could honestly never see myself going back.

  • http://bramcohen.com/ Bram Cohen

    You do realize that you’re going through a huge amount of work to make Git behave an awful lot like CVS.

    Not that this is wrong. It’s a perfectly reasonable thing to do. But it seems like the fundamental data model which all modern VCSs use doesn’t quite map to what people really want to do.

    • Sathya Sekaran

      I don’t see how he’s trying to make Git act more like CVS.

      But I can see why you’d say that. Yes, rebasing before pushing is something Git users do to keep their local history clean (somewhat like CVS). However, it’s just a courteous thing to do, like making sure a UI drawing has clean lines. It does have its tangible benefits (easier bisecting, easier reverting if necessary), but in the end I do it mainly because it’s easier to read and understand the commit history.

      In this blog post, he’s trying to make sure that the merge commit _does_ exist (which CVS/SVN do not have) despite the intermediate commit made before pushing. Git users may be trying to make sure the history is clean and easy to read and understand, but they don’t want it to be linear and have no branch history like CVS either. There’s a balance in there somewhere that I suspect users strive for if they use Git a lot.

      If you don’t care what happens to the history as long as your changes are recorded either way (if you treat it like CVS), then this becomes a non-issue.

      I’ve introduced Git to two companies and taught it to several individuals so far, and usually at first people don’t care about the commit history. They treat it like CVS. But when someone starts enjoying Git, they start paying more attention to what happens before they push a bunch of commits to the central repository. For me it’s also a pride thing.

      That’s why there are tools such as git-flow to help people like myself feel like we’re keeping a clean-looking commit history.

    • http://www.marnen.org Marnen Laibow-Koser

      I don’t think it’s courteous or advisable to keep a Git history “clean”. Every time I’ve heard someone advocate a “clean” commit history, they seem to do so by sweeping things under the rug and creating a distorted picture of what happened.

      I think it’s most courteous to future users to have the commit history actually reflect what happened. That means no lying. git rebase master is a lie, because it claims you branched off master where you didn’t. Don’t lie and dumb down the commit history. Assume your colleagues are smart. Assume Git is smart. Stay truthful. Don’t rebase.

    • http://sfskaran.github.com/ Sathya Sekaran

      @Marnen: You absolutely have good points. However, if we assume our colleagues are smart, and some of them like rebasing, then logically we must admit there are merits alongside those detractions. It’s much less black and white than you make it seem. And much less of a problem, in my opinion. I’ve taught git to tens of developers, and I’ve never seen a time when rebasing was a detraction to understanding the history. In fact, like Gabe da Silveira said, rebasing even helps git-bisect work better.

      The moral is: it doesn’t need to be used, but it’s not a horrible thing when used correctly.

      Just don’t teach it to fledgeling svn -> git users. They’ll inevitably use it for evil. It’s most certainly something you understand better once you’ve been using git for a few months and you’re more familiar with how non-linear history works.

  • http://makeleaps.com Paul Oswald

    You mention (and link to a blog post talking about) using the --no-ff switch to preserve the branch. Have you experimented at all with using --squash on merge? In this way you can add the commit into your master as a single commit.

    • http://envato.com John Barton

      That can give you a problem when trying to git bisect your way back to when a problem was introduced.

      Often these merges represent over a week of work, and having the ability to narrow down where a bug came from beyond “it’s a problem from that project we merged in last week” is pretty important.

    • http://www.marnen.org Marnen Laibow-Koser

      The problem with merge --squash is that it changes history and makes Git stupid. It breaks the link between commits on the branch and the commit in master, so you can’t use any of Git’s history analysis tools (such as branch --contains) to figure out if a commit was merged in. It also breaks git bisect.

      --no-ff, on the other hand, gives you the encapsulation of merge --squash, but without breaking history. It is to be preferred in every case.

      If you’re using merge --squash, then you’re using Git like an inferior Subversion. In fact, Subversion handles this use case better than Git does.
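
A minimal sketch of the difference described above, assuming a throwaway repository and hypothetical branch names (`topic`, `squashed`, `merged`). It uses `git merge-base --is-ancestor`, which answers the same question as `branch --contains` for a single branch.

```shell
#!/bin/sh
# Sketch only: compare --squash and --no-ff in a temporary repository.
# Branch names and the demo identity below are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial"
main=$(git symbolic-ref --short HEAD)   # master or main, depending on config

# A topic branch with a single commit.
git checkout -q -b topic
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
topic_commit=$(git rev-parse HEAD)

# Variant 1: squash the topic branch into a fresh branch.
git checkout -q -b squashed "$main"
git merge -q --squash topic
git commit -q -m "squashed topic"

# Variant 2: merge the topic branch with an explicit merge commit.
git checkout -q -b merged "$main"
git merge -q --no-ff -m "Merge branch 'topic'" topic

# Is the original topic commit reachable from each result?
in_squashed=no
if git merge-base --is-ancestor "$topic_commit" squashed; then in_squashed=yes; fi
in_merged=no
if git merge-base --is-ancestor "$topic_commit" merged; then in_merged=yes; fi
echo "reachable from squashed: $in_squashed"
echo "reachable from merged:   $in_merged"
```

Only the `--no-ff` result keeps the topic commit in its ancestry, which is what tools like `branch --contains` and `git bisect` rely on.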

  • http://alanhogan.com/ Alan Hogan

    Thanks for this helpful post. I ran into one problem when using your Bash commands from https://gist.github.com/590895, though: I get

    -bash: git_current_branch: command not found

  • http://lucisferre.net Chris Nicola

    Buyer beware, I did this with a large complicated merge and it created a mess of conflicts to resolve during interactive rebasing that didn’t seem to make any sense. I’m not sure I would recommend this over simply doing the pull as a merge instead of a rebase.

    • http://lucisferre.net Chris Nicola

      Ignore my ranting, it worked fine the second time, must have done something wrong the first time around. Still a small problem is that it appears to just be re-merging, and not applying any changes made in the merge. RERERE helps a bit, but if you made changes that were not part of conflict resolution you are SOL.

    • http://fedupwiththismovie.com Sathya Sekaran

      Yeah, when the chain gets way too complicated to rebase, that’s when pulling is more equitable. I never bothered with rerere.

    • http://www.marnen.org Marnen Laibow-Koser

      rerere is a big help in cases like these. Just turn it on globally — it makes life so much easier.
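
As a rough sketch, turning rerere on comes down to a single config setting. The global form appears only as a comment here so the example leaves your real configuration alone; the throwaway repository is an assumption made for the demo.

```shell
#!/bin/sh
# Sketch only. The global form suggested in the comment above would be:
#   git config --global rerere.enabled true
# To keep this demo self-contained, enable it in a temporary repository instead.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config rerere.enabled true

# Read the setting back to confirm it took effect.
rerere_setting=$(git config --get rerere.enabled)
echo "rerere.enabled=$rerere_setting"
```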

  • http://zanematthew.com/ zane matthew

    I’ve been on the fence about merge with --no-ff, because some say it’s really not that important to keep track of those topical branches.

    And it gets harder to maintain them once you introduce a bug/issue tracking system, reason being branch naming convention takes on the following:

    issue-X or bug-x, or issue/x, bug/x i.e. issue-123 or bug-8, issue/123 bug/8

    Really the only important thing is having a clean history and not being bothered with what was on what branch.

    Do you branch using bug/issue numbers or just with a descriptive name?

    Honestly I’m on the fence between nested branches, merges with --no-ff and naming branches based on issue/bug numbers.

    • http://www.marnen.org Marnen Laibow-Koser

      --no-ff is crucial. When you merge into master, you need to encapsulate the branch into a merge commit. This makes it possible for you to easily see what came from which branch. It also means that if you want to roll back the merge, you only have one merge commit to get rid of.

      The idea that “not being bothered with what was on what branch” is a good thing is exactly backwards, I think. If a bug is introduced, it is nice to be able to see in what context it was introduced. I don’t see what you gain by ignoring history, and you lose a lot (since master becomes a jumble of unrelated commits…).

      As for issue tracking, how does --no-ff make it harder to maintain topic branches in a bug tracker? You can edit the merge commit message as you like. (For the record, I *do* try to put issue numbers in my branch names.)

      There are many advantages to using --no-ff, and no particular disadvantages that I can see. There are no advantages that I can see to *not* using --no-ff, and many disadvantages.

      If you can go into more detail about the specific problems you’re having, I’d be interested to see what I can come up with.
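
One way to picture the “only one merge commit to get rid of” point is the sketch below. It assumes a throwaway repository, a hypothetical two-commit `topic` branch, and `git revert -m 1` as the rollback mechanism, which is one option among several and not necessarily what the commenter had in mind.

```shell
#!/bin/sh
# Sketch only: roll back a whole topic branch by reverting its --no-ff merge commit.
# Repository, branch and file names are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial"
main=$(git symbolic-ref --short HEAD)

# Topic branch with two commits.
git checkout -q -b topic
echo "one" > a.txt
git add a.txt
git commit -q -m "topic: add a"
echo "two" > b.txt
git add b.txt
git commit -q -m "topic: add b"

# Encapsulate the branch in a single merge commit on the main branch.
git checkout -q "$main"
git merge -q --no-ff -m "Merge branch 'topic'" topic

# Undo both topic commits at once; -m 1 keeps the main-branch side of the merge.
git revert --no-edit -m 1 HEAD > /dev/null

# First-parent history of the main branch: initial, merge, revert.
mainline_commits=$(git rev-list --count --first-parent HEAD)
echo "mainline commits: $mainline_commits"
```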

  • http://www.marnen.org Marnen Laibow-Koser

    In general, I don’t like rebasing, because it changes history. When there are new commits to master that aren’t present in my branch, I prefer to merge master into my branch (rebase would also work), and only then merge back into master. This also gives me an opportunity to test the codebase locally before pushing it.

    I don’t think your rebase solution works nearly as well here, because it obscures the history of what *actually* descended from what. If a bug were introduced in one of those commits, it would be harder to track it down.

    • http://zanematthew.com/ Zane Matthew

      Wait, don’t you unit test before/after each commit/merge ;)

      umm….I don’t, lol.

    • http://www.marnen.org Marnen Laibow-Koser

      Yes, of course I test religiously. But bugs do creep in, and git provides great tools for tracking them down — if you don’t confuse it by changing history.

    • Leif Gruenwoldt

      Just like in real life, some history is useful to retain and some is not. Major events are shared in the news and minor ones are forgotten. The same is a good strategy for git history. Retain what is useful to share and let go of the rest. What is the point of retaining history if the important events are lost in the noise?

      I simply don’t care to know a fellow developer pulled from origin/master at 3:32pm today and again at 4:15pm and it created two merge commits before he pushed his work. But I do care that his commit(s) apply cleanly to origin/master and in a way that I can inspect simply.

    • http://www.marnen.org Marnen Laibow-Koser

      I think you’re being rather shortsighted. The problem is that history, once gone, is gone forever. It is all too common for a commit that seems like noise now to be useful next week. We don’t know in advance which commits will prove useful. Therefore, we must treat all commits as potentially useful.

      Besides, while you may not care when someone else merged, each merge is a synchronization point. The presence of more merge commits helps the patch apply cleanly — and gives you an intermediate point to fall back to if it doesn’t. This also helps git bisect work more easily.

    • http://darwinweb.net Gabe da Silveira

      @Marnen – You have a poor understanding of how git-bisect works. Merge commits make git-bisect work worse, not better. Again, please go read my article.

  • Jagan

    Hi,

    I have two branches, G1 and G2. I did the following steps for merging:

    $ git checkout -b G1 origin/G1
    $ git checkout -b G2 origin/G2

    $ git merge G1

    I have 4 commits along with merge commit.

    But when I create a patch using
    $ git format-patch -M -4
    It will not create a patch for merge commit.

    Can anyone help me out with this issue?

    Regards,
    Jagan.

  • Thys Swart

    I reckon you don’t need to be cautious when describing yourself as ‘resident Git expert’. Awesome article. Well written.

  • http://tridnguyen.com Tri Nguyen

    I know this was years ago, but here’s a quicker way to get the current Git branch instead of piping it to `sed`.

    git symbolic-ref --short -q HEAD

    Also, the images on this article are broken right now. Any chance the author could come in and fix that?
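
A possible shape for a `git_current_branch` helper built on that command is sketched below. The function name comes from the error message quoted in an earlier comment; the body is an assumption, not the gist’s actual code, and the demo repository and branch name are placeholders.

```shell
#!/bin/sh
# Sketch only: a hypothetical git_current_branch helper based on symbolic-ref.
# With -q it prints nothing and returns non-zero when HEAD is detached.
git_current_branch() {
  git symbolic-ref --short -q HEAD
}

# Demo in a temporary repository with a placeholder branch name.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name="Demo" -c user.email="demo@example.com" \
  commit -q --allow-empty -m "initial"
git checkout -q -b demo-branch

current=$(git_current_branch)
echo "current branch: $current"
```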

  • http://www.orchit.de Patrick Cornelißen

    Nice article, but the images are dead.

    All you get is for example:

    AccessDenied
    Access Denied
    9A7015FC4901E67E

    GHRZBlXEJEl94UGYUBN6x4Hn6gcsVfVqy5/TPil6rXUQpJ0s5vGZFFBJSPx4wViF

    • http://envato.com Adrian Try

      Apologies, Patrick. Unfortunately that brilliant article is a few years old now, and Glen isn’t working with us now. Looks like he hotlinked to images in his Droplr account, and they no longer exist.

  • Stephen Haberman

    When the author says “Branch tracking is not used”, meaning you must always type out “origin/master”:

    git rebase -p origin/master

    You can actually use a special ref, @{upstream}, so:

    git rebase -p @{upstream}

    Or just @{u} for short:

    git rebase -p @{u}

    This is much more amenable to scripts/aliases/etc.
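
To illustrate what `@{u}` resolves to, here is a small sketch that assumes a local “origin” repository and a clone of it, both created in a temporary directory. The rebase itself appears only as a comment, because newer Git releases have dropped `-p` in favour of `--rebase-merges`.

```shell
#!/bin/sh
# Sketch only: show that @{u} names the upstream of the current branch.
# The "origin" and "clone" directories are placeholders created for the demo.
set -e
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" -c user.name="Demo" -c user.email="demo@example.com" \
  commit -q --allow-empty -m "initial"
git clone -q "$work/origin" "$work/clone"
cd "$work/clone"

branch=$(git symbolic-ref --short HEAD)
upstream=$(git rev-parse --abbrev-ref '@{u}')
echo "$branch tracks $upstream"

# The command from the comment above would then be:
#   git rebase -p '@{u}'
# or, on Git versions without -p:
#   git rebase --rebase-merges '@{u}'
```

Quoting `'@{u}'` is not required in most shells, but it keeps the braces from being interpreted in scripts and aliases.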

  • Dzmitry
    • http://envato.com Adrian Try

      Hi Dzmitry. Apologies. The article is a few years old, and Glen is no longer working with us. It looks like he hot linked the images to Droplr, and they don’t exist any more.