I have a feature branch A. Then I start developing a second feature dependent on A, so I base my new feature branch B on A:
git checkout A
git checkout -B B
I do some work on B, so now B has commit 1 (from A) and the new commit 2. Our company always squashes the commits of a PR together as much as possible, so at some point I force-push A so that it contains only commit 1'. Now I want to rebase B onto A (or onto master, after A is merged), but since I force-pushed A, Git tries to apply commit 1 again, which obviously fails.
I know two ways to solve this, neither of which is great:
using git cherry-pick:
git checkout B
git checkout -B B2
git log  # copy the latest commit ID
git checkout B
git reset --hard A
git cherry-pick <commit-id>
using soft reset:
git checkout B
git reset --soft HEAD~1
git stash
git reset --hard A
git stash pop
git commit -a -m "msg"
Is there a "git method" for sorting this out? I know that it's maybe not best practice to always squash the commits, but that I can't change. Or is there maybe a better way to base one branch on another?
Ultimately, you'll want git rebase --onto. Sometimes you won't need to do anything special, though.
Let's draw your initial situation:
...--A--B <-- master
         \
          C <-- feature/A
           \
            D <-- feature/B
That is, there's some series of commits on some main line (I've called it master here but it might be develop or whatever), then one commit on your feature/A, then one commit on your feature/B. The parent of commit D, on your feature/B, is your commit C, on both feature/B and feature/A.
Somewhat later, you've added a second commit to your feature/A, giving:
...--A--B <-- master
         \
          C--E <-- feature/A
           \
            D <-- feature/B
Eventually, feature/A is to be merged to master, and per some policy rule, you've made a new commit F that is the combination of C and E so that you now have:
          F <-- feature/A
         /
...--A--B <-- master
         \
          C--E [abandoned]
           \
            D <-- feature/B
At this point you'd like to copy D to some new commit D' that looks exactly like D in terms of its diff against its parent, but whose parent is F instead of C.
Git offers an easy-ish way to get what you want:
git checkout feature/B
git rebase --onto feature/A something-goes-here
The problem is the something-goes-here part. What goes there?
The git rebase command is, essentially, just a series of git cherry-pick commands followed by a branch label motion. As you've already discovered, git cherry-pick does what you want: it copies a commit. In fact, it can copy more than one commit (using what Git calls, internally, the sequencer).
That is, it compares each commit to be copied to the commit's parent, to see what changed. Then, it makes those same changes to the current commit, and if all goes well, commits the result.
For instance, let's start with this situation. I've stuck in a new label, saved-A, for the moment, to remember commit E, and I've added the name new-B and added HEAD in parentheses to show that the current branch is new-B and the current commit is commit F:
          F <-- feature/A, new-B (HEAD)
         /
...--A--B <-- master
         \
          C--E <-- saved-A
           \
            D <-- feature/B
We can now run git cherry-pick feature/B. We're telling Git: Compare commit D to its parent C, then make the same changes to where we are now, at commit F, and commit the result. If all goes well, we get:
            D' <-- new-B (HEAD)
           /
          F <-- feature/A
         /
...--A--B <-- master
         \
          C--E <-- saved-A
           \
            D <-- feature/B
All we need to do now is yank the name feature/B over to point to commit D', and then drop the name new-B:
            D' <-- feature/B (HEAD)
           /
          F <-- feature/A
         /
...--A--B <-- master
         \
          C--E <-- saved-A
           \
            D [abandoned]
Again, the first part of this is just what git cherry-pick does: copy one commit. The last part of this is something that git rebase does: move a branch label like feature/B.
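Here's a rough sketch of those two parts done by hand in a throwaway repository. Everything in it is an assumption made for illustration: the temporary directory, the placeholder identity, the file names, the hypothetical commit helper, and the choice to produce F with git reset --soft followed by git commit.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: append <message> to <file> and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt A
commit base.txt B
git checkout -q -b feature/A
commit a.txt C
git checkout -q -b feature/B
commit b.txt D
git checkout -q feature/A
commit a.txt E
git branch saved-A                     # remember commit E

# Squash C and E into a single commit F on top of master.
git reset -q --soft master
git commit -q -m F

# Part one: copy D, which is what git cherry-pick does.
git checkout -q -b new-B               # new-B and feature/A both name F
git cherry-pick feature/B >/dev/null   # compare D to C, replay onto F as D'

# Part two: move the branch label, which is what git rebase does.
git branch -f feature/B new-B          # yank feature/B over to D'
git checkout -q feature/B
git branch -d new-B                    # drop the temporary name

git log --oneline --graph feature/B saved-A
```

The old commit D is still in the repository afterwards; it just has no branch name pointing at it any more.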
The key here is that git rebase copies some commits. Which ones? The default answer is the wrong answer for you!
What git rebase does, in a nutshell
Let's look at a slightly different drawing:
...--A--B <-- target
      \
       C--D--E <-- current (HEAD)
Here, we are "on" branch current, i.e., git status will say on branch current. The tip commit of current is commit E: E's hash ID is the hash ID stored in the name refs/heads/current.
If we now run:
git rebase target
Git will copy commits C-D-E to new commits C'-D'-E' and place the new commits atop target and then move the branch name, like this:
          C'-D'-E' <-- current (HEAD)
         /
...--A--B <-- target
      \
       C--D--E [abandoned]
That's usually what we want. But: How did git rebase know to copy C-D-E but not to copy A too?
The answer is that git rebase uses Git's internal "list some commits" operation, git rev-list, with a stop point. The rebase documentation claims that what git rebase does is run:
git rev-list target..HEAD
which is a bit of a white lie: it's close enough, and illustrative. The exact details are trickier and we'll get there in a bit. For now, let's look at the target.. part of target..HEAD. This tells Git: don't list any commits that you can find by starting at the target and working backwards.
Since target names commit B, that means: don't copy commit B. Well, we already weren't going to copy commit B, so no big deal. But it also means: don't copy commit A. Why not? Because commit B points back to commit A. Commit A is on both branches, target and current. So we would have copied A, but we don't, because it's on the do not copy list. There are commits before A too, but they are all on the do not copy part, so none of them get copied.
Hence it's commits C-D-E that get copied here: they are on the to-copy list, and not stopped-out by being on the do-not-copy list.
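To see the to-copy list for yourself, here's a small sketch that builds the drawing above in a scratch repository and asks git rev-list directly. The temporary directory, placeholder identity, file names, and the commit helper are all made up for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/target   # name the initial branch explicitly
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: append <message> to <file> and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit f.txt A
git branch current          # current forks off at commit A
commit f.txt B              # target now names commit B
git checkout -q current
commit g.txt C
commit g.txt D
commit g.txt E

# Commits reachable from HEAD, minus everything reachable from target,
# printed oldest first by subject line.
to_copy=$(git rev-list --reverse target..HEAD |
    while read -r c; do git log -1 --format=%s "$c"; done)
echo "$to_copy"

# The rebase copies exactly those commits and puts the copies after B.
git rebase -q target
git log --oneline --graph current
```

Commits A and B never show up in the list, because both are reachable from target.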
So, what git rebase does, in a nutshell, is this:
1. Figure out which branch name HEAD is attached to.
2. List the commits to copy.
3. Detach HEAD from the current branch.
4. Copy the listed commits, one at a time, as if by git cherry-pick.
5. Move the branch name that HEAD was attached to, to where we are now.
6. Re-attach HEAD to the moved branch.
Note that things can go wrong during step 4. In particular, copying a commit as if by git cherry-pick (whether or not this actually uses git cherry-pick) can hit a merge conflict. If so, the rebase stops in the middle, with a detached HEAD. That's why knowing about step 3 is important. But we'll leave that for other question-and-answers (along with the details about whether rebase actually does use cherry-pick itself: sometimes it does, sometimes it fakes it).
We mentioned that the target..HEAD thing above was a white lie: a simplification, meant to make it easier to comprehend which commits get copied. It's time for the truth now.
First, git rebase normally omits merge commits entirely. Any commit that would be generated by the git rev-list above is knocked out if it's a merge (has two or more parents). As long as you have no merge commits in your list, this doesn't matter anyway.
Second, git rebase also omits commits that are patch-ID equivalent to some other commits. This uses the git patch-id program. We won't go into details here, other than to observe that to get the "some other commits" part, Git actually has to use git rev-list target...HEAD, with three dots. This produces a symmetric difference list, of commits reachable from HEAD but not target, and also commits reachable from target but not HEAD. For (much) more about reachability, see Think Like (a) Git. The rebase command then uses git patch-id on each commit in the two lists—which it's generated internally so it knows which commit hash goes with which list—and knocks out those that have matching patch IDs. The effect of this is that if commit B, for instance, is already the same (cherry-pick-wise) as commit D, instead of copying C-D-E, we'll just copy C-E, to get:
          C'-E' <-- current (HEAD)
         /
...--A--B <-- target
      \
       C--D--E [abandoned]
because commits B and D "do the same thing".
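A sketch of this case, under the assumption that B and D each add the same one-line file fix.txt (the file names, identity, temporary directory, and three-argument commit helper are invented for the example). It prints the symmetric difference, compares the two patch IDs, and then rebases:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/target   # name the initial branch explicitly
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: append <line> to <file> and commit it as <message>.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit f.txt A A
git branch current              # current forks off at commit A
commit fix.txt fix B            # B, on target
git checkout -q current
commit g.txt C C
commit fix.txt fix D            # D makes the same change as B
commit g.txt E E

# Three dots: "<" marks commits only on target, ">" commits only on HEAD.
git rev-list --left-right target...HEAD

# git patch-id prints "<patch-id> <commit-id>"; keep the first field.
id_B=$(git show target | git patch-id | cut -d' ' -f1)
id_D=$(git show current~1 | git patch-id | cut -d' ' -f1)
echo "B: $id_B"
echo "D: $id_D"

git rebase -q target
copied=$(git log --format=%s target..current)
echo "$copied"                  # subjects of the copies, newest first
```

Newer Git versions may print a warning about a skipped, previously applied commit during the rebase; that is the patch-ID rule at work.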
Last, and most important to us here, --onto lets us use a different target.
In the example above, we ran:
git rebase target
and target was both our stop argument for git rev-list stop..HEAD and our target, for where Git put the copies. But we can run:
git rebase --onto target stop
and now git rebase will use our stop argument for the stop part of the git rev-list, while continuing to use our target argument for the place the copies go.
So, suppose we're given this now:
...--A--B <-- target
      \
       C <-- another
        \
         D--E <-- current (HEAD)
and we run:
git rebase --onto target another
We've now told Git that the stop argument for our rebase is another, which selects commit C. Our rebase will use git rev-list on another..HEAD, or C..E, which means that the list of commits to copy will consist of just D-E.
That list will get further filtered by the patch-id and no-merges rules, but as long as B isn't the same as D, we'll end up with:
          D'-E' <-- current (HEAD)
         /
...--A--B <-- target
      \
       C <-- another
        \
         D--E [abandoned]
That is, we'll copy only the two commits D-E that are reachable from current, omitting commit C that is reachable from another.
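The same example as a runnable sketch. The scratch directory, identity, file names, and commit helper are assumptions for illustration only:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/target   # name the initial branch explicitly
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: append <message> to <file> and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit f.txt A
git checkout -q -b another      # another forks off at commit A
commit c.txt C
git checkout -q -b current      # current starts at commit C
commit d.txt D
commit d.txt E
git checkout -q target
commit f.txt B
git checkout -q current

# Copies go after target; the stop point is another (commit C).
git rebase -q --onto target another

copied=$(git log --format=%s target..current)
echo "$copied"                  # subjects of the copies, newest first
git log --oneline --graph current another
```

After the rebase, c.txt is gone from the working tree of current, because commit C was on the do-not-copy side.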
Here's your setup at the time you want to do the commit copying:
          F <-- feature/A
         /
...--A--B <-- master
         \
          C--E <-- saved-A
           \
            D <-- feature/B (HEAD)
Note that we've added the name saved-A to remember what not to copy. We don't want to copy commits C and E. We wouldn't copy E anyway, but this is an easy way to remember everything not to copy.
We currently have feature/B checked out (commit D). We don't need to create a name, new-B, so we did not do that. Now we just run:
git rebase --onto feature/A saved-A
Git will now list out the commits to copy: every commit that's on the current branch, feature/B, except for every commit that's on saved-A. So that's commit D.
Git now detaches HEAD, moves to commit F—our --onto target—and copies D to produce D'. That finishes the list of commits to copy, so having successfully copied D to D', Git forcibly moves the name feature/B to point to D' and re-attaches HEAD, giving us:
            D' <-- feature/B (HEAD)
           /
          F <-- feature/A
         /
...--A--B <-- master
         \
          C--E <-- saved-A
           \
            D [abandoned]
which is just what we want.
We can now delete the name saved-A.
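Put together, the whole sequence might look like the sketch below. It assumes a throwaway repository, a placeholder identity, invented file names, a hypothetical commit helper, and a squash done with git reset --soft plus git commit; your own squash may be produced differently.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: append <message> to <file> and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt A
commit base.txt B
git checkout -q -b feature/A
commit a.txt C
git checkout -q -b feature/B
commit b.txt D
git checkout -q feature/A
commit a.txt E

git branch saved-A              # remember commit E before squashing

# Squash C and E into a single commit F on top of master.
git reset -q --soft master
git commit -q -m F

# Copy everything on feature/B that is not on saved-A, and put it after F.
git checkout -q feature/B
git rebase -q --onto feature/A saved-A

# saved-A is not merged into anything, so plain -d would refuse; use -D.
git branch -D saved-A

git log --oneline --graph feature/B
```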
What if you've rebased feature/A already but forgot to save the commit hash ID of commit E somewhere?
Fortunately, you don't have to have saved the hash ID of either E or C. You can:
run git log, or run git reflog to find the hash IDs that feature/A used to name.
Raw hash IDs work, so you can just run:
git rebase --onto feature/A <hash-ID-of-E-or-C>
after you find the hash ID. (Use cut-and-paste or similar to get the hash ID right; typing it, or even a unique prefix of it, in by hand is a recipe for errors.)
Reflog names also work, so pretty often you can do:
git rebase --onto feature/A feature/A@{1}
where feature/A@{1} is the reflog name you'd see for the hash ID of commit E, when you run git reflog feature/A to list out the previous hash IDs for feature/A. (feature/A@{2} probably names commit C, so that would also work.)
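Here's a sketch of the reflog variant. It assumes the squash moved feature/A in a single step (here, a squash-merge made on a detached HEAD followed by git branch -f), so that feature/A@{1} really is commit E. A squash done with git reset --soft plus git commit moves the branch twice, which pushes E down to a higher-numbered entry, so check git reflog feature/A first. The scratch directory, identity, file names, and commit helper are invented for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: append <message> to <file> and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt A
commit base.txt B
git checkout -q -b feature/A
commit a.txt C
git checkout -q -b feature/B
commit b.txt D
git checkout -q feature/A
commit a.txt E
old_tip=$(git rev-parse feature/A)   # commit E, kept only to check the result

# Build the squashed commit F off to the side, then move feature/A once.
git checkout -q --detach master
git merge -q --squash feature/A
git commit -q -m F
git branch -f feature/A HEAD

git reflog feature/A                 # the second entry should show commit E

git checkout -q feature/B
git rebase -q --onto feature/A 'feature/A@{1}'
git log --oneline --graph feature/B
```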
The key is to find the commit(s) you want to omit, and use those with git rebase --onto. Set the target based on where the copies should go, and set the stop-point—the thing that the git rebase documentation calls the upstream argument—to a hash ID that stops out the commits you don't want copied.
If your squashed commit has the same patch ID as the original commit, git rebase's rule of leaving out commits that have matching patch IDs will do all the work for you. This will generally only happen if you had just one commit that got squash-merged to some other branch.
The --onto trick always works, so you don't really need to worry about this case, but if it happens a lot, it's nice to know.
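For completeness, here's a sketch of that easy case: feature/A has a single commit C, it gets squash-merged into master, and a plain git rebase master on feature/B then leaves C out by itself. The scratch repository, identity, file names, commit helper, and the "C (squashed)" message are assumptions for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: append <message> to <file> and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt A
commit base.txt B
git checkout -q -b feature/A
commit a.txt C                  # the only commit on feature/A
git checkout -q -b feature/B
commit b.txt D

# Squash-merge feature/A into master: a new commit with the same diff as C.
git checkout -q master
git merge -q --squash feature/A
git commit -q -m "C (squashed)"

# No --onto needed: C matches the squashed commit by patch ID and is left out.
git checkout -q feature/B
git rebase -q master

copied=$(git log --format=%s master..feature/B)
echo "$copied"
```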