I'm a bit puzzled by this...
I have two branches that have the same series of commits in both of them.
The true history is that they were authored by my colleague, committed, and pushed to GitHub on branch A. At some stage I merged branch A into my branch B.
Git now appears to show his commits on branch A with their original hashes, and the same commits on my (diverged) branch with me as author and a different set of hashes, intermingled with the work I was doing on my branch.
This feels like some sort of rebase issue (we both use GitHub for Windows some of the time, which rebases as part of sync), but I'm not aware of any problem being reported to either of us.
Any ideas on what caused this, or how to get it straight would be appreciated.
Yes, two branches can point to the same commits.
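The SHA-1 of a commit is a hash of everything the commit object records: the tree, the parent commit(s), the author and committer (name, email and timestamp), and the message. Because the hash is derived from that content, a commit can't be changed in place. If you change any of that data, you get a new commit with a new SHA-1.
If two distinct objects had the same hash, that would be a collision. Git could store only one of the colliding pair, and when following a link to the colliding hash name it couldn't know which object the name was meant to point to. That is not your situation, though: your commits have different hashes, which means something in their recorded data differs.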
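To see how sensitive the hash is, here is a minimal sketch in a throwaway repository. The identity `Alice`, the message `mb` and the timestamps are made-up values for illustration. `git commit-tree` writes a commit object directly, so the only thing that varies between the calls is the committer date.

```shell
set -e
# Throwaway repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity and author date, fixed so the result is reproducible.
export GIT_AUTHOR_NAME="Alice" GIT_AUTHOR_EMAIL="alice@example.com"
export GIT_COMMITTER_NAME="Alice" GIT_COMMITTER_EMAIL="alice@example.com"
export GIT_AUTHOR_DATE="1470065063 +0200"

# With an empty index, write-tree stores and returns the empty tree.
tree=$(git write-tree)

# Same tree, message, author and (no) parent; only the committer date differs.
c1=$(GIT_COMMITTER_DATE="1470128045 +0200" git commit-tree -m "mb" "$tree")
c2=$(GIT_COMMITTER_DATE="1470129095 +0200" git commit-tree -m "mb" "$tree")
# Repeating the first call with identical inputs gives the first hash again.
c3=$(GIT_COMMITTER_DATE="1470128045 +0200" git commit-tree -m "mb" "$tree")

echo "c1=$c1"
echo "c2=$c2"
echo "c3=$c3"
```

`c1` and `c3` come out identical, while `c2` differs even though the tree, message and author are the same.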
You should get some power tool (plain gitk should do just fine) and closely inspect the matching commits that differ only in their hashes: look for differences in the Author, Committer and Date fields. Also compare the hashes of the parent commits, because a commit object also records the hashes of its parents, so otherwise identical commits that refer to different parent SHA-1 names will themselves be different.
Also, could you elaborate on how precisely your commits are "intermingled" with those authored by your peer? Do all those commits form a linear history, or are there merge points?
The former would indicate that rebasing was used.
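If you prefer the command line to gitk, the same inspection can be sketched as follows. The branch names `A` and `B`, the file names and both identities are assumptions set up in a scratch repository. `git cherry` marks commits on `B` whose patch already exists on `A` with `-`, and the `git show` format prints the fields worth comparing.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/A

# Hypothetical peer identity for the commits on branch A.
export GIT_AUTHOR_NAME="Peer" GIT_AUTHOR_EMAIL="peer@example.com"
export GIT_COMMITTER_NAME="Peer" GIT_COMMITTER_EMAIL="peer@example.com"
echo base > base.txt; git add base.txt; git commit -q -m "a"
echo peer > peer.txt; git add peer.txt; git commit -q -m "peer work"

# Branch B: own work first, then a rewritten copy of the peer commit.
export GIT_AUTHOR_NAME="Me" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="Me" GIT_COMMITTER_EMAIL="me@example.com"
git checkout -q -b B A~1
echo mine > mine.txt; git add mine.txt; git commit -q -m "my work"
git cherry-pick A >/dev/null

# '-' = an equivalent patch is already on A, '+' = the patch exists only on B.
marks=$(git cherry -v A B)
echo "$marks"

# Fields to compare for a suspicious pair: identities, dates and parents.
fmt='%h %s%n  author:    %an %ad%n  committer: %cn %cd%n  parents:   %p'
git show -s --format="$fmt" A B
```

In this setup the two "peer work" commits show the same author but different committers and different parents, which is enough to give them different hashes.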
With the information available so far I would do this:
Stop using "GitHub for Windows", as no-brainer solutions tend to create situations like the one you're facing right now: when something breaks, you have no idea why it broke or how to fix it.
Get "regular" Git for Windows (and maybe Git Extensions if you want a fancy GUI which does not try to outsmart the user).
Save your current feature branch away by forking another branch off it.
(Hard-)reset your feature branch to that of your peer.
Cherry-pick your changes from oldest to the newest from the branch you saved.
This might create conflicts (since these commits will be applied on top of a different state of the code than the one they were originally created on).
As a result you will have a branch which has no "spuriously same" commits.
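The backup, reset and cherry-pick steps above can be sketched like this. Everything is set up in a scratch repository; the branch names `A`, `B` and `B-backup`, the files and the identities are hypothetical, and on a real repository only the three numbered steps at the end would apply, run against your own branches. `git rev-list --cherry-pick --right-only` is one way (an assumption, not something the steps above prescribe) to skip commits whose patch is already on `A`.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/A

# Peer's branch A (hypothetical identity).
export GIT_AUTHOR_NAME="Peer" GIT_AUTHOR_EMAIL="peer@example.com"
export GIT_COMMITTER_NAME="Peer" GIT_COMMITTER_EMAIL="peer@example.com"
echo base > base.txt; git add base.txt; git commit -q -m "a"
echo peer > peer.txt; git add peer.txt; git commit -q -m "peer work"

# My branch B: own commits with a duplicated peer commit in between.
export GIT_AUTHOR_NAME="Me" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="Me" GIT_COMMITTER_EMAIL="me@example.com"
git checkout -q -b B A~1
echo one > mine1.txt; git add mine1.txt; git commit -q -m "my work 1"
git cherry-pick A >/dev/null
echo two > mine2.txt; git add mine2.txt; git commit -q -m "my work 2"

# 1. Save the current state of the feature branch.
git branch B-backup B
# 2. Hard-reset the feature branch to the peer's branch.
git reset -q --hard A
# 3. Re-apply only my own commits, oldest first, skipping patches already on A.
for c in $(git rev-list --reverse --topo-order --no-merges \
             --cherry-pick --right-only A...B-backup); do
  git cherry-pick "$c" >/dev/null
done

git log --oneline --graph B
```

A real history may stop with conflicts during step 3; resolve them and run `git cherry-pick --continue` before moving on to the next commit.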
Then both you and your peer should read up on merging and rebasing workflows, adopt one of them, and then, when working on feature branches, merge or rebase deliberately, understanding why you're doing it and what happens as a result. I would advise you not to blindly rely on a tool to do the Right Thing™.
If git rebase is part of your workflow, then what you describe is common. For example:
$ git log --graph --oneline --all
* 76af430 fc           # branch: foo
| * 7c495ad mb         # branch: bar, master
|/  
* 74cbb35 a
$ git rebase foo       # while on branch master
First, rewinding head to replay your work on top of it...
Applying: mb
$ git log --graph --oneline --all
* 6810e67 mb           # branch: master
* 76af430 fc           # branch: foo
| * 7c495ad mb         # branch: bar
|/  
* 74cbb35 a
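The situation in the log above can be reproduced in a scratch repository. This is a sketch under assumptions: the file names and the identity are invented, and the hashes will not match the ones shown. It also uses `git patch-id`, which hashes only the diff, to confirm that the two `mb` commits carry the same change despite having different commit hashes.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical identity.
export GIT_AUTHOR_NAME="Me" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="Me" GIT_COMMITTER_EMAIL="me@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "a"
git branch foo
echo mb > mb.txt; git add mb.txt; git commit -q -m "mb"
git branch bar                  # bar keeps pointing at the original mb

git checkout -q foo
echo fc > fc.txt; git add fc.txt; git commit -q -m "fc"

git checkout -q master
git rebase -q foo               # replays mb on top of fc as a new commit

old=$(git rev-parse bar)
new=$(git rev-parse master)
# The patch-id ignores parents, dates and identities; it covers the diff only.
old_patch=$(git show "$old" | git patch-id | cut -d' ' -f1)
new_patch=$(git show "$new" | git patch-id | cut -d' ' -f1)

echo "commit hashes: $old vs $new"
echo "patch ids:     $old_patch vs $new_patch"
git log --graph --oneline --all
```

The commit hashes differ because the rebased `mb` has `fc` as its parent, while the patch ids are equal.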
I ran into this same-diff, different-hash issue after rebasing two branches that were stacked in series. The same commit was applied on top of the same fork point, so I was puzzled to see the branches diverge. Even the patch-id was the same.
Diffing the raw commit objects revealed that the committer time was the only difference:
$ diff <(git show --format=raw $COMMIT1) \
       <(git show --format=raw $COMMIT2)
1c1
< commit $COMMIT1
---
> commit $COMMIT2
5c5
< committer $ME <$EMAIL> 1470128045 +0200
---
> committer $ME <$EMAIL> 1470129095 +0200
Redoing the rebase with git rebase --committer-date-is-author-date fixed some of the divergence, but not all.
(I'm not sure why. I think the divergence happened at the first rerere merge.)
I then used filter-branch as a sledgehammer:
git filter-branch --env-filter \
  'export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"' \
  origin/master..HEAD
This was enough to keep the series in a line:
$ git show --format=raw HEAD | grep -E 'author|committer'
author $ME <$EMAIL> 1470065063 +0200
committer $ME <$EMAIL> 1470065063 +0200
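To check a whole range rather than a single commit, here is a small sketch. `%at` and `%ct` are the author and committer timestamps, and any commit where they differ is printed. The scratch repository, identity and dates are made up for the demonstration; on a real branch you would add a range such as `origin/master..HEAD` to the `git log` call.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity.
export GIT_AUTHOR_NAME="Me" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="Me" GIT_COMMITTER_EMAIL="me@example.com"

# First commit: author and committer dates are equal.
echo one > f.txt; git add f.txt
GIT_AUTHOR_DATE="1470065063 +0200" GIT_COMMITTER_DATE="1470065063 +0200" \
  git commit -q -m "dates match"

# Second commit: the committer date is later, as after a rebase.
echo two > f.txt; git add f.txt
GIT_AUTHOR_DATE="1470128045 +0200" GIT_COMMITTER_DATE="1470129095 +0200" \
  git commit -q -m "dates differ"

# Print only the commits whose two timestamps disagree.
mismatched=$(git log --format='%at %ct %s' | awk '$1 != $2')
echo "$mismatched"
```

An empty result for the range means every commit in it has matching author and committer dates.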