
Will git store diffs of binary files that change in content, but never change size?

Tags: git

I am interested in storing an EEPROM HEX file of fixed size in git. The files will NEVER change size, but they will change content frequently.

If I add an EEPROM file to git, commit it, and then change a few bytes in the file, will git store this change efficiently over dozens or hundreds of commits?

In my research on this issue, I've run across some thorough discussions on the topic, but most of them seem to deal with files like PDFs and MP3s, which nobody expects to stay the same size or to be comparable in a diff. I wonder if EEPROM HEX files would be treated differently, since the file size stays the same?

EDITED (again)

Some initial observations... (Kudos to Krumelur for the "just try it" encouragement!)

The file that I am testing is a 7MB Intel HEX file. Based on the output from git, it appears to treat this file as a text file:

$ git commit -m "Changed a single byte."
[master bc2958b] Changed a single byte.
1 file changed, 1 insertion(+), 1 deletion(-)

The diff output matches as well:

$ git show bc2958b
commit bc2958b[...]
Author: ThoughtProcess <[email protected]>
Date:   Wed Jul 31 11:53:41 2013 -0500

    Changed a single byte.

diff --git a/test.hex b/test.hex
index fbdeed4..04d19b6 100644
--- a/test.hex
+++ b/test.hex
@@ -58,7 +58,7 @@
 :20470000000000000000000000000000000000000000000000000000E001EDD0D9310D00E4
 :20472000400200000080000000000000000000000000000000000000E002EDD0CF310D000B
 :20474000400200000080000000000000000000000000000000000000E0036D0063040D00D3
-:2047600040020000008000000000000000000000000000000000000000A0FF2F06801B0FF9
+:2047600040020000008000000000000000000000000000000000000000A0FF2G06801B0FF9
 :2047800000E01D007A00820F3CFB000000000000000000000000000000A0FF8F06801B1FEC
 :2047A00000E01D006A00821F3CFB000000000000000000000000000000A0FF6F06801B8F7C
 :2047C00000E01D005A00821F3CFB000000000000000000000000000000A0FF8F06801BDFFC

After 7 commits, the repository size is now 21MB. Here's the strange thing: the repository seems to grow roughly linearly, by about 2MB with each commit. Is that simply how git is designed to work? Or is it not storing the incremental differences as text like I'd expect?

ThoughtProcess asked Mar 23 '23 23:03


1 Answer

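git is actually storing a new full copy of your file (zlib-compressed, as a "loose" object) under .git/objects for every committed version, so your repository does indeed grow linearly at first. The fact that the file never changes size makes no difference here. You can run git gc to make git pack the repository: packing stores similar objects as deltas against one another. In the case of your data, where only a few bytes change between versions, git should be able to pack really efficiently, and your repository should get much smaller. (git will also automatically run git gc occasionally.)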
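To illustrate why every commit adds a full copy before packing, here is a minimal Python sketch of how a loose blob object is built. It assumes a repository using the default SHA-1 object format; the helper name loose_object and the sample records are hypothetical stand-ins, not the asker's actual file.

```python
import hashlib
import zlib
from typing import Tuple


def loose_object(content: bytes) -> Tuple[str, bytes]:
    """Return (object id, compressed bytes) for a blob, loose-object style.

    Assumes the SHA-1 object format: the id is the SHA-1 of a
    "blob <size>\\0" header followed by the content, and the bytes
    written to disk are the zlib-compressed header plus content.
    """
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


# Hypothetical stand-in for an Intel HEX file: 1000 fixed-width records.
record = b":20470000" + b"00" * 32 + b"E4\n"
v1 = record * 1000

# Change a single character; the file size stays exactly the same.
edited = bytearray(v1)
edited[500] = ord("F")
v2 = bytes(edited)

id1, z1 = loose_object(v1)
id2, z2 = loose_object(v2)

print("same size:", len(v1) == len(v2))
print("same object id:", id1 == id2)
print("compressed sizes:", len(z1), len(z2))
```

A one-character edit produces a different object id, so the second version is written out as a separate, complete compressed object. Nothing at this stage is stored as a difference against the first version; that only happens when the objects are packed. In a real repository, comparing the output of git count-objects -vH before and after git gc should show the effect.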

Sampo Smolander answered Apr 07 '23 18:04