I want a bare, shallow clone of a git repo with no file contents, as all I'm interested in are the file paths themselves. This works great:
$ git clone --bare --depth=1 --filter=blob:none --branch="118.0.5977.1" "https://github.com/chromium/chromium.git"
Cloning into bare repository 'chromium.git'...
remote: Enumerating objects: 34624, done.
remote: Counting objects: 100% (34624/34624), done.
remote: Compressing objects: 100% (25673/25673), done.
remote: Total 34624 (delta 1647), reused 21869 (delta 1304), pack-reused 0
Receiving objects: 100% (34624/34624), 13.72 MiB | 16.20 MiB/s, done.
Resolving deltas: 100% (1647/1647), done.
It completes in about 3 seconds, and takes up only 15 MiB on disk. I can get the paths with git ls-tree -r HEAD.
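Listing paths only needs commit and tree objects, which a blob:none clone still contains, so no blobs are required for it. Here is a minimal sketch on a throwaway local repository (the temporary directory, file names and demo identity are made up for illustration, not taken from Chromium); it adds --name-only to drop the mode, type and object ID columns:

```shell
# Hypothetical throwaway repository, used only to illustrate the command.
demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/src"
echo 'hello' > "$demo/README"
echo 'int main(void) { return 0; }' > "$demo/src/main.c"
git -C "$demo" add .
git -C "$demo" -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# -r recurses into subdirectories; --name-only prints just the paths.
paths=$(git -C "$demo" ls-tree -r --name-only HEAD)
printf '%s\n' "$paths"
```

In the bare Chromium clone the same listing would be git ls-tree -r --name-only HEAD, run from inside chromium.git.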
However, various git commands seem to want to fetch additional data from the remote repo. For example:
$ cd chromium.git
$ git log
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (1/1), 372 bytes | 372.00 KiB/s, done.
commit 58a2c380702a84b362d0ee74ffc1e53e937770dd (grafted, HEAD, tag: 118.0.5977.1)
...
Can I tell git not to do this? I would prefer the command to fail rather than fetch any additional data from the remote.
I found the relevant documentation at https://www.git-scm.com/docs/partial-clone/. In particular,
Since almost all Git code currently expects any referenced object to be present locally and because we do not want to force every command to do a dry-run first, a fallback mechanism is added to allow Git to attempt to dynamically fetch missing objects from promisor remotes.
When the normal object lookup fails to find an object, Git invokes promisor_remote_get_direct() to try to get the object from a promisor remote and then retry the object lookup. This allows objects to be "faulted in" without complicated prediction algorithms.
For efficiency reasons, no check as to whether the missing object is actually a promisor object is performed.
Dynamic object fetching tends to be slow as objects are fetched one at a time.
...
Remotes that are considered "promisor" remotes are those specified by the following configuration variables:
extensions.partialClone = <name>
remote.<name>.promisor = true
remote.<name>.partialCloneFilter = ...
So if there is a "promisor" remote, git will automatically fetch missing objects from it. To make it not a promisor, all I have to do is
$ git config --unset remote.origin.promisor
$ git config --unset remote.origin.partialclonefilter
and it now gives errors like I want it to:
$ git log
error: unable to read mailmap object at HEAD:.mailmap
commit 58a2c380702a84b362d0ee74ffc1e53e937770dd (grafted, HEAD, tag: 118.0.5977.1)
...
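To double-check that both keys are really gone, the remaining remote.origin settings can be listed. This is a rough sketch on a scratch bare repository; the temporary path and the example.invalid URL are placeholders I made up, and the two promisor keys are set by hand here to mimic what the partial clone wrote:

```shell
# Hypothetical scratch repository standing in for chromium.git.
repo=$(mktemp -d)
git init -q --bare "$repo"
git -C "$repo" config remote.origin.url "https://example.invalid/repo.git"
git -C "$repo" config remote.origin.promisor true
git -C "$repo" config remote.origin.partialCloneFilter blob:none

# The two commands from above (config variable names are case-insensitive).
git -C "$repo" config --unset remote.origin.promisor
git -C "$repo" config --unset remote.origin.partialclonefilter

# List what is left under remote.origin; only the URL should remain.
remaining=$(git -C "$repo" config --get-regexp '^remote\.origin\.')
printf '%s\n' "$remaining"
```

If promisor or partialclonefilter still shows up in that listing, the remote is still treated as a promisor.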
You've done a clone which is not only shallow (--depth=1), but partial (--filter=blob:none). According to the documentation, “[u]se of partial clone requires that the user be online and the origin remote or other promisor remotes be available for on-demand fetching of missing objects.”
If you don't want to depend on being online, keep the shallow option but re-clone without any --filter option. That downloads the full contents of just a single revision, so there are no missing objects left to fetch on demand.
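As a small illustration of a shallow clone without a filter, here is a sketch that uses a local two-commit repository as a stand-in for the real remote (the paths, file name and demo identity are invented for the example). For Chromium the equivalent would be the original clone command minus --filter=blob:none, which will download noticeably more data since every blob of that revision is included.

```shell
# Hypothetical source repository with two commits.
src=$(mktemp -d)
git init -q "$src"
echo 'one' > "$src/file.txt"
git -C "$src" add file.txt
git -C "$src" -c user.name=demo -c user.email=demo@example.com commit -q -m 'first'
echo 'two' > "$src/file.txt"
git -C "$src" -c user.name=demo -c user.email=demo@example.com commit -q -am 'second'

# Shallow, bare, no --filter. The file:// URL is needed because
# --depth is ignored for plain local-path clones.
dst=$(mktemp -d)/shallow.git
git clone -q --bare --depth=1 "file://$src" "$dst"

commits=$(git -C "$dst" rev-list --count HEAD)          # history is cut to one commit
content=$(git -C "$dst" cat-file -p HEAD:file.txt)      # the blob is present locally
promisor=$(git -C "$dst" config --get remote.origin.promisor || true)
echo "commits=$commits content=$content promisor=${promisor:-unset}"
```

With no promisor remote configured, reading file contents works offline and nothing is fetched lazily.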