
How do I delete a single file from a tar.gz archive?

Tags: gzip, tar

I have a huge tarball with an excessively large or corrupt error_log that causes extraction to hang. Using the Mac OS X terminal, is there a way to remove this file from the archive before unpacking, or to extract the archive while skipping that specific file?

I found this post on how to efficiently remove files from a large tgz. However, when I tried the --delete flag, I received this error:

tar: Option --delete is not supported

Is there a way to:

  1. remove the file from the archive without unzipping it?
  2. extract the archive but exclude the file?
KillerDesigner asked Jun 21 '15 08:06




4 Answers

As mentioned in the comments, the tar that ships with Mac OS X (bsdtar) cannot remove a file from an archive, but you can exclude the file when extracting:

tar -zxvf file.tar.gz --exclude "file_to_exclude"
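
The exclude pattern has to match the member path as it is stored in the archive, so it can help to list the contents first. The following is a minimal sketch on a throwaway archive; every name in it (demo/, keep.txt, logs/error_log) is hypothetical and only stands in for your own files.

```shell
# Build a small throwaway archive so the commands can be tried safely.
# All file and directory names here are made up for the demo.
workdir=$(mktemp -d)
mkdir -p "$workdir/demo/logs" "$workdir/out"
echo "keep me" > "$workdir/demo/keep.txt"
echo "huge log" > "$workdir/demo/logs/error_log"
tar -czf "$workdir/demo.tar.gz" -C "$workdir" demo

# List the members first to see the exact path of the unwanted file.
tar -tzf "$workdir/demo.tar.gz"

# Extract everything except that member into a separate directory.
tar -xzf "$workdir/demo.tar.gz" -C "$workdir/out" --exclude 'demo/logs/error_log'
```

Listing with -t only reads member headers and writes nothing to disk, so it is a cheap way to confirm the path before extracting.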
msfoster answered Oct 15 '22 22:10


You can repackage it like this:

tar -czvf ./new.tar.gz --exclude='._*' @old.tar.gz

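I used ._* to remove all ._ files, but you can use any pattern you like, including a full path, directory, or filename. Note that the @old.tar.gz argument, which reads entries from an existing archive, is a bsdtar feature (bsdtar is the tar that ships with Mac OS X).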
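
If your tar does not accept the @archive form, a rough equivalent is to unpack into a staging directory with the exclusion applied and then pack that directory again. This is a different technique from the one-step repackaging above and needs enough free disk space for the unpacked files. The names used (site/, index.html, error_log, stage/) are hypothetical.

```shell
# Throwaway archive for the demo; all names are made up.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/site" "$workdir/stage"
echo "page" > "$workdir/src/site/index.html"
echo "noise" > "$workdir/src/site/error_log"
tar -czf "$workdir/old.tar.gz" -C "$workdir/src" site

# Step 1: unpack everything except the unwanted member into a staging directory.
tar -xzf "$workdir/old.tar.gz" -C "$workdir/stage" --exclude 'site/error_log'

# Step 2: pack the staging directory into a new archive.
tar -czf "$workdir/new.tar.gz" -C "$workdir/stage" site

# Show what ended up in the new archive.
tar -tzf "$workdir/new.tar.gz"
```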

voices answered Oct 15 '22 20:10


I did that in three steps. Hopefully this will help others in the future.

gzip -d file.tar.gz
tar -f file.tar --delete folder1/file1.txt --delete folder2/file2.txt
gzip -9 file.tar

If you have multiple archives, use this loop. Every archive must contain all the files you want to delete, or tar will report an error.

for f in *.tar.gz
do
        echo "Processing file $f"
        gzip -d "$f"
        tar -f "${f%.*}" --delete folder1/file1.txt --delete folder2/file2.txt
        gzip -9 "${f%.*}"
done
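
One way to sidestep the missing-member error is to check the listing before deleting. The sketch below assumes GNU tar (needed for --delete); delete_if_present is a hypothetical helper name, and the demo paths are made up.

```shell
# delete_if_present ARCHIVE.tar MEMBER
# Deletes MEMBER from an uncompressed tar archive only when it is listed,
# so an archive that lacks the member is skipped instead of failing.
# Assumes GNU tar, since --delete is not available in every tar.
delete_if_present() {
    if tar -tf "$1" | grep -Fxq -- "$2"; then
        tar -f "$1" --delete "$2"
    else
        echo "Skipping $1: no member named $2" >&2
    fi
}

# Demo on a throwaway archive.
workdir=$(mktemp -d)
mkdir -p "$workdir/folder1"
echo a > "$workdir/folder1/file1.txt"
echo b > "$workdir/folder1/other.txt"
tar -cf "$workdir/file.tar" -C "$workdir" folder1/file1.txt folder1/other.txt

delete_if_present "$workdir/file.tar" folder1/file1.txt   # listed, so it is deleted
delete_if_present "$workdir/file.tar" folder2/file2.txt   # not listed, so it is skipped
tar -tf "$workdir/file.tar"
```

Inside the loop, the helper would be called once per member between the gzip -d and gzip -9 steps.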
Ninel answered Oct 15 '22 20:10


I wanted to remove the jdk directory from the elasticsearch-oss archive with a one-liner, and this is what I came up with (the pattern is quoted so the shell does not expand it before tar sees it):

gzip -d elasticsearch-oss-7.10.1-linux-x86_64.tar.gz -c | tar --delete --wildcards '*/jdk' | gzip - > /tmp/tmp.$$.tar.gz && mv /tmp/tmp.$$.tar.gz elasticsearch-oss-7.10.1-linux-x86_64.tar.gz

I further refined this to include the download:

curl -Ss https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-oss-7.10.1-linux-x86_64.tar.gz | gzip -d - -c | tar --delete --wildcards '*/jdk' | gzip - > elasticsearch-oss-7.10.1-linux-x86_64.tar.gz

Works a treat on Ubuntu 20.04, which ships GNU tar. GNU tar does not support the @ sign syntax.
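
The same streaming idea can be tried locally on a throwaway archive, with a check before the original is replaced. This is a sketch assuming GNU tar, where --delete with -f - reads the archive from standard input and writes the result to standard output. The archive and directory names (app-1.0/, jdk/, config/) are hypothetical.

```shell
# Throwaway archive for the demo; all names are made up.
workdir=$(mktemp -d)
mkdir -p "$workdir/app-1.0/jdk/bin" "$workdir/app-1.0/config"
echo java > "$workdir/app-1.0/jdk/bin/java"
echo conf > "$workdir/app-1.0/config/app.yml"
tar -czf "$workdir/app.tar.gz" -C "$workdir" app-1.0

archive="$workdir/app.tar.gz"
tmp="$archive.tmp"

# Stream: decompress, drop the jdk directory, recompress into a temporary file.
# The original is replaced only if the final gzip stage succeeded and the
# result passes gzip -t. A failure in an earlier pipeline stage is not
# detected by this check alone.
if gzip -dc "$archive" | tar -f - --delete --wildcards '*/jdk' | gzip > "$tmp" \
   && gzip -t "$tmp"; then
    mv "$tmp" "$archive"
else
    rm -f "$tmp"
    echo "Repack failed; original left untouched" >&2
fi

# Show what is left in the archive.
tar -tzf "$archive"
```

Writing next to the original rather than into /tmp keeps the final mv on the same filesystem, which is a design choice of this sketch and not part of the one-liner above.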

Jason Pell answered Oct 15 '22 20:10