bash: extract only part of tar.gz archive

Tags: archive, tar

I have a very large .tar.gz file which I can't extract all together because of lack of space. I would like to extract half of its contents, process them, and then extract the remaining half.

The archive contains several subdirectories, which in turn contain files. When I extract a subdirectory, I need all its contents to be extracted with it.

What's the best way of doing this in bash? Does tar already allow this?

Ricky Robinson asked Oct 30 '25 00:10


2 Answers

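You can also extract members one at a time. Put -C before the member path, because GNU tar applies -C only to the names that follow it, and make sure DESTINATION/dir already exists:

tar -zxvf file.tar.gz -C DESTINATION/dir PATH/to/file/inside_archive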
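
The same form works for a whole subdirectory, which is what the question asks for: naming a directory member extracts everything below it. Here is a minimal sketch, assuming GNU tar; the archive and the names sample.tar.gz, data/dirA and out are made up for the demonstration.

```shell
#!/bin/bash
# Build a small sample archive in a scratch directory (all names are hypothetical).
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p data/dirA/nested data/dirB
echo a > data/dirA/a.txt
echo n > data/dirA/nested/n.txt
echo b > data/dirB/b.txt
tar -czf sample.tar.gz data
rm -rf data

# The destination must exist before tar can change into it.
mkdir -p out

# Naming a directory member extracts it recursively; -C comes before the member name.
tar -xzf sample.tar.gz -C out data/dirA

# Show what was extracted: only files below data/dirA should appear.
find out -type f | sort
```

The extracted files keep their archive path below out, so data/dirA/a.txt ends up at out/data/dirA/a.txt.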

You can include a script around this:

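1) Members are extracted under DESTINATION with the same relative PATH they have in the archive (you can use your own base directory for DESTINATION).

2) You can get the paths of the files inside the archive using

tar -ztvf file.tar.gz

3) You can use a for loop such as for files in $(tar -ztvf file.tar.gz | awk '{print $NF}') and define a break condition as required. Taking the last field of the verbose listing only works if the member names contain no whitespace.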
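
To split the work into two halves by subdirectory, the listing from step 2 can be reduced to the unique subdirectories and cut in the middle. This is a rough sketch, assuming the archive has one top-level directory with the subdirectories below it; sample.tar.gz, dirs.txt, first_half.txt and second_half.txt are hypothetical names.

```shell
#!/bin/bash
# Build a sample archive with four subdirectories (hypothetical layout).
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p data/d1 data/d2 data/d3 data/d4
for d in d1 d2 d3 d4; do echo "$d" > "data/$d/file.txt"; done
tar -czf sample.tar.gz data
rm -rf data

# -t without -v prints bare member names; keep the first two path components
# and drop duplicates to get one line per subdirectory.
tar -tzf sample.tar.gz | awk -F/ '$2 != "" {print $1 "/" $2}' | sort -u > dirs.txt

# Cut the list in the middle (the first half gets the extra entry if the count is odd).
total=$(wc -l < dirs.txt)
half=$(( (total + 1) / 2 ))
head -n "$half" dirs.txt > first_half.txt
tail -n +"$((half + 1))" dirs.txt > second_half.txt

cat first_half.txt
echo ---
cat second_half.txt
```

Splitting by count does not balance disk usage; if the subdirectories differ a lot in size, the sizes in the tar -ztvf listing could be used to pick the cut point instead.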

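I would do something like:

#!/bin/bash
# -t without -v prints bare member names, one per line.
tar -ztf file.tar.gz | while IFS= read -r file
do
    case $file in */) continue ;; esac   # skip directory entries
    subDir=$(dirname "$file")
    echo "$subDir"
    # Members keep their archive path below -C, so point it at the base directory.
    mkdir -p ./My_localDir
    tar -C ./My_localDir -zxvf file.tar.gz "$file"
done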

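$subDir holds the subdirectory of the member currently being extracted.

Add a break condition to the loop according to your requirements. Keep in mind that every tar call rereads the compressed archive from the start, so this gets slow for a large archive with many files.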
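
If the per-file loop is too slow, a whole batch of subdirectories can be extracted in a single pass by giving tar a list file with -T (--files-from). This is only a sketch, assuming GNU tar; the list files and the "processing" step (recording file names in processed.log) are placeholders.

```shell
#!/bin/bash
# Build a sample archive with four subdirectories (hypothetical layout).
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p data/d1 data/d2 data/d3 data/d4
for d in d1 d2 d3 d4; do echo "$d" > "data/$d/file.txt"; done
tar -czf sample.tar.gz data
rm -rf data

# One list file per batch, one member name per line (hand-written here).
printf '%s\n' data/d1 data/d2 > first_half.txt
printf '%s\n' data/d3 data/d4 > second_half.txt

for list in first_half.txt second_half.txt; do
    mkdir -p out
    # One pass over the archive extracts every directory named in the list.
    tar -xzf sample.tar.gz -C out -T "$work/$list"
    # Placeholder for the real processing step: record the extracted files.
    find out -type f | sort >> processed.log
    # Free the space before the next batch is extracted.
    rm -rf out
done

cat processed.log
```

Each batch still costs one full read of the archive, but that is two reads in total instead of one per file.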

PradyJord answered Nov 01 '25 14:11


For example, you can extract only the files that match a pattern:

tar -xvzf largefile.tar.gz --wildcards --no-anchored '*.html'

So, depending on the structure of largefile.tar.gz, you can extract the files matching one pattern, process them, delete them, then extract the files matching another pattern, and so on.
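
That cycle could look roughly like the sketch below. It assumes GNU tar (the --wildcards and --no-anchored options), and the sample archive, the two patterns and the counting step that stands in for real processing are all made up.

```shell
#!/bin/bash
# Build a sample archive with two kinds of files (hypothetical layout).
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p site/sub
echo one > site/a.html
echo two > site/sub/b.html
echo three > site/notes.txt
tar -czf sample.tar.gz site
rm -rf site

for pattern in '*.html' '*.txt'; do
    mkdir -p out
    # Quote the pattern so the shell passes it to tar unexpanded.
    tar -xzf sample.tar.gz -C out --wildcards --no-anchored "$pattern"
    # Placeholder for the real processing step: count the extracted files.
    count=$(find out -type f | wc -l | tr -d ' ')
    echo "$pattern $count" >> processed.log
    # Delete the batch before extracting the next pattern.
    rm -rf out
done

cat processed.log
```

Note that patterns split the archive by file name, not by subdirectory, so this only fits the original question if the subdirectories can be told apart by a pattern such as 'dirA/*'.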