I have a directory with a structure like so:
.
├── Test.txt
├── Test1
│ ├── Test1.txt
│ ├── Test1_copy.txt
│ └── Test1a
│ ├── Test1a.txt
│ └── Test1a_copy.txt
└── Test2
├── Test2.txt
├── Test2_copy.txt
└── Test2a
├── Test2a.txt
└── Test2a_copy.txt
I would like to create a bash script that makes an MD5 checksum of every file in this directory. I want to be able to type the script name in the CLI, followed by the path to the directory I want to hash, and have it work. I'm sure there are many ways to accomplish this. Currently I have:
#!/bin/bash
for file in "$1" ; do
md5 >> "${1}__checksums.md5"
done
This just hangs and is not working. Perhaps I should use find?
One caveat - the directories I want to hash will have files with different extensions and may not always have this exact same tree structure. I want something that will work in these different situations, as well.
Now that we can get a list with all of our files, our next steps are: Run the md5sum command on every file in that list. Create a string that contains the list of file paths along with their hashes. And finally, run md5sum on this string we just created to obtain a single hash value.
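The three steps above can be sketched roughly like this. The function name dir_md5 is made up for illustration, and the sketch assumes GNU md5sum is available and that file names do not contain newlines:

```shell
#!/bin/bash
# dir_md5 is a hypothetical helper name, not part of md5sum.
# Prints a single MD5 hash that summarises every regular file under directory $1.
dir_md5() {
    local dir="$1"
    (
        # Work inside the directory so the recorded paths are relative ("./...").
        cd "$dir" || exit 1
        # 1. hash every file, 2. sort the "hash  path" lines by path so the
        # order is stable, 3. hash that listing and keep only the hash field.
        find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2 | md5sum | cut -d ' ' -f 1
    )
}
```

Calling dir_md5 path/to/dir prints one 32-character hash. Because the listing includes the relative paths, renaming a file changes the result just like editing its contents does.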
Checksums are calculated for files. Calculating the checksum for a directory requires recursively calculating the checksums for all the files in the directory. The -r option allows md5deep to recurse into sub-directories. The -l option enables displaying the relative path, instead of the default absolute path.
Open a terminal window and type the following command: md5sum [path to the file, including its name and extension]. Note: you can also drag the file into the terminal window instead of typing the full path. Press Enter, and the MD5 sum of the file is printed.
Most commonly, md5sum is used to verify that a file has not changed as a result of a faulty file transfer, a disk error or non-malicious modification. Every file in the GDC contains an md5sum to ensure file integrity.
md5deep
md5deep -r path/to/dir > sums.md5
find and md5sum
find relative/path/to/dir -type f -exec md5sum {} + > sums.md5
Be aware that when you check your MD5 sums with md5sum -c sums.md5, you need to run it from the same directory in which you generated the sums.md5 file. This is because find outputs paths relative to your current location, and those relative paths are what end up in sums.md5.
If this is a problem, you can make relative/path/to/dir absolute (e.g. by putting $PWD/ in front of your path). This way you can check sums.md5 from any location. The disadvantage is that sums.md5 now contains absolute paths, which makes it bigger and ties it to the directory's current location.
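As a rough sketch of the absolute-path variant: the helper name abs_sums below is hypothetical, and it resolves the directory with cd and pwd instead of prepending $PWD/ by hand, which should give the same result for a relative path.

```shell
#!/bin/bash
# abs_sums is a hypothetical helper name.
# Prints "hash  /absolute/path" lines for every regular file under directory $1.
abs_sums() {
    local dir
    # Resolve the argument to an absolute path; fail if it is not a directory.
    dir=$(cd "$1" && pwd) || return 1
    find "$dir" -type f -exec md5sum {} +
}
```

For example, abs_sums relative/path/to/dir > sums.md5 writes the list, and md5sum -c sums.md5 can then be run from any directory, as long as the files have not been moved.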
find and md5sum
You can put this function in your .bashrc file (located in your $HOME directory):
function md5sums {
    if [ "$#" -lt 1 ]; then
        echo -e "At least one parameter is expected\n" \
            "Usage: md5sums [OPTIONS] dir"
    else
        local OUTPUT="checksums.md5"
        local CHECK=false
        local MD5SUM_OPTIONS=""
        while [[ $# -gt 1 ]]; do # parse every argument except the last one (the directory)
            local key="$1"
            case $key in
                -c|--check)
                    CHECK=true
                    ;;
                -o|--output)
                    OUTPUT=$2
                    shift
                    ;;
                *)
                    MD5SUM_OPTIONS="$MD5SUM_OPTIONS $1"
                    ;;
            esac
            shift
        done
        local DIR=$1
        if [ -d "$DIR" ]; then # if $DIR directory exists
            cd "$DIR" || return # change to $DIR directory
            if [ "$CHECK" = true ]; then # if -c or --check option specified
                # $MD5SUM_OPTIONS is left unquoted on purpose so it splits into separate options
                md5sum --check $MD5SUM_OPTIONS "$OUTPUT" # check MD5 sums in $OUTPUT file
            else
                # calculate MD5 sums for files in the current directory and subdirectories,
                # excluding the $OUTPUT file, and save the result in $OUTPUT
                find . -type f ! -name "$OUTPUT" -exec md5sum $MD5SUM_OPTIONS {} + > "$OUTPUT"
            fi
            cd - > /dev/null # change to previous directory
        else
            cd "$DIR" # if $DIR doesn't exist, change to it to generate a localized error message
        fi
    fi
}
After you run source ~/.bashrc, you can use md5sums like a normal command:
md5sums path/to/dir
will generate a checksums.md5 file in the path/to/dir directory, containing the MD5 sums of all files in this directory and its subdirectories. Use:
md5sums -c path/to/dir
to check the sums recorded in the path/to/dir/checksums.md5 file.
Note that path/to/dir can be relative or absolute; md5sums will work fine either way. The resulting checksums.md5 file always contains paths relative to path/to/dir.
You can use a different file name than the default checksums.md5 by supplying the -o or --output option. All options other than -c, --check, -o and --output are passed to md5sum.
The first half of the md5sums function definition is responsible for parsing options. See this answer for more information about it. The second half contains explanatory comments.
How about:
find /path/you/need -type f -exec md5sum {} \; > checksums.md5
Update#1: Improved the command based on @twalberg's recommendation to handle white spaces in file names.
Update#2: Improved based on @jil's suggestion, to remove unnecessary xargs call and use -exec option of find instead.
Update#3: @Blake a naive implementation of your script would look something like this:
#!/bin/bash
# Usage: checksumchecker.sh <path>
find "$1" -type f -exec md5sum {} \; > "$1"__checksums.md5
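A slightly more defensive variant might look like the sketch below. It is only an illustration: the function name checksum_dir is made up, it assumes the argument is not the root directory /, and it strips a trailing slash so that passing "dir/" does not write the output file inside the directory being hashed.

```shell
#!/bin/bash
# checksum_dir is a hypothetical name.
# Usage: checksum_dir <directory>
checksum_dir() {
    if [ "$#" -ne 1 ] || [ ! -d "$1" ]; then
        echo "Usage: checksum_dir <directory>" >&2
        return 1
    fi
    # Strip one trailing slash so "dir/" and "dir" produce the same output name.
    local dir="${1%/}"
    # "+" batches many files into each md5sum call instead of one call per file.
    find "$dir" -type f -exec md5sum {} + > "${dir}__checksums.md5"
}
```

Calling checksum_dir Test1 would write Test1__checksums.md5 next to the Test1 directory, with one line per file.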