
What is an efficient way in C# of doing MD5 and download all at once?

I'm downloading a file and then checking its MD5 hash to make sure the download succeeded. The following code works, but it isn't the most efficient approach, especially for large files.

        using (var client = new System.Net.WebClient())
        {
            client.DownloadFile(url, destinationFile);
        }

        var fileHash = GetMD5HashAsStringFromFile(destinationFile);
        var successful = expectedHash.Equals(fileHash, StringComparison.OrdinalIgnoreCase);

My concern is that the bytes are all streamed through to disk, and then the MD5 ComputeHash() has to open the file and read all the bytes again. Is there a good, clean way of computing the MD5 as part of the download stream? Ideally, the MD5 should just fall out of the DownloadFile() function as a side effect of sorts. A function with a signature like this:

string DownloadFileAndComputeHash(string url, string filename, HashTypeEnum hashType);

Edit: Added the code for GetMD5HashAsStringFromFile()

    public string GetMD5HashAsStringFromFile(string filename)
    {
        using (FileStream file = File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (var md5er = System.Security.Cryptography.MD5.Create()) // MD5 is IDisposable
        {
            var md5HashBytes = md5er.ComputeHash(file);
            // "AB-CD-..." -> "abcd..."
            return BitConverter
                    .ToString(md5HashBytes)
                    .Replace("-", string.Empty)
                    .ToLowerInvariant();
        }
    }
davidpricedev asked Dec 06 '25 04:12


1 Answer

Is there a good, clean way of computing the MD5 as part of the download stream? Ideally, the MD5 should just fall out of the DownloadFile() function as a side effect of sorts.

You could compute the hash in chunks while the download is being written to disk. This avoids reading the file a second time and keeps memory pressure low:

  1. Open the response stream on the web client.
  2. Open the destination file stream.
  3. Repeat while there is data available:
    • Read a chunk from the response stream into a byte buffer.
    • Write it to the destination file stream.
    • Call TransformBlock to add the same bytes to the hash calculation.
  4. Call TransformFinalBlock to finish the calculation, then read the result from the Hash property.

The sample code below shows how this could be achieved.

using System;
using System.IO;
using System.Net;
using System.Security.Cryptography;

public static byte[] DownloadAndGetHash(Uri file, string destFilePath, int bufferSize)
{
    using (var md5 = MD5.Create())
    using (var client = new WebClient())
    using (var src = client.OpenRead(file))
    using (var dest = File.Create(destFilePath, bufferSize))
    {
        var buffer = new byte[bufferSize];
        int read;
        // Read a chunk, write it to disk, and feed the same bytes to the hash.
        while ((read = src.Read(buffer, 0, buffer.Length)) > 0)
        {
            dest.Write(buffer, 0, read);
            md5.TransformBlock(buffer, 0, read, null, 0);
        }
        // End of stream: finalize with an empty block, then read the digest.
        md5.TransformFinalBlock(buffer, 0, 0);
        return md5.Hash;
    }
}
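
If you want something closer to the signature in the question, the same loop can be split out so that it works on any pair of streams and returns the hex string directly. The following is only a sketch under a few assumptions: the class name HashingDownloader and the method CopyAndComputeHash are made up for illustration, and a HashAlgorithm instance is passed in where the question had a HashTypeEnum parameter. WebClient is kept to match the code above; on newer .NET versions it is marked obsolete and HttpClient is the usual replacement.

```csharp
using System;
using System.IO;
using System.Net;
using System.Security.Cryptography;

// Hypothetical helper class; the names are illustrative, not from any library.
public static class HashingDownloader
{
    // Copies src to dest in chunks while feeding the same bytes to the hash.
    // Returns the digest as a lowercase hex string.
    public static string CopyAndComputeHash(
        Stream src, Stream dest, HashAlgorithm hasher, int bufferSize = 81920)
    {
        var buffer = new byte[bufferSize];
        int read;
        while ((read = src.Read(buffer, 0, buffer.Length)) > 0)
        {
            dest.Write(buffer, 0, read);
            hasher.TransformBlock(buffer, 0, read, null, 0);
        }

        // Finalize with an empty block; the digest is then available in Hash.
        hasher.TransformFinalBlock(buffer, 0, 0);
        return BitConverter
            .ToString(hasher.Hash)
            .Replace("-", string.Empty)
            .ToLowerInvariant();
    }

    // Assumed wrapper shaped like the function asked for in the question.
    // The caller chooses the algorithm, e.g. MD5.Create() or SHA256.Create().
    public static string DownloadFileAndComputeHash(
        string url, string filename, HashAlgorithm hasher)
    {
        using (var client = new WebClient())
        using (var src = client.OpenRead(url))
        using (var dest = File.Create(filename))
        {
            return CopyAndComputeHash(src, dest, hasher);
        }
    }
}
```

With this in place, the check from the question would become a call to HashingDownloader.DownloadFileAndComputeHash(url, destinationFile, md5) inside a using (var md5 = MD5.Create()) block, followed by the same expectedHash.Equals(fileHash, StringComparison.OrdinalIgnoreCase) comparison. Keeping the copy loop separate from the download also lets you exercise it with in-memory streams.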
Alex answered Dec 07 '25 18:12