For my WPF project, I have to calculate the total size of the files in a single directory (which may have subdirectories).
Sample 1
DirectoryInfo di = new DirectoryInfo(path);
var totalLength = di.EnumerateFiles("*.*", SearchOption.AllDirectories).Sum(fi => fi.Length);
if (totalLength / 1000000 >= size)
    return true;
Sample 2
var sizeOfHtmlDirectory = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories);
long totalLength = 0;
foreach (var file in sizeOfHtmlDirectory)
{
    totalLength += new FileInfo(file).Length;
    if (totalLength / 1000000 >= size)
        return true;
}
Both samples work.
Sample 1 completes massively faster. I haven't timed this accurately, but on my PC, using the same folder with the same content and file sizes, Sample 1 takes a few seconds while Sample 2 takes a few minutes.
EDIT
I should point out that the bottleneck in Sample 2 is within the foreach loop! The GetFiles call returns quickly, and execution enters the foreach loop quickly.
My question is, how do I find out why this is the case?
EnumerateFiles(String, String, EnumerationOptions) Returns an enumerable collection of full file names that match a search pattern and enumeration options in a specified path, and optionally searches subdirectories. EnumerateFiles(String) Returns an enumerable collection of full file names in a specified path.
To calculate the size of a folder in C#, use the Directory.EnumerateFiles method to get the files, then sum their lengths.
To enumerate directories and files, use methods that return an enumerable collection of directory or file names, or their DirectoryInfo, FileInfo, or FileSystemInfo objects. If you want to search and return only the names of directories or files, use the enumeration methods of the Directory class.
Contrary to what the other answers indicate, the main difference is not EnumerateFiles vs GetFiles - it's DirectoryInfo vs Directory. In the latter case you only have strings and have to create new FileInfo instances separately, which is very costly.
DirectoryInfo returns FileInfo instances that use cached information, whereas directly creating new FileInfo instances does not - more details here and here.
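The difference described above can be checked with a rough timing comparison. The following is only a minimal sketch: the class name FolderSize and its method names are hypothetical, and it assumes a folder you have read access to (inaccessible subfolders would throw). It uses the "*" pattern rather than "*.*" purely to keep the match-everything intent unambiguous.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of the original question.
public static class FolderSize
{
    // The FileInfo objects are produced by the directory enumeration itself,
    // so Length is read from data that was already fetched.
    public static long ViaDirectoryInfo(string path) =>
        new DirectoryInfo(path)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .Sum(fi => fi.Length);

    // Only path strings come back, so every new FileInfo has to
    // query the file system again before Length is available.
    public static long ViaFileNames(string path)
    {
        long total = 0;
        foreach (var file in Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories))
            total += new FileInfo(file).Length;
        return total;
    }

    // Runs one of the two measurements and reports how long it took.
    public static (long Bytes, TimeSpan Elapsed) Time(Func<string, long> measure, string path)
    {
        var sw = Stopwatch.StartNew();
        long bytes = measure(path);
        sw.Stop();
        return (bytes, sw.Elapsed);
    }

    public static void Main(string[] args)
    {
        // Assumption: the folder to measure is passed as the first argument.
        string path = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();

        var viaInfo = Time(ViaDirectoryInfo, path);
        var viaNames = Time(ViaFileNames, path);

        Console.WriteLine($"DirectoryInfo: {viaInfo.Bytes} bytes in {viaInfo.Elapsed.TotalMilliseconds} ms");
        Console.WriteLine($"File names:    {viaNames.Bytes} bytes in {viaNames.Elapsed.TotalMilliseconds} ms");
    }
}
```

Whichever method runs first also pays for warming the operating system's file cache, so it is worth running the program more than once, or swapping the order, before drawing conclusions from the numbers.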
Relevant quote (via "The Old New Thing"):
In NTFS, file system metadata is a property not of the directory entry but rather of the file, with some of the metadata replicated into the directory entry as a tweak to improve directory enumeration performance. Functions like FindFirstFile report the directory entry, and by putting the metadata that FAT users were accustomed to getting "for free", they could avoid being slower than FAT for directory listings. The directory-enumeration functions report the last-updated metadata, which may not correspond to the actual metadata if the directory entry is stale.