
PowerShell Script Efficiency

I use PowerShell as much as possible for quick and easy scripting tasks. At work I often use it for parsing data, sifting through log files, or creating CSV and text files.

One thing I can't figure out is why certain data and I/O tasks can be so slow in PowerShell. I assume it has to do with how it handles pipelines under the hood, or with something I haven't understood yet.

Take the following logic, which generates ABC123-style IDs (three letters followed by three digits). If you compile it as C# from within PowerShell and execute it, it takes less than 1 minute to complete:

$source = @'
    public static System.Collections.Generic.List<String> GetIds()
    {
        System.Collections.Generic.List<String> retValue = new System.Collections.Generic.List<String>();
        for (int left = 97; left < 123; left++)
        {
            for (int middle = 97; middle < 123; middle++)
            {
                for (int right = 97; right < 123; right++)
                {
                    for (int i = 1; i < 1000; i++)
                    {
                        String tmp = String.Format("{0}{1}{2}000", (char)left, (char)middle, (char)right);
                        retValue.Add(String.Format("{0}{1}", tmp.Substring(0, tmp.Length - i.ToString().Length), i));
                    }
                }
            }
        }
        return retValue;
    }
'@
$util = Add-Type -Name "Utils" -MemberDefinition $source -PassThru -Language CSharp

$start = get-date
$ret = $util::GetIds()
Write-Host ("Time: {0} minutes" -f ((get-date)-$start).TotalMinutes)

Now take the same logic and run it as plain PowerShell, without compiling it into an assembly, and it takes hours to complete:

$start = Get-Date
$retValue = @()
for ($left = 97; $left -lt 123; $left++)
{ 
    for ($middle = 97; $middle -lt 123; $middle++)
    { 
        for ($right = 97; $right -lt 123; $right++)
        { 
            for ($i = 1; $i -lt 1000; $i++)
            { 
                $tmp = ("{0}{1}{2}000" -f [char]$left, [char]$middle, [char]$right)
                $retValue += ("{0}{1}" -f $tmp.Substring(0, $tmp.Length - $i.ToString().Length), $i)
            }
        }
    }
}
Write-Host ("Time: {0} minutes" -f ((get-date)-$start).TotalMinutes)

Why is that? Is there some sort of excessive type casting or inefficient operation I am using that slows down performance?

The Unique Paul Smith asked Mar 25 '26 01:03


1 Answer

You're killing your performance right here:

$retValue += ("{0}{1}" -f $tmp.Substring(0, $tmp.Length - $i.ToString().Length), $i)

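Array addition is a very expensive operation. Arrays are fixed-size, so each += creates a brand-new array, copies every element of the original into it, and then adds the new element. The cost of each addition therefore grows with the size of the array, and this loop performs 26 × 26 × 26 × 999 = 17,558,424 of them.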
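To see the effect in isolation, here is a minimal sketch that times the += pattern against a generic List, which grows its internal buffer instead of reallocating on every append. The element count `$n` and the `id$i` strings are made up for illustration, and `::new()` assumes PowerShell 5 or later. Actual timings depend on the machine and the PowerShell version, so no numbers are claimed here.

```powershell
# Hypothetical element count, kept small so the comparison finishes quickly.
$n = 20000

# Pattern from the question: every += allocates a new array and copies all existing elements.
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$array = @()
for ($i = 0; $i -lt $n; $i++) { $array += "id$i" }
$arrayMs = $timer.Elapsed.TotalMilliseconds

# Alternative: List.Add appends in place (amortized constant time per element).
$timer.Restart()
$list = [System.Collections.Generic.List[string]]::new()
for ($i = 0; $i -lt $n; $i++) { $list.Add("id$i") }
$listMs = $timer.Elapsed.TotalMilliseconds

"Array += : {0:N0} ms for {1} items" -f $arrayMs, $array.Count
"List.Add : {0:N0} ms for {1} items" -f $listMs, $list.Count
```

Doubling `$n` should roughly quadruple the += time, while the List time should only about double, which is why the gap becomes hours at 17.5 million elements.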

Edit: This kind of array addition is not only inefficient but also unnecessary. Simply output the values to the pipeline and assign the result of the whole loop back to the variable:

$start = Get-Date
$retValue =
for ($left = 97; $left -lt 123; $left++)
{ 
    for ($middle = 97; $middle -lt 123; $middle++)
    { 
        for ($right = 97; $right -lt 123; $right++)
        { 
            for ($i = 1; $i -lt 1000; $i++)
            { 
                $tmp = ("{0}{1}{2}000" -f [char]$left, [char]$middle, [char]$right)
                "{0}{1}" -f $tmp.Substring(0, $tmp.Length - $i.ToString().Length), $i
            }
        }
    }
}
Write-Host ("Time: {0} minutes" -f ((get-date)-$start).TotalMinutes)
Time: 1.866812045 minutes
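If a growable, strongly typed collection is preferred over the Object[] that collected loop output produces, a generic List is another option. The following is only a sketch under stated assumptions: the function name `Get-IdList` and its `-LetterCount` parameter are hypothetical, introduced so the example finishes in seconds (26 would reproduce the full set from the question), and the "D3" format string is swapped in for the Substring-based zero padding.

```powershell
# Hypothetical helper: builds IDs for the first $LetterCount letters only.
function Get-IdList {
    param([int]$LetterCount = 2)

    $ids = [System.Collections.Generic.List[string]]::new()
    $last = 97 + $LetterCount   # 97 is the character code of 'a'
    for ($left = 97; $left -lt $last; $left++) {
        for ($middle = 97; $middle -lt $last; $middle++) {
            for ($right = 97; $right -lt $last; $right++) {
                # Build the three-letter prefix once per combination
                $prefix = "{0}{1}{2}" -f [char]$left, [char]$middle, [char]$right
                for ($i = 1; $i -lt 1000; $i++) {
                    # "D3" zero-pads the number to three digits, e.g. 7 becomes 007
                    $ids.Add($prefix + $i.ToString("D3"))
                }
            }
        }
    }
    # The leading comma keeps PowerShell from unrolling the list on return
    return ,$ids
}

$ids = Get-IdList -LetterCount 2
$ids.Count               # 2 * 2 * 2 * 999 = 7992
$ids[0]                  # aaa001
$ids[$ids.Count - 1]     # bbb999
```

For a one-off script the pipeline assignment above is simpler; the List variant mainly helps when items need to be added or removed later.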
mjolinor answered Mar 26 '26 14:03


