I have a csv file that contains approx 250,000 rows. The first of the headers on this csv file is "Campaign Ref Code". I also have a list of campaign reference codes that I need to look up in this first column (ie COLMABQ140, COLMABQ141). If I find this campaign ref code in the first column I want to output the information for this campaign code (ie address, date contacted etc) into a separate csv file. I have a number of these codes and need to produce a number of separate files specific to each separate campaign code. I have actually managed to get a working script for this BUT it takes over an hour to run and I'd like to find a quicker way of processing this file. Basically, I create an array of the campaign codes then loop through them using Import-csv & exporting to a separate file using Export-csv if I find a match (see a snippet of the code below - there are 20 different $refs arrays & corresponding for loops). Like I say, it does work perfectly well in as much as it gets the information I need in a format I specified but there has to be a quicker way of doing things...even if the original file is a quarter of a million rows!!
$file = 'Orig.csv'
$newfile = "File1.$today.csv"
$refs = @('COLMABQ140','COLMABQ141','COLMABQ142','COLMABQ143','COLMABQ144','COLMABQ176','COLMABQ177','COLMABQ178','COLMABQ179','COLMABQ180')
foreach ($ref in $refs) {
Import-csv $file | Where-Object {$_.'Campaign Ref Code' -like "$ref"} | Export-CSV -notype -Path $workdir\$newfile -Append
}
$newfile = "File2.$today.csv"
$refs = @('COLMABP140','COLMABP141','COLMABP142','COLMABP143','COLMABP144','COLMABP176','COLMABP177','COLMABP178','COLMABP179','COLMABP180')
foreach ($ref in $refs) {
Import-csv $file | Where-Object {$_.'Campaign Ref Code' -like "$ref"} | Export-CSV -notype -Path $workdir\$newfile -Append
}
$newfile = "File3.$today.csv"
$refs = @('COLMABS140','COLMABS141','COLMABS142','COLMABS143','COLMABS144','COLMABS176','COLMABS177','COLMABS178','COLMABS179','COLMABS180')
foreach ($ref in $refs) {
Import-csv $file | Where-Object {$_.'Campaign Ref Code' -like "$ref"} | Export-CSV -notype -Path $workdir\$newfile -Append
}
The performance issues with your script are:
Orig.csv input file.$refs values to test for.-eq operator is probably slightly faster than the -like operator-like operator as you don't have any wildcards in the values.)When the input of an operator is a scalar value, the operator returns a Boolean value. When the input is a collection, the operator returns the elements of the collection that match the right-hand value of the expression. If there are no matches in the collection, comparison operators return an empty array.
$workdir = ...
$refs =@{
"File1.csv" = 'COLMABQ140','COLMABQ141','COLMABQ142','COLMABQ143','COLMABQ144','COLMABQ176','COLMABQ177','COLMABQ178','COLMABQ179','COLMABQ180'
"File2.csv" = 'COLMABP140','COLMABP141','COLMABP142','COLMABP143','COLMABP144','COLMABP176','COLMABP177','COLMABP178','COLMABP179','COLMABP180'
"File3.csv" = 'COLMABS140','COLMABS141','COLMABS142','COLMABS143','COLMABS144','COLMABS176','COLMABS177','COLMABS178','COLMABS179','COLMABS180'
}
$Pipelines = @{}
Import-csv .\Orig.csv |ForEach-Object -Begin {
foreach ($file in $refs.keys) {
$Pipelines[$file] = { Export-CSV -notype -Path "$workdir\$file" }.GetSteppablePipeline()
$Pipelines[$file].Begin($True)
}
} -Process {
foreach ($file in $refs.keys) {
if ($refs[$file] -eq $_.'Campaign Ref Code') { $Pipelines[$file].Process($_) }
}
} -End {
foreach ($file in $refs.keys) {
$Pipelines[$file].End()
}
}
Mastering the (steppable) pipeline
The power of the PowerShell Pipeline is often misunderstood and underestimated (especially by experienced programmers of languages that don't have a similar feature). Therefore, I have written a PowerShell community blogpost: Mastering the (steppable) pipeline.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With