I have hundreds of files to process. Each file contains millions of rows.
Sample file content:
---------------
12
3
5
---------------
8
0
5
---------------
1
5
56
4
---------------
I need output that looks like this (the sum of the numbers in each dash-delimited block of the file above):
20
13
66
I used while, if, else in conjunction with awk but if/else dramatically slows down the processing.
Any ideas how to use pure awk to speed up calculations?
You don't need if/else blocks; awk's pattern-action rules do the branching:
$ awk 'FNR>1 && /^----/ {print sum; sum=0; next} {sum+=$1}' file{1,2}
20
13
66
20
13
66
This is the output for two copies of your input, file1 and file2. Perhaps you'll run them one at a time; if you pass multiple inputs at once, you may want a prefix before the sums, for example
$ awk 'FNR==1{block=0} FNR>1 && /^----/ {print FILENAME, ++block, sum; sum=0; next}
{sum+=$1}' file{1,2}
file1 1 20
file1 2 13
file1 3 66
file2 1 20
file2 2 13
file2 3 66
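Both commands above rely on every file ending with a dash line, as in your sample. If some files might not, a variant along these lines could flush the last open block as well. This is only a sketch: the function name `sum_blocks` and the `have` flag are my own illustrative names, and it assumes separators start with `----` and each data row holds one number in the first field.

```shell
# sum_blocks (hypothetical helper): print one sum per dash-delimited block.
# Assumes separator lines start with "----" and data rows are single numbers.
sum_blocks() {
    awk '
        # A new file starts while a block is still open: flush it first.
        FNR == 1 && NR > 1 && have { print sum; sum = 0; have = 0 }
        # A separator closes the current block, if it contained any rows.
        /^----/ && have { print sum }
        /^----/ { sum = 0; have = 0; next }
        # Any other line is a data row.
        { sum += $1; have = 1 }
        # Flush a block left open at the end of the last file.
        END { if (have) print sum }
    ' "$@"
}
```

Calling it as `sum_blocks file*` handles all files in a single awk process, which avoids starting one awk per file from a shell loop.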