
awk and sum rows for large files

I have hundreds of files to process. Each file contains millions of rows.

Sample file content:

---------------
12
3
5
---------------
8
0
5
---------------
1
5
56
4
---------------

I need output that looks like the following (the sum of each dash-separated block of numbers in the file above):

20
13
66

I used a shell while loop with if/else in conjunction with awk, but the if/else logic dramatically slows down the processing.

Any ideas how to use pure awk to speed up calculations?

Tasior_Miedziak asked Dec 07 '25 14:12

1 Answer

You don't need if/else blocks:

$ awk 'FNR>1 && /^----/ {print sum; sum=0; next} {sum+=$1}' file{1,2} 
20
13
66
20
13
66
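Note that this script relies on every block ending with a `----` separator line, as in the sample. A hedged variant (not from the answer) that also prints the final block when a file lacks a trailing separator, using a `seen` flag and an `END` rule; `data.txt` is a hypothetical file name:

```shell
# Sketch: print each block's sum even if the file does not end
# with a "----" separator line. "data.txt" is a hypothetical name.
awk '/^----/ { if (seen) print sum; sum = 0; seen = 0; next }
            { sum += $1; seen = 1 }
     END    { if (seen) print sum }' data.txt
```

The `seen` flag also suppresses spurious output for the leading separator line and an empty final block.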

This was run on file1 and file2, two copies of your sample input. You'll likely process the files one at a time; alternatively, for multiple inputs you can print a prefix before each sum, for example:

$ awk 'FNR==1{block=0} FNR>1 && /^----/ {print FILENAME, ++block, sum; sum=0; next} 
                                        {sum+=$1}' file{1,2} 

file1 1 20
file1 2 13
file1 3 66
file2 1 20
file2 2 13
file2 3 66
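With hundreds of files, a single awk invocation over all of them is much faster than spawning one awk process per file from a shell loop. A sketch assuming the inputs match a hypothetical `*.txt` glob:

```shell
# One awk process reads every file; FNR==1 resets the per-file
# block counter. "*.txt" and "sums.out" are hypothetical names.
awk 'FNR == 1           { block = 0 }
     FNR > 1 && /^----/ { print FILENAME, ++block, sum; sum = 0; next }
                        { sum += $1 }' *.txt > sums.out
```

If the glob expands past the shell's argument-length limit, you can stream the file list instead, e.g. `find . -name '*.txt' -print0 | xargs -0 awk -f script.awk` (awk may then be invoked in batches, which is fine here since FNR restarts per file).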
karakfa answered Dec 11 '25 09:12

