Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to count values between empty cells

Tags:

awk

I'm facing one problem which is bigger than me. I have 18 relative large text files (ca 30k lines each) and I need to count the values between the empty cells in the second column. Here is a simple example of my file:

Metabolism
line_1    10.2
line_2    10.1
line_3    10.3
TCA_cycle
line_4    10.7
line_5    10.8
Pyruvate_metab
line_6   100.8

In reality, I have circa 500 description lines (Metabolism, TCA_cycle, etc.) and the range of lines is between zero to a few hundred.

I would like to count values for each block (block starts with a description and corresponding lines are always below), e.g.

Metabolism  30.6
line_1    10.2
line_2    10.1
line_3    10.3
TCA_cycle   21.5
line_4    10.7
line_5    10.8
Pyruvate_metab 100.8
line_6   100.8

Or just

30.3
21.5
100.8

It won't be a problem if results will be printed line by line into an additional file... Or another alternative way.

There is one tricky thing and it's descriptions without lines with numbers.

Transport
line_1000   100.1
line_1001   100.2
Cell_signal
Motility
Processing
Translation
line_1002   500.1
line_1003   200.2

And even for those lines and would like to get 0 value.

Transport     200.3
line_1000   100.1
line_1001   100.2
Cell_signal   0
Motility      0
Processing    0
Translation   700.3
line_1002   500.1
line_1003   200.2

The rest of the file looks same and it's consistent - 2 columns, tab separators, descriptions in the first column, values in the second, no spaces (only underlines).

Actually I have no experience with more sophisticated coding so I really don't know how to solve it in the command line. I've already tried some Excel ways but it was painful and unsuccessful.

like image 700
Jirka Avatar asked Jul 12 '20 17:07

Jirka


People also ask

How do I use Countif to empty cells?

Because you want to count empty cells, you can use a blank text string as your criteria. To use COUNTIF, open your Google Sheets spreadsheet and click on a blank cell. Type =COUNTIF(range,"") , replacing range with your chosen cell range.

Can you use Isblank for a range of cells?

We can use the ISBLANK coupled with conditional formatting. For example, suppose we want to highlight the blank cells in the range A2:F9, we select the range and use a conditional formatting rule with the following formula: =ISBLANK(A2:F9).


1 Answers

With tac and any awk:

tac file | awk 'NF==2{sum+=$2; print; next} {print $1 "\t" sum; sum=0}' | tac

With two improvements proposed by kvantour and Ed Morton. See the comments.

tac file | awk '($NF+0==$NF){sum+=$2; print; next} {print $1 "\t" sum+0; sum=0}' | tac

See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

like image 92
Cyrus Avatar answered Oct 16 '22 19:10

Cyrus