 

Identify individual entries in a column in Linux

Tags:

linux

awk

I want to display individual entries from multiple rows based on a column value. In the example below, I want only the user from column 4 who has the most PD entries in column 5, together with the count of that user's entries, shown in column 7. Example input:

column 4  column 5    column 7
abc           PD      8
xyz           PD      1
abc           PD      2
xyz           PD      7
xyz           PD      3
xyz           R       1

Expected output:

column 4  column 5    column 7
xyz           PD      3

I tried using the squeue command, which I use to find job and user information, and subsetting on a specific column to find the user for whom the number of PD entries is highest.

squeue | awk '($5 == "PD")'| awk '{a[$4]+=$7} END{for(i in a) print i,$5,a[i]}'| sort -r -k 3,3| head -n1


squeue | awk '($5 == "PD")'| uniq -r -k 3,3 | head -n1

I am not getting the required answer.

Krithika Krishnan Avatar asked Dec 02 '25 21:12



1 Answer

Could you please try the following and let me know if it helps.

awk 'FNR==1{print;next}{a[$1,$2]++} END{for(i in a){b[a[i]]=i;val=val>a[i]?(val?val:a[i]):a[i]};print b[val]"\t"val}' SUBSEP="\t"  Input_file

Output will be as follows.

column 4  column 5    column 7
xyz     PD      3

Explanation: here is a non-one-liner form of the solution, with comments:

awk '
FNR==1{                                ##FNR==1 is true only for the very first line of Input_file (the header).
 print;                                ##Print the header line as it is.
 next                                  ##next skips all further statements for this line.
}
{
a[$1,$2]++                             ##Array a is indexed by column 1 and column 2 joined with SUBSEP; each line with the same pair increases its count.
}
END{
 for(i in a){                          ##Traverse all elements of array a.
   b[a[i]]=i;                          ##Array b maps each count back to its index i; pairs with equal counts overwrite each other.
   val=val>a[i]?(val?val:a[i]):a[i]};  ##Keep val as the largest count seen so far, i.e. the MAX number of occurrences.
   print b[val]"\t"val                 ##Print the pair stored for the maximum count, then a TAB, then the count itself.
}
' SUBSEP="\t" Input_file               ##SUBSEP is set to TAB so the joined index prints as column 1, TAB, column 2; Input_file is the input file.
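Note that the command above counts every (column 1, column 2) pair, not only the PD rows, and that pairs with equal counts overwrite each other in array b. As a rough sketch of a variant that counts only PD rows, the following assumes the user is in field 1 and the state in field 2, as in the sample data with the header line left out. The names input, count, max and user are illustrative only. On real squeue output the question suggests these would be $4 and $5 instead, which I have not verified.

```shell
# Sample rows copied from the question (user, state, value), header omitted.
input='abc PD 8
xyz PD 1
abc PD 2
xyz PD 7
xyz PD 3
xyz R 1'

# Count PD rows per user and track the largest count while reading,
# so a tie resolves to the user who reached that count first.
result=$(printf '%s\n' "$input" | awk '
$2 == "PD" {
    n = ++count[$1]                       # PD rows seen so far for this user
    if (n > max) { max = n; user = $1 }   # remember the current leader
}
END { if (user != "") print user, "PD", max }
')

echo "$result"
```

Tracking the maximum while reading avoids the second array and makes the tie behaviour predictable, at the cost of no longer printing the header line.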
RavinderSingh13 Avatar answered Dec 05 '25 12:12
