I have a list of student records, grades, that I want to sort by GPA, returning the top 5 results. For some reason count<=7 and below cuts off the top result. I can't figure out why that is.
Also, is there a more elegant way to remove the first column after sorting than piping the results from sort back into awk?
user@machine:~> awk '{ if (count<=7) print $3, $0; count++; }' grades | sort -nr | awk '{ print $2 " " $3 " " $4 " " $5 }'
Ahmad Rashid 3.74 MBA
James Davis 3.71 ECE
Sam Chu 3.68 ECE
John Doe 3.54 ECE
Arun Roy 3.06 SS
James Adam 2.77 CS
Al Davis 2.63 CS
Rick Marsh 2.34 CS
user@machine:~> awk '{ if (count<=8) print $3, $0; count++; }' grades | sort -nr | awk '{ print $2 " " $3 " " $4 " " $5 }'
Art Pohm 4.00 ECE
Ahmad Rashid 3.74 MBA
James Davis 3.71 ECE
Sam Chu 3.68 ECE
John Doe 3.54 ECE
Arun Roy 3.06 SS
James Adam 2.77 CS
Al Davis 2.63 CS
Rick Marsh 2.34 CS
grades:
John Doe 3.54 ECE
James Davis 3.71 ECE
Al Davis 2.63 CS
Ahmad Rashid 3.74 MBA
Sam Chu 3.68 ECE
Arun Roy 3.06 SS
Rick Marsh 2.34 CS
James Adam 2.77 CS
Art Pohm 4.00 ECE
John Clark 2.68 ECE
Nabeel Ali 3.56 EE
Tom Nelson 3.81 ECE
Pat King 2.77 SS
Jake Zulu 3.00 CS
John Lee 2.64 EE
Sunil Raj 3.36 ECE
Charles Right 3.31 EECS
Diane Rover 3.87 ECE
Aziz Inan 3.75 EECS
Lu John 3.06 CS
Lee Chow 3.74 EE
Adam Giles 2.54 SS
Andy John 3.98 EECS
You actually do not need awk in this case. Unix sort can sort numerically on a given column.
Given your input:
$ sort -k 3 -nr grades
Art Pohm 4.00 ECE
Andy John 3.98 EECS
Diane Rover 3.87 ECE
Tom Nelson 3.81 ECE
Aziz Inan 3.75 EECS
Lee Chow 3.74 EE
Ahmad Rashid 3.74 MBA
James Davis 3.71 ECE
Sam Chu 3.68 ECE
Nabeel Ali 3.56 EE
John Doe 3.54 ECE
Sunil Raj 3.36 ECE
Charles Right 3.31 EECS
Lu John 3.06 CS
Arun Roy 3.06 SS
Jake Zulu 3.00 CS
Pat King 2.77 SS
James Adam 2.77 CS
John Clark 2.68 ECE
John Lee 2.64 EE
Al Davis 2.63 CS
Adam Giles 2.54 SS
Rick Marsh 2.34 CS
Then just use head:
$ count=7
$ sort -k 3 -nr grades | head -n $count
Art Pohm 4.00 ECE
Andy John 3.98 EECS
Diane Rover 3.87 ECE
Tom Nelson 3.81 ECE
Aziz Inan 3.75 EECS
Lee Chow 3.74 EE
Ahmad Rashid 3.74 MBA
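As for why your original command loses Art Pohm: the awk filter runs before sort, so count<=7 passes only the first eight lines of the file to sort (count starts at 0), and Art Pohm is on line nine. Sorting first and limiting afterwards avoids that. Here is a minimal sketch of the difference. The helper name top_by_gpa and the temporary sample file (the first nine records of grades) are made up for illustration, and LC_ALL=C is an assumption to keep the decimal point handling predictable. Note that -k3,3 restricts the key to field 3, whereas plain -k 3 runs from field 3 to the end of the line.

```shell
# Hypothetical sample: the first nine records of the grades file from the question.
grades_sample=$(mktemp)
cat > "$grades_sample" <<'EOF'
John Doe 3.54 ECE
James Davis 3.71 ECE
Al Davis 2.63 CS
Ahmad Rashid 3.74 MBA
Sam Chu 3.68 ECE
Arun Roy 3.06 SS
Rick Marsh 2.34 CS
James Adam 2.77 CS
Art Pohm 4.00 ECE
EOF

# top_by_gpa FILE N (hypothetical helper):
# sort on field 3 only, numeric and descending, then keep the first N lines.
top_by_gpa() {
    LC_ALL=C sort -k3,3nr "$1" | head -n "$2"
}

# Limiting before sorting: only lines 1-8 reach sort, so line 9 is never ranked.
head -n 8 "$grades_sample" | LC_ALL=C sort -k3,3nr | head -n 1
# prints: Ahmad Rashid 3.74 MBA

# Sorting before limiting: the whole file is ranked first.
top_by_gpa "$grades_sample" 1
# prints: Art Pohm 4.00 ECE
```

Because sort works on the third field directly, there is no extra first column to add and strip again.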
If you want to use gawk, you can store the lines in an array and sort the array indices with a custom comparison function. You might do something along these lines:
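awk -v count=7 '
# Requires gawk: asorti with a custom comparison function is a gawk extension.
# Order by value (the GPA), highest first.
function sort_by_num(i1, v1, i2, v2) {
    return (v2 - v1)
}
{
    lines[NR] = $0    # full record, keyed by line number
    idx[NR] = $3      # GPA, keyed by the same line number
}
END {
    # si[1..NR] receives the line numbers ordered by descending GPA
    asorti(idx, si, "sort_by_num")
    for (n = 1; n <= count; ++n) {
        print lines[si[n]]
    }
}' grades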
Art Pohm 4.00 ECE
Andy John 3.98 EECS
Diane Rover 3.87 ECE
Tom Nelson 3.81 ECE
Aziz Inan 3.75 EECS
Ahmad Rashid 3.74 MBA
Lee Chow 3.74 EE
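Note the difference in order between sort and the gawk function for the last two lines, which share a GPA of 3.74. In your comparison function you would need to define how records with the same GPA should be ordered. In this run gawk kept the tied records in input order, but the order of elements that compare equal is not guaranteed unless the function breaks ties itself. sort, on the other hand, breaks ties with a last-resort comparison of the whole line. (If you add the -s switch to sort, which disables that last-resort comparison, its output is identical to the gawk output above.)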
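To make the tie handling concrete, here is a small sketch with sort alone. The three-line sample file is made up from records in grades, and LC_ALL=C is an assumption so that the byte-wise comparisons behave the same everywhere. The last command shows one possible explicit rule (last name ascending as a second key); pick whatever rule suits you.

```shell
# Hypothetical sample: two records tied at 3.74 plus one higher GPA.
ties=$(mktemp)
cat > "$ties" <<'EOF'
Ahmad Rashid 3.74 MBA
Aziz Inan 3.75 EECS
Lee Chow 3.74 EE
EOF

# Default: tied keys fall back to a whole-line comparison, reversed by -r,
# so "Lee Chow" comes before "Ahmad Rashid".
LC_ALL=C sort -k3,3 -nr "$ties"

# -s: stable sort, tied lines keep their input order (Ahmad Rashid first).
LC_ALL=C sort -s -k3,3 -nr "$ties"

# Explicit tie-break: GPA descending, then last name (field 2) ascending.
LC_ALL=C sort -k3,3nr -k2,2 "$ties"
```

The same idea carries over to the gawk version: have the comparison function return the difference of the indices (i1 - i2) when v1 and v2 are equal if you want input order to be guaranteed.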