I have
chr pos C T A G
NC_044998.1 3732 21 0 0 0
NC_044998.1 3733 22 0 2 0
NC_044998.1 3734 22 0 5 0
NC_044998.1 3735 22 0 0 0
NC_044998.1 3736 0 0 7 0
NC_044998.1 3737 0 0 0 22
NC_044998.1 3738 20 0 0 0
NC_044998.1 3739 1 0 22 0
NC_044998.1 3740 0 22 0 0
NC_044998.1 3741 22 0 0 0
I need to output the max value among $3 to $6 on each line, together with the name of the column that holds it,
so that I get:
chr pos max ref
NC_044998.1 3732 21 C
NC_044998.1 3733 22 C
NC_044998.1 3734 22 C
NC_044998.1 3735 22 C
NC_044998.1 3736 7 A
NC_044998.1 3737 22 G
NC_044998.1 3738 20 C
NC_044998.1 3739 22 A
NC_044998.1 3740 22 T
NC_044998.1 3741 22 C
I'm trying to adapt this:
awk 'NR == 1 {for (c = 3; c <= NF; i++) headers[c] = $c; next} {maxc=3;for(c=4;c<=NF;c++)if($c>$maxc){maxc=c} printf "max:%s, %s\n", $maxc, headers[maxc]}'
but it only outputs the max value.
I have also tried:
awk '{maxc=3;for(c=4;c<=NF;c++)if($c>$maxc){maxc=c; $maxc = headers[c]} printf "max:%s, column:%s, column:%s\n",$maxc, maxc, headers[maxc]}'
Another issue I'm trying to figure out is how to handle a tie between two or more columns. In that case I would like to print the max and the names of all the columns that share it.
With your shown samples, please try the following awk code, written and tested in GNU awk.
awk -v startField="3" -v endField="6" '
FNR==1{
for(i=startField;i<=endField;i++){
heading[i]=$i
}
next
}
{
max=maxInd=""
for(i=startField;i<=endField;i++){
maxInd=(max<$i?i:maxInd)
max=(max<$i?$i:max)
}
NF=(startField-1)
print $0,max,heading[maxInd]
}
' Input_file
Advantages of this approach:
The startField and endField variables are set on the command line, so the range of columns to scan can be changed without touching the main awk code.
Detailed explanation of the above:
awk -v startField="3" -v endField="6" ' ##Starting awk program and setting startField and endField to values on which user wants to look for maximum values.
FNR==1{ ##Checking condition if this is first line of Input_file.
for(i=startField;i<=endField;i++){ ##Traversing through only those fields which user needs to get max value.
heading[i]=$i ##Creating array heading whose index is i and value is current field value.
}
next ##next will skip all further statements from here.
}
{
max=maxInd="" ##Nullifying max and maxInd variables here.
for(i=startField;i<=endField;i++){ ##Traversing through only those fields which user needs to get max value.
maxInd=(max<$i?i:maxInd) ##Setting maxInd to the current field number if the current field value is greater than max, else keeping maxInd as it is.
max=(max<$i?$i:max) ##Getting max variable to current field value if current field value is greater than max else keep it as max itself.
}
NF=(startField-1) ##Setting NF(number of fields of current line) to startField-1 here.
print $0,max,heading[maxInd] ##Printing the current line (now only its first startField-1 fields), then max, then the heading array value whose index is maxInd.
}
' Input_file ##Mentioning Input_file name here.
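The tie case raised in the question is not handled by the code above, which keeps only the first column that reaches the maximum. The following is a minimal sketch of one way to handle it, not a tested drop-in replacement: it joins the names of all tied columns with a comma (the separator is an assumption, the question does not specify one) and prints a "chr pos max ref" header. The input is fed through a here-document for illustration; the last row (position 3742) is made up to show a tie and is not part of the original data.

```shell
# Sketch: print the max of fields startField..endField and every column name that reaches it.
# The comma separator and the 3742 row are assumptions added for illustration.
result=$(awk -v startField=3 -v endField=6 '
FNR == 1 {
    # Remember the column names and print the new header.
    for (i = startField; i <= endField; i++) heading[i] = $i
    print $1, $2, "max", "ref"
    next
}
{
    # Start from the first counted field; adding 0 forces a numeric value.
    max = $startField + 0
    names = heading[startField]
    for (i = startField + 1; i <= endField; i++) {
        if ($i + 0 > max) {            # new maximum: restart the list of names
            max = $i + 0
            names = heading[i]
        } else if ($i + 0 == max) {    # tie: append this column name
            names = names "," heading[i]
        }
    }
    print $1, $2, max, names
}
' <<'EOF'
chr pos C T A G
NC_044998.1 3732 21 0 0 0
NC_044998.1 3733 22 0 2 0
NC_044998.1 3736 0 0 7 0
NC_044998.1 3739 1 0 22 0
NC_044998.1 3742 5 5 0 0
EOF
)
printf '%s\n' "$result"
```

With this input the made-up tie row is printed as "NC_044998.1 3742 5 C,T", while the other rows list a single column name as before. To run it on a real file, replace the here-document with the file name after the closing quote of the awk program.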