I have a text file called file.txt like the one below:
01_ABC_0000 AA
02_CDE_0000 BB
03_EFG_0000 CC
04_ABC_0001 DD
05_CDE_0001 EE
06_EFG_0001 FF
It should be separated into two different files, file0.txt:
01_ABC_0000 AA
02_CDE_0000 BB
03_EFG_0000 CC
and file1.txt
04_ABC_0001 DD
05_CDE_0001 EE
06_EFG_0001 FF
What I have been trying:
cat file00.txt | awk '{print $1}' | sed 's/.*\(....\)/\1/'
This gets only the numbers from the first word, but I am not able to use it to go further and separate the lines into the two files.
Any help is much appreciated.
EDIT: Sorry, I have edited the question: the first field now looks like 01_ABC_0000. If I use underscore as the field separator, it doesn't work as expected.
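1st solution: This assumes your entries are already sorted by the trailing number of the first column (0000, 0001 and so on). With -F'_| +', a line such as 01_ABC_0000 AA splits into the fields 01, ABC, 0000 and AA, so that number is the 3rd field. With your shown samples, please try the following awk program.
awk -v count="0" -F'_| +' '
# Start a new output file on the first line and whenever field 3 changes.
# NR==1 is needed because an unset prev compares equal to 0000 numerically.
NR==1 || prev!=$3{
  if(outputFile!=""){ close(outputFile) }
  outputFile=("file" (count++) ".txt")   # file0.txt, file1.txt, ...
  prev=$3
}
{
  print > (outputFile)
}
' Input_file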
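To check how the field separator splits the edited sample, here is a minimal sketch that recreates the question's data in a scratch directory and groups on the 3rd field. The scratch directory and the `mktemp -d` call are assumptions added for illustration; they are not part of the question or the answer.

```shell
#!/bin/sh
# Work in a scratch directory (assumption) so no existing files are overwritten.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample from the question.
cat > Input_file <<'EOF'
01_ABC_0000 AA
02_CDE_0000 BB
03_EFG_0000 CC
04_ABC_0001 DD
05_CDE_0001 EE
06_EFG_0001 FF
EOF

# Show the split: with -F"_| +" each line has 4 fields, e.g. 01 | ABC | 0000 | AA.
awk -F'_| +' '{ print NF ": " $1 " | " $2 " | " $3 " | " $4 }' Input_file

# Open a new output file on the first line and whenever field 3 changes.
awk -F'_| +' '
NR==1 || prev!=$3 {
  if (outputFile != "") close(outputFile)
  outputFile = "file" (count++) ".txt"
  prev = $3
}
{ print > outputFile }
' Input_file

# List what was written.
for f in file*.txt; do
  echo "== $f"
  cat "$f"
done
```

On the sample this should leave exactly two files, file0.txt with the 0000 lines and file1.txt with the 0001 lines.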
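2nd solution: Using a sort + awk combination in case the entries are not sorted. The first awk prepends the 3rd field (the number) to each line, sort orders the lines by it, and the second awk strips it again before grouping.
awk -F'_| +' '{print $3,$0}' Input_file |
sort -k1,1n |
awk -v count="0" -F'_| +' '
{
  # Remove the prepended sort key; changing $0 re-splits the fields.
  sub(/^[^[:space:]]+[[:space:]]+/,"")
}
NR==1 || prev!=$3{
  if(outputFile!=""){ close(outputFile) }
  outputFile=("file" (count++) ".txt")
  prev=$3
}
{
  print > (outputFile)
}
'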
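If sorting is not wanted, one possible variation (not part of the solutions above) is a single pass that derives the file name from the number itself and appends each line to it. This is only a sketch: it assumes the number is the 3rd field under -F'_| +', names the file after the numeric value (0000 gives file0.txt, 0001 gives file1.txt, 0010 would give file10.txt), and because it appends, the files from a previous run should be removed first. The scratch directory is again an assumption for illustration.

```shell
#!/bin/sh
# Work in a scratch directory (assumption) so the appends start from empty files.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Unsorted variant of the sample from the question.
cat > Input_file <<'EOF'
04_ABC_0001 DD
01_ABC_0000 AA
05_CDE_0001 EE
02_CDE_0000 BB
EOF

# Append every line to the file named after the numeric value of field 3.
# Closing after each write keeps the number of open files at one.
awk -F'_| +' '
{
  out = "file" ($3 + 0) ".txt"
  print >> out
  close(out)
}
' Input_file

# List what was written.
for f in file*.txt; do
  echo "== $f"
  cat "$f"
done
```

Lines keep their original relative order within each output file, at the cost of reopening a file for every input line.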