
How to replace complex ID with number?

Tags:

awk

I have a file with multiple entries for each ID number. The file has about 2,000 IDs with 54,000 observations per ID. I need to feed the output into an algorithm that requires IDs to be shorter than 6 characters. How can I replace the IDs with just the numbers 1 to 2000? The IDs in the file look like this:

2007I804567
2007I804567
2007I804567
2007I804568
2007I804568
2007I804568
2007I804569
2007I804569
2007I804569

I need it to look like this (I want to keep the original ID):

1 2007I804567
1 2007I804567
1 2007I804567
2 2007I804568
2 2007I804568
2 2007I804568
3 2007I804569
3 2007I804569
3 2007I804569

Thanks

Justin Buchanan asked Oct 19 '25 04:10


2 Answers

$ cat file
2007I804567
2007I804567
2007I804567
2007I804568
2007I804568
2007I804568
2007I804569
2007I804569
2007I804569
$ 
$ awk '!seen[$0]++{++id} {print id, $0}' file
1 2007I804567
1 2007I804567
1 2007I804567
2 2007I804568
2 2007I804568
2 2007I804568
3 2007I804569
3 2007I804569
3 2007I804569
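
If each line carries other columns after the ID (the question mentions many observations per ID but does not show the layout), the same idea can be keyed on the first field instead of the whole line. This is only a sketch, assuming whitespace-separated columns with the ID in column 1 and the rows for each ID grouped together; the helper name `number_ids` and the sample data are made up for illustration:

```shell
# Hypothetical helper: prefix each line with a running number that
# increments whenever a value is seen in column 1 for the first time.
# Assumes rows belonging to the same ID are adjacent.
number_ids() {
    awk '!seen[$1]++ { ++id } { print id, $0 }' "$@"
}

# Made-up sample input: ID in column 1, one observation in column 2.
printf '%s\n' \
    '2007I804567 0.12' \
    '2007I804567 0.15' \
    '2007I804568 0.40' \
    '2007I804569 0.33' | number_ids
```

With a real file this would be called as `number_ids file`. If the columns are separated by something other than whitespace, awk's `-F` option would need to be added to the call inside the function.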
Ed Morton answered Oct 22 '25 05:10


Try the following awk command:

awk '!($0 in id) {id[$0]=++n} {print id[$0], $0}' file

Short Description

awk '
    !($0 in id) {             # if this line is not yet a key in the array id
        id[$0] = ++n          # assign it the next number, using the line itself as the key
    }
    {
        print id[$0], $0      # print the assigned number followed by the original line
    }' file                   # input file
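
Because the number for every ID is stored in the array, this version stays consistent even when the rows for an ID are not adjacent. A running counter that is only incremented on first sight, as in the `seen[$0]++` one-liner, would instead print whatever the counter currently holds when an earlier ID comes back. A small sketch comparing the two on made-up, deliberately interleaved input (the variable names are just for illustration):

```shell
# Made-up input in which ID "A" reappears after "B".
input=$(printf '%s\n' A B A)

# Lookup-table version: each ID keeps the number it was first given.
stable=$(printf '%s\n' "$input" |
    awk '!($0 in id) { id[$0] = ++n } { print id[$0], $0 }')

# Running-counter version: prints the current counter value,
# so the second "A" is labelled 2 instead of 1.
running=$(printf '%s\n' "$input" |
    awk '!seen[$0]++ { ++id } { print id, $0 }')

printf 'lookup table:\n%s\n\nrunning counter:\n%s\n' "$stable" "$running"
```

For input that is already grouped by ID, as in the question, both approaches give the same result.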
jkshah answered Oct 22 '25 05:10


