Compare two files & output differences (including Line Number and content) from both files

Tags: file, compare, awk

I am trying to print the lines that differ between two files, with line number and content, either to another file or to stdout. My attempt below does not produce the output I want.

File contents:

File1:

Col1,Col2,Col3
Text1,text1,text1
Text2,text2,Rubbish

File2:

Col1,Col2,Col3
Text1,text1,text1
Text2,text2,text2
Text3,text3,text3

I have tried the following command, which does not give the desired output: it shows only the differing line from file1 and not the extra line in file2.

sort file1 file2 | uniq | awk 'FNR==NR{ a[$1]; next } !($1 in a) {print FNR": "$0}' file2 file1

Output

3: Text2,text2,Rubbish

Desired Output

3: Text2,text2,Rubbish (File1)
3: Text2,text2,text2 (File2)
4: Text3,text3,text3 (File2)

I do NOT wish to use diff/sdiff/comm, because their output does not let me add line numbers or arrange the data side by side for ease of reading. The real files run to more than 1000 lines, so diff/sdiff output becomes even harder to read.

asked Dec 06 '25 06:12 by test1


1 Answer

With your shown samples, please try the following awk code. Written and tested in GNU awk.

awk '
BEGIN { OFS=": " }
FNR==1{ next     }
FNR==NR{
  arr[$0]=FNR
  next
}
!($0 in arr){
  print FNR,$0" ("FILENAME")"
  next
}
{
  arr1[$0]
}
END{
  for(key in arr){
    if(!(key in arr1)){
      print arr[key],key" ("ARGV[1]")"
    }
  }
}
' file1 file2
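
To see what the code above prints for the question's samples, here is a minimal sketch that recreates the two files and runs the program unchanged. The scratch directory (it assumes `mktemp -d` is available) and the variable names `dir` and `out` are only for illustration and are not part of the original answer.

```shell
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Col1,Col2,Col3' 'Text1,text1,text1' 'Text2,text2,Rubbish' > file1
printf '%s\n' 'Col1,Col2,Col3' 'Text1,text1,text1' 'Text2,text2,text2' 'Text3,text3,text3' > file2

# Run the program from the answer as-is and capture what it prints.
out=$(awk '
BEGIN { OFS=": " }
FNR==1{ next }
FNR==NR{ arr[$0]=FNR; next }
!($0 in arr){ print FNR,$0" ("FILENAME")"; next }
{ arr1[$0] }
END{
  for(key in arr){
    if(!(key in arr1)){ print arr[key],key" ("ARGV[1]")" }
  }
}
' file1 file2)

printf '%s\n' "$out"
```

With these samples it should print the two file2 lines first (lines 3 and 4) and then `3: Text2,text2,Rubbish (file1)`, because the lines that exist only in file1 are reported from the END block. The content matches the desired output, but the lines are grouped per file rather than interleaved by line number, and the file names appear exactly as given on the command line.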

Explanation: the same code with a detailed comment on each line.

awk '                                   ##Start of the awk program.
BEGIN { OFS=": " }                      ##Set the output field separator to colon plus space.
FNR==1{ next     }                      ##Skip the first (header) line of each file.
FNR==NR{                                ##True only while reading the first file (file1).
  arr[$0]=FNR                           ##Store each file1 line as an index of arr, with its line number as the value.
  next                                  ##Skip the remaining rules for this line.
}
!($0 in arr){                           ##file2 line that is NOT in arr, i.e. present in file2 but not in file1.
  print FNR,$0" ("FILENAME")"           ##Print line number, line and file name, as the OP requested.
  next                                  ##Skip the remaining rules for this line.
}
{
  arr1[$0]                              ##file2 line that also exists in file1: record it as an index of arr1.
}
END{                                    ##END block, run after both files have been read.
  for(key in arr){                      ##Traverse every line stored from file1.
    if(!(key in arr1)){                 ##If that line was never seen in file2.
      print arr[key],key" ("ARGV[1]")"  ##Print its line number, the line and the first file name: present in file1 but NOT in file2.
    }
  }
}
' file1 file2                           ##Input file names.
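
The code above compares lines by content, so its output is grouped per file. If the two files should instead be compared position by position (line 3 against line 3), which is what the desired output in the question suggests, a variant along the following lines may help. This is only a sketch and not the tested code above: the function name `linediff` is hypothetical, it keeps the whole first file in memory, it also compares the header line, and it assumes the first file is not empty (otherwise `FNR == NR` would also hold for the second file).

```shell
# Hypothetical helper: report lines that differ at the same line number.
linediff() {
  awk '
  FNR == NR {                 # first file: remember every line by its number
    a[FNR] = $0
    n = FNR                   # n ends up as the line count of the first file
    next
  }
  {
    if (FNR > n) {            # second file has extra lines
      print FNR ": " $0 " (" FILENAME ")"
    } else if (($0 "") != a[FNR]) {   # appending "" forces a string comparison
      print FNR ": " a[FNR] " (" ARGV[1] ")"
      print FNR ": " $0 " (" FILENAME ")"
    }
    m = FNR                   # m ends up as the line count of the second file
  }
  END {
    for (i = m + 1; i <= n; i++)      # first file has extra lines
      print i ": " a[i] " (" ARGV[1] ")"
  }
  ' "$1" "$2"
}
```

Called as `linediff file1 file2` on the question's samples, this should print line 3 from file1, then line 3 from file2, then line 4 from file2, in that order. Redirect the output (for example `linediff file1 file2 > differences.txt`) to write it to another file.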

answered Dec 09 '25 04:12 by RavinderSingh13