
Python Diff Two Multiline Strings Like GitHub

I want to produce diff output like GitHub's commit diff view. I tried this:

import difflib

first = """
def
baz
"""

second = """
deff
ba
bar
foo
"""

diff = ''
for text in difflib.unified_diff(first, second):
    for prefix in ('---', '+++', '@@'):
        if text.startswith(prefix):
            break
    else:
        diff += text

The output is:

 d e f+f 
 b a-z 
+b+a+r+
+f+o+o+

How can I get output like this instead?

1 def+f
2 ba-z
+
3 bar
4 foo
# -
# 5 line
# 6 line

Thanks.


1 Answer

I'm not quite sure which format you mean by GitHub's; I haven't seen character-by-character diffs there like your example. If you want a more standard line-by-line output, I think you just have to pass lists of lines to the diff function:

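# Pass lists of lines (not raw strings) so difflib compares line by line.
for text in difflib.unified_diff(first.split("\n"), second.split("\n")):
    # Skip the file headers and hunk markers.
    if text[:3] not in ('+++', '---', '@@ '):
        print(text)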

As every line is different in your example, diff is just going to see each line as having been totally changed and give you an output like:

-def
-baz
+deff
+ba
+bar
+foo
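If you want some indication of which characters changed inside a line, `difflib.ndiff` may get you part of the way. The sketch below is only an illustration: the helper name `ndiff_lines` and the shortened sample strings are my own, and the `? ` guide lines only appear when difflib considers two lines similar enough to pair up.

```python
import difflib

# Sample inputs modelled on the question (leading blank line dropped).
first = "def\nbaz\n"
second = "deff\nba\nbar\nfoo\n"

def ndiff_lines(old, new):
    """Line-by-line diff using difflib.ndiff.

    Lines start with '  ' (unchanged), '- ' (removed), '+ ' (added),
    or '? ' (a guide line pointing at changed characters).
    """
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    # Guide lines carry their own trailing newline, so strip it for printing.
    return [line.rstrip("\n") for line in diff]

result = ndiff_lines(first, second)
for line in result:
    print(line)
```

This still reports changes per line rather than merging them into one annotated line, so it is closer to a classic diff than to the format in the question.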

If you want something fancier, you can treat the data as a single string (as you did) and then split the result on newlines. When given strings, difflib compares them character by character, so each item returned has the form "{operation}{char}" (newline characters included). You can therefore group the items into lines, detect lines whose characters all share the same operation, and apply the appropriate logic.
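A rough sketch of that grouping idea follows. It uses `difflib.SequenceMatcher` on the raw strings instead of parsing `unified_diff` output, which is my substitution, and the function name `char_level_lines` and the `+c` / `-c` rendering are assumptions chosen to resemble the question's desired output, not an established format.

```python
import difflib

def char_level_lines(old, new):
    """Compare two strings character by character and regroup into lines.

    Unchanged characters are emitted as-is, added ones as '+c',
    removed ones as '-c'. Any newline (kept, added or removed)
    ends the current output line.
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)

    # Flatten the opcodes into (operation, character) pairs.
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.extend((" ", ch) for ch in old[i1:i2])
        else:
            # 'replace' is a removal plus an addition; for a pure
            # 'delete' or 'insert' one of the two slices is empty.
            ops.extend(("-", ch) for ch in old[i1:i2])
            ops.extend(("+", ch) for ch in new[j1:j2])

    # Regroup the pairs into lines, splitting on newline characters.
    lines, current = [], []
    for op, ch in ops:
        if ch == "\n":
            lines.append("".join(current))
            current = []
        else:
            current.append(ch if op == " " else op + ch)
    if current:
        lines.append("".join(current))
    return lines

for line in char_level_lines("abc\n", "abd\n"):
    print(line)
```

Because a removed newline also ends a line here, lines that were merged or split can come out oddly; you would need to adjust that rule depending on how you want such cases displayed.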

I can't quite work out the rules you're trying to apply based on your example (are you grouping all mixed lines, then added lines then removed lines or something else?), so I can't give you an exact example.

Jon Betts answered May 20 '26 10:05
