I would like to implement a type of version history on my website and I need a way of comparing strings or object keys. For example:
Original string / object key: The quicker brown fox
Revised string / object key: The quick brown fox jumped over the lazy rabbit
Revision: added jumped over the lazy rabbit removed er
I would like to save only the revision in my history table. I don't really know where to start, so any ideas how to get me going or advice on the approach would be really appreciated.
I am aware of the find() function and I suspect it's a prime candidate for use, but I don't know how to visualize it as a solution since it compares strings "wholesale" so to speak.
You want a diffing algorithm (I've tagged the question as such), which I highly recommend you not try to write yourself. I've tried - and failed - as it's a NP complete problem and not easy to wrap your mind around. Instead, check out diff-match-patch, which has a JavaScript and Java implementation for client (demo) or server side processing. If you need to do HTML differencing look at daisydiff instead, albeit be forewarned HTML/XML diffing is truly a painful experience (see this page for some reasons why).
Probably the grand-daddy of diffing is GNU diff, which also has a Java implementation (find "GNU Diff for Java"). This algorithm is more optimized than diff-match-patch (dmp), albeit dmp seems to be improving all the time, so if you need to compare very large strings (e.g. megabytes) the GNU algorithm is probably a better bet.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With