
Algorithm to compare people's names and measure how similar they are

I am working on an address book synchronization algorithm. I would like to reuse existing code if there is any, but I haven't found any yet.

Does anyone know of an algorithm that returns a number (a float or a percentage) indicating how similar two names are? Levenshtein distance is not a good fit here, because the names in our address books match on the beginning of each part of the name.

John Smith should match
Smith Jon, Jonathan Smith, Johnny Smith

Pentium10 asked Sep 20 '25 03:09


1 Answer

Have a look at the Jaro-Winkler algorithm too. It works well for names because it gives extra weight to strings that share a common prefix. http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
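To give a feel for what Jaro-Winkler computes, here is a minimal pure-Python sketch. The function names `jaro` and `jaro_winkler` are my own, and the prefix weight `p=0.1` with a 4-character prefix cap are the commonly cited defaults, assumed here rather than tuned for address books. A maintained library would be a better choice in production.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2  # transpositions

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the shared prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


if __name__ == "__main__":
    for other in ("jon", "jonathan", "johnny", "smith"):
        print(other, round(jaro_winkler("john", other), 3))
```

You would probably want to lowercase the names first and compare them part by part (first name against first name, and so on), since the prefix bonus only looks at the start of each string.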

If the problem is first name/last name ordering, you could simply sort the parts of each name before comparing, so that Smith John is stored as John Smith.
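The sorting idea could look something like this. It is only a sketch: `normalize_name` is a hypothetical helper, and lowercasing and treating commas as separators are my assumptions about what the address book data looks like.

```python
def normalize_name(name: str) -> str:
    """Build an order-independent key from the parts of a name."""
    # Lowercase, treat commas as separators, split on whitespace.
    parts = name.lower().replace(",", " ").split()
    # Sort alphabetically so "Smith John" and "John Smith" give the same key.
    return " ".join(sorted(parts))


if __name__ == "__main__":
    print(normalize_name("Smith, John"))
    print(normalize_name("John Smith") == normalize_name("Smith John"))
```

Note that the sorted key is meant for comparison only. The order is alphabetical, so it will not always match the usual first name/last name display order.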

Andrew White answered Sep 22 '25 17:09