
Best Fuzzy Matching Algorithm? [closed]

Tags:

fuzzy-search

What is the best fuzzy matching algorithm (fuzzy logic, n-gram, Levenshtein, Soundex, ...) for processing more than 100,000 records in the least time?

Dhanapal asked Jan 29 '09 10:01


People also ask

Which is the best pattern matching algorithm?

The Karp-Rabin Algorithm.

What is the FuzzyWuzzy algorithm?

FuzzyWuzzy is a Python library for fuzzy string matching, that is, finding strings that approximately match a given pattern. It uses Levenshtein distance to measure the differences between sequences.
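To make the Levenshtein distance mentioned above concrete, here is a minimal pure-Python sketch of the classic dynamic-programming calculation. It is only an illustration of the metric, not FuzzyWuzzy's actual implementation, and the function name `levenshtein` is an assumption chosen for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Each comparison costs time proportional to the product of the two string lengths, so comparing every record against every other record grows quadratically with the number of records.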

Is FuzzyWuzzy slow?

FuzzyWuzzy is a Levenshtein-distance-based package that is widely used to compute string similarity scores. Why not use it? The short answer is that it can be far too slow on large datasets: the estimated time to compute similarity scores for a 406,000-entity dataset of addresses is 337 hours.

What is an example of fuzzy matching?

Fuzzy matching (also called approximate string matching) is a technique for identifying two pieces of text, strings, or entries that are approximately similar but not exactly the same. For example, consider how Expedia and Priceline each list hotels in New York.
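As a rough illustration of matching approximately similar listings, the sketch below uses Python's standard-library `difflib.get_close_matches`. The hotel names, the helper name `best_match`, and the 0.6 cutoff are hypothetical values chosen for this example, not data from either site.

```python
import difflib
from typing import Optional, Sequence


def best_match(query: str, candidates: Sequence[str],
               cutoff: float = 0.6) -> Optional[str]:
    """Return the candidate most similar to query, or None if no
    candidate reaches the similarity cutoff (0.0 to 1.0)."""
    # get_close_matches ranks candidates by SequenceMatcher ratio,
    # best first, and drops those scoring below the cutoff.
    matches = difflib.get_close_matches(query, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Hypothetical listings from a second site.
    listings = ["Hilton Garden Inn New York", "Marriott Marquis"]
    print(best_match("Hilton Garden Inn NYC", listings))
```

Note that `difflib` scores similarity from matching subsequences rather than Levenshtein distance, so its scores will differ from those of a Levenshtein-based library.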


1 Answer

I suggest you read the articles by Navarro listed in the References section of the Wikipedia article "Approximate string matching". Basing your decision on actual research is always better than relying on suggestions from random strangers, especially if performance on a known set of records is important to you.

Tim answered Oct 10 '22 18:10