I have a list of unstructured postal address strings and structured postal address strings. What should I use to compare these?
Example addresses:
Unstructured: john appartments 7 koramangala bangalore india 560066
Structured: 7, john appartments, koramangala, bangalore-560066, india
If you only need to estimate how likely it is that two of these strings refer to the same address, look into the string-similarity techniques discussed in "Finding groups of similar strings in a large set of strings".
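As a rough illustration of the similarity route, here is a minimal sketch that ignores word order and punctuation, since those are the main differences between the two example addresses. It is not the method from the linked question; the helper names (`tokens`, `jaccard`, `sorted_ratio`) are made up for this example, and only the Python standard library is used.

```python
import re
from difflib import SequenceMatcher

def tokens(address):
    # Lowercase and keep only runs of letters/digits, so that
    # "bangalore-560066," yields "bangalore" and "560066".
    return re.findall(r"[a-z0-9]+", address.lower())

def jaccard(a, b):
    # Share of distinct tokens the two addresses have in common.
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def sorted_ratio(a, b):
    # Character-level similarity after sorting tokens, which tolerates
    # reordering and, to some extent, small spelling differences.
    ja = " ".join(sorted(tokens(a)))
    jb = " ".join(sorted(tokens(b)))
    return SequenceMatcher(None, ja, jb).ratio()

unstructured = "john appartments 7 koramangala bangalore india 560066"
structured = "7, john appartments, koramangala, bangalore-560066, india"

print(jaccard(unstructured, structured))
print(sorted_ratio(unstructured, structured))
```

Both scores fall between 0 and 1. Where to put the "same address" threshold is something you would have to tune on your own data.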
Another approach: if you have access to maps or dictionaries, you can "structurize" each unstructured address (by finding the name of the country, the postal code, the street name, etc.) and then compare the result field by field with the structured address.
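A minimal sketch of the "structurize, then compare" idea could look like the following. Everything here is an assumption for illustration: the tiny `COUNTRIES` and `CITIES` sets stand in for a real map or dictionary source, the postal code pattern assumes six-digit Indian PIN codes, and `structurize` is a hypothetical helper, not a library function.

```python
import re

# Hypothetical lookup tables; a real system would load these from a
# map service or a proper place-name dictionary.
COUNTRIES = {"india"}
CITIES = {"bangalore", "mumbai", "delhi"}
PIN_CODE = re.compile(r"^\d{6}$")  # assumes six-digit Indian PIN codes

def structurize(address):
    # Assign each token to the first field it matches; anything that
    # is not recognised is kept in an order-independent "rest" set.
    fields = {"country": None, "city": None, "postal_code": None}
    rest = []
    for tok in re.findall(r"[a-z0-9]+", address.lower()):
        if fields["postal_code"] is None and PIN_CODE.match(tok):
            fields["postal_code"] = tok
        elif fields["country"] is None and tok in COUNTRIES:
            fields["country"] = tok
        elif fields["city"] is None and tok in CITIES:
            fields["city"] = tok
        else:
            rest.append(tok)
    fields["rest"] = frozenset(rest)
    return fields

unstructured = "john appartments 7 koramangala bangalore india 560066"
structured = "7, john appartments, koramangala, bangalore-560066, india"

a = structurize(unstructured)
b = structurize(structured)
print(a["postal_code"], a["city"], a["country"])
print(a == b)
```

Once both addresses are in the same field layout, you can compare strictly on reliable fields such as the postal code and more loosely on the remaining tokens.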
Good luck