I'm trying to search through phone numbers for any phone number containing some series of digits.
Obviously the following is going to be slow:
Select * from customer where phone like '%1234%'
I need the wildcards, because the users were allowed to enter any data in the database, so it might have country codes, a leading 1 (as in 1-800) or a trailing extension (which was sometimes just separated by a space.
Note: I've already created 'cleaned' phone numbers by removing all non-digit characters, so I don't have to worry about dashes, spaces, and the like.
Is there any magic to get a search like this to run in reasonable time?
If you're using MySQL, you're looking for the Full Text Search functionality http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html
It specifically optimizes queries like the one you have listed, and is pretty darn fast once set up. You'll need your data in MySQL, and it must be in a MyISAM table (not InnoDB or other.)
I've used it in production and it works pretty well.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With