 

What's the fastest way to check if a username is available with a huge dataset?

Tags:

php

mysql

redis

I'm looking for the fastest/most efficient way to check whether a given username is available in a set of tens of millions of usernames. At the moment I'm using a normal MySQL SELECT query that runs on every key press, but I'm not happy with the performance. I'm using indexing, partitioning, etc., and I know that MySQL can be optimized to be very fast, but I also know that there are better solutions.

So what's the fastest username search:

  • Redis EXISTS command
  • Elasticsearch
  • Something else

For example, how does Gmail search across billions of email addresses during registration? How does Facebook do it? I assume they don't just run an SQL query.

I'm looking for a practical solution for a PHP app.

Right now I'm just using a very basic select:

SELECT username FROM users WHERE username = ? LIMIT 1

The username column has a unique index on it.
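For illustration, here is a minimal sketch of that lookup issued with a bound parameter instead of an interpolated value. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs on its own; the in-memory table, the sample rows, and the helper name `username_taken` are assumptions made for the example, not part of the original setup. In a PHP app the same idea would be a prepared statement.

```python
import sqlite3


def username_taken(conn: sqlite3.Connection, username: str) -> bool:
    """Return True if the username already exists (hypothetical helper)."""
    # The value is passed as a bound parameter, so the driver handles quoting.
    # The UNIQUE constraint below creates the index that serves this lookup.
    row = conn.execute(
        "SELECT 1 FROM users WHERE username = ? LIMIT 1", (username,)
    ).fetchone()
    return row is not None


# Stand-in database: an in-memory SQLite table with a unique username column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT NOT NULL UNIQUE)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
)
```

Calling `username_taken(conn, "alice")` checks a single indexed row; a value containing quotes is treated as plain data rather than as SQL.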

tomschmidt asked Nov 15 '25 22:11


1 Answer

I agree you should try and stick it all in RAM (e.g. Redis).

But if you don't want to go the whole way, I do the following: store the list somewhere slow (e.g. S3 or a SQL database). Next, make a Bloom filter from that list (there's material on Wikipedia about Bloom filters, and there's a nifty Redis module that you can use: https://oss.redislabs.com/redisbloom).

Now, a Bloom filter (BF) will never give you a false negative, so if it says a username is not in the set, that username is definitely available, and you can check this efficiently. Sometimes, however, the BF will report an available username as unavailable (a false positive), and you have to decide if you can live with that.
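To make that behaviour concrete, here is a minimal in-process sketch of a Bloom filter in Python. It is not the RedisBloom API; the class, the sizing defaults, and the `is_available` helper are all assumptions for illustration. The `taken` set stands in for the slow authoritative store, and consulting it on a "maybe taken" answer is one optional way to deal with false positives if you cannot live with them.

```python
import hashlib
import math


class BloomFilter:
    """Tiny illustrative Bloom filter (hypothetical, not RedisBloom)."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        ln2 = math.log(2)
        self.size = max(
            1, math.ceil(-expected_items * math.log(false_positive_rate) / ln2**2)
        )
        self.num_hashes = max(1, round(self.size / expected_items * ln2))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def is_available(username: str, bloom: BloomFilter, taken: set) -> bool:
    """`taken` is a stand-in for the slow store (SQL, S3, ...)."""
    if not bloom.might_contain(username):
        return True  # no false negatives: the name is definitely free
    return username not in taken  # "maybe taken": confirm against the real store


# Build the filter once from the full list of existing usernames.
taken = {"alice", "bob", "carol"}
bloom = BloomFilter(expected_items=1000)
for name in taken:
    bloom.add(name)
```

With this shape, most lookups for free usernames never touch the slow store; only names the filter flags as possibly taken pay for the second check.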

Itamar Haber answered Nov 18 '25 11:11


