 

"Unique 7 character string" - related issue

Tags: string, php, unique

I read several topics similar to what I'm asking, but none of them seemed to be very helpful to me.

I have a form where users can generate codes that are stored in a column with a Unique constraint. The codes are strings with length of 7 characters. The users can enter a number and the program generates that many codes, and this can be repeated until the maximum number of codes is reached.

My problem is with duplicate values, but not with values that are already present in the database when the new entries are inserted (I check for those successfully). Rather, some of the entries within a new group of (say 10000) codes are (probably) identical to each other. So my code generates two (or more) identical codes in the same transaction, and the Unique constraint in the DB complains about it.

I thought of checking the database after each entry, but it is extremely time consuming, considering we're talking about 10000 or sometimes more entries.

So now I think the only option is to modify the code that generates them in the first place, because it seems to be inefficient and to generate duplicates.

A big part of the problem is the required length of the codes; otherwise I would go with plain 'uniqid()' or something similar, but since I have to restrict it to 7 characters, I guess that makes it a lot worse. Also, I have to exclude some characters from the codes (listed in 'problem_characters' in the code below).

Here's the code; I couldn't modify it properly to generate only unique values.

$problem_characters = array("0", "o", "O", "I", "1", 1);
$code = md5(uniqid(rand(), true));
$extId = strtoupper(str_replace($problem_characters, rand(2,9), substr($code, 0, 7)));
//insert $extId in the database

@Geo Ok, I tried your solution and it was working (of course), but then I got a new problem - in the 'else' part of your 'if' I'm doing the following:

$extId = strtoupper(str_replace($problem_characters, rand(2,9), substr($code, 0, 7)));

while (true) {
    if ((!in_array($extId, $allExternalIdsHandled)) && (!in_array($extId, $newEnteredValues))) {
        break;
    } else {
        $extId = strtoupper(str_replace($problem_characters, rand(2,9), substr($code, 0, 7)));
    }
}
//insert the modified value in the DB here

So now it's entering an endless loop and never reaches the 'break', even though $extId ought to be changed by the 'random' call on each pass, which should then satisfy the 'if' and break out...

I do not see the problem here. Can someone give me some direction, please?

EDIT: It sometimes hangs, sometimes does not. I just entered 10000 values and got two entries modified via the 'else' path. I observed this using logs.

thebloodycoon asked Jan 25 '26 09:01


1 Answer

There are already libraries doing the hard work for you, allowing you to select the "alphabet" to use when generating the string and the length of the string.

Your "identical entries" problem is called a collision, and with short random codes it can't be ruled out entirely; you can only detect collisions and generate replacements.
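To get a rough feel for how often collisions should be expected, here is a minimal sketch of the usual birthday-style estimate. The function names are hypothetical, and the keyspace is an assumption for illustration: the question's code takes 7 upper-cased hex characters from md5() and replaces "0" and "1" with a digit from 2 to 9, which leaves 14 usable symbols (2-9 and A-F). The estimate assumes uniformly distributed codes; the question's codes are not uniform, so the real collision rate should be at least this high.

```php
<?php

// Expected number of colliding pairs when drawing $n codes uniformly
// at random from $keyspace possible codes.
function expectedCollidingPairs(int $n, float $keyspace): float
{
    return $n * ($n - 1) / (2 * $keyspace);
}

// Approximate probability of at least one collision in the batch,
// using the common approximation 1 - exp(-expected pairs).
function collisionProbability(int $n, float $keyspace): float
{
    return 1 - exp(-expectedCollidingPairs($n, $keyspace));
}

// Assumption for illustration: 14 usable symbols, 7 characters per code.
$keyspace = pow(14, 7); // 105,413,504 possible codes

printf("expected colliding pairs: %.3f\n", expectedCollidingPairs(10000, $keyspace));
printf("P(at least one collision): %.3f\n", collisionProbability(10000, $keyspace));
```

Even for a single batch of 10000 codes the chance of an in-batch duplicate is far from negligible with such a small keyspace, which is why detecting duplicates and retrying is needed regardless of the generator.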

Edit: So, similar to what was suggested by Geo, I'm using PHP to create a list of n unique entries. The difference is that the SQL insert might fail, so I'm using 2 layers of iteration to make sure that we reach the total number desired:

<?php

require('hashids.php'); // I'm using the library I suggested

$hashids = new hashids('some salt', 7); // use the default alphabet, feel free to pass the 3rd parameter with the alphabet you want to use

$generationTries = 0;

$hashesInDBCount = 0; // get from your database
$desiredHashesCount = 50; // use a parameter
$totalDesiredHashes = $hashesInDBCount + $desiredHashesCount;
do
{
    // when coming back in the loop, only generate what's still required
    $desiredHashesCount = $totalDesiredHashes - $hashesInDBCount; 
    $generatedHashesCount = 0;
    $generatedHashes = array();

    while($generatedHashesCount < $desiredHashesCount)
    {
        $hash = $hashids->encrypt($generationTries++);
        if(!in_array($hash, $generatedHashes))
        {
            array_push($generatedHashes, $hash);
            ++$generatedHashesCount;
        }
    }

    // insert $generatedHashes in your Database

    $hashesInDBCount = 50; // again, query your database as you might come through this loop more than once, 
                           // I'm hardcoding the value to have a working example
}
while($hashesInDBCount < $totalDesiredHashes);

echo "Generated " . count($generatedHashes) . " hashes in " . $generationTries . " tries\n";
var_dump($generatedHashes);

Which gives me an output like:

Generated 50 hashes in 50 tries
array(50) {
  [0]=>
  string(7) "eAcgAcx"
  [1]=>
  string(7) "Exidai8"
  [2]=>
  string(7) "ExTbqT8"
  [3]=>
  string(7) "4Acz8cB"
  [4]=>
  string(7) "LRipxir"
  [5]=>
  string(7) "zATe5Tx"
  ...
}

Using a random salt will give you a different set of values every time.
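If pulling in a library is not an option, the same detect-and-retry idea can be sketched in plain PHP. This is not the hashids approach above but a swapped-in alternative, assuming PHP 7+ for random_int(); the helper names randomCode and generateBatch and the 32-character alphabet are hypothetical choices for illustration. Two points matter here. First, every retry draws a completely new code. The loop in the question re-derives $extId from the same $code, so whenever its first 7 characters contain no problem character, the value never changes and the loop cannot exit. Second, codes are stored as array keys, so each duplicate check is a single isset() lookup instead of an in_array() scan.

```php
<?php

// Hypothetical helper: builds one random code from an alphabet that
// already excludes the ambiguous characters (0, O, 1, I).
function randomCode(string $alphabet, int $length = 7): string
{
    $code = '';
    $max = strlen($alphabet) - 1;
    for ($i = 0; $i < $length; $i++) {
        // Fresh randomness for every character of every attempt.
        $code .= $alphabet[random_int(0, $max)];
    }
    return $code;
}

// Hypothetical helper: returns $count codes that are unique within the
// batch and absent from $existing (known codes stored as array keys).
function generateBatch(int $count, array $existing, string $alphabet): array
{
    $batch = [];
    while (count($batch) < $count) {
        $code = randomCode($alphabet);
        if (isset($existing[$code]) || isset($batch[$code])) {
            continue; // collision: draw a brand-new code
        }
        $batch[$code] = true;
    }
    // PHP casts all-digit string keys to integers, so convert back.
    return array_map('strval', array_keys($batch));
}

// Assumed alphabet: digits 2-9 plus upper-case letters without I and O.
$alphabet = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ';
$codes = generateBatch(10, [], $alphabet);
print_r($codes);
```

The Unique constraint in the database should stay in place as the final safeguard, since another request could insert the same code between the check and the insert.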

emartel answered Jan 27 '26 22:01


