A couple of days ago, in an interview I was asked for a program that would find the single duplicate element in a consecutive integer array in O(log n) time.
The case is somewhat specific, as there are a total of 11 integers (1 to 10, in that order) plus, a single duplicate of any of these numbers, inserted somewhere in between.
I was given a sample similar to this:
{1, 2, 3, 4, 5, 6, 2, 7, 8, 9, 10}
So, now I've come up with the following C code:
#include <stdio.h>
int main (void)
{
int a[11] = {1, 2, 3, 4, 5, 6, 2, 7, 8, 9, 10}, first=0, last=10, mid;
while (1)
{
mid = (first + last)/2;
if (mid+1 == a[mid])
first = mid+1;
else if ((mid == 0) || (mid == 10) || (a[mid+1] - a[mid-1] == 1)) /* Order is Important, (a[mid+1] - a[mid-1] == 1) must be last */
break;
else
last = mid-1;
}
printf("Duplicate element is %d, located at position no. %d.", a[mid], mid+1);
return 0;
}
Would this properly satisfy the O(log n) criteria? And, are there any alternatives/improvements over this?
Yes, it has O(log n) time complexity.
It is possible to make the code more clear using the following fact: you need to find the smallest i such that a[i] != i + 1, so it can be implemented in a more concise way:
//invariant: the [0...low] prefix does not contain a duplicate
// the [0...high] prefix contains a duplicate
low = 0 //[0...0] prefix obviously does not contain a duplicate
high = 10 //the entire array obviously contains a duplicate
while high - low > 1:
mid = (low + high) / 2
if a[mid] != mid + 1:
high = mid
else:
low = mid
print(a[high], high)
We can modify the binary search algorithm to get the solution. In binary search, we have a key and we used this key to find its position by bisecting the array size. Here, we don't have a key, instead we have to find it. But the behavior of duplicate element can be used to bisect the array size. How ? lets see:
On carefully looking the data, we can easily see that after inserting the duplicate element at random position (say index) in consecutive element array the property of elements will change (a[i] == i+1 --> a[i] != i+1) from the position index (including the index). Now this changed property can be used to bisect the array size. Hence, we can find out the duplicate in O(log(n)).
For example, Consider your given array: {1, 2, 3, 4, 5, 6, 2, 7, 8, 9, 10}
{1, 2, 3, 4, 5, 6, 2, 7, 8, 9, 10}
||
from this position the the property of (a[i] == i+1) will no more satisfied.
This property can be model to bisect the array size in solution.
void binary_duplictae_finder(int a[], int low, int high) {
int mid=(low+high)/2;
if(high - low > 1){
if(a[mid]!=mid+1)
binary_duplictae_finder(a, low, mid);
else
binary_duplictae_finder(a, mid, high);
}
if(high==low+1)
printf("%d ", a[high]);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With