Given an array of N numbers (not necessarily sorted). We can merge any two numbers into one and the cost of merging the two numbers is equal to the sum of the two values. The task is to find the total minimum cost of merging all the numbers.
Example:
Let the array A = [1,2,3,4]Then, we can remove 1 and 2, add both of them and keep the sum back in array. Cost of this step would be (1+2) = 3.
Now, A = [3,3,4], Cost = 3
In second step, we can 3 and 3, add both of them and keep the sum back in array. Cost of this step would be (3+3) = 6.
Now, A = [4,6], Cost = 6
In third step, we can remove both elements from the array and keep the sum back in array again. Cost of this step would be (4+6) = 6.
Now, A = [10], Cost = 10
So, total cost turns out to be 19 (10+6+3).
We will have to pick the 2 smallest elements to minimize our total cost. A simple way to do this is using a min heap structure. We will be able to get the minimum element in O(1) and insertion will be O(log n).
The time complexity of this approach is O(n log n).
But I tried another approach, and wasn't able to find the cases where it fails. The basic idea was that the sum of two smallest elements that we will choose at any time will always be greater than the sum of the pair of elements chosen before. So the "temp" array will always be sorted, and we will be able to access the minimum elements in O(1).
As I am sorting the input array and then simply traversing the array, the complexity of my approach is O(n log n).
int minCost(vector<int>& arr) {
sort(arr.begin(), arr.end());
// temp array will contain the sum of all the pairs of minimum elements
vector<int> temp;
// index for arr
int i = 0;
// index for temp
int j = 0;
int cost = 0;
// while we have more than 1 element combined in both the input and temp array
while(arr.size() - i + temp.size() - j > 1) {
int num1, num2;
// selecting num1 (minimum element)
if(i < arr.size() && j < temp.size()) {
if(arr[i] <= temp[j])
num1 = arr[i++];
else
num1 = temp[j++];
}
else if(i < arr.size())
num1 = arr[i++];
else if(j < temp.size())
num1 = temp[j++];
// selecting num2 (second minimum element)
if(i < arr.size() && j < temp.size()) {
if(arr[i] <= temp[j])
num2 = arr[i++];
else
num2 = temp[j++];
}
else if(i < arr.size())
num2 = arr[i++];
else if(j < temp.size())
num2 = temp[j++];
// appending the sum of the minimum elements in the temp array
int sum = num1 + num2;
temp.push_back(sum);
cost += sum;
}
return cost;
}
Is this approach correct? If not, please let me know what I am missing, and the test cases in which this algorithm fails.
SPOJ Link for the same problem
The logic seems very solid to me... all the computed sums will never be decreasing and therefore you only need to add up either oldest two computed sums, next two elements or oldest sum and next element.
I would just simplify the code:
#include <vector>
#include <algorithm>
#include <stdio.h>
int hsum(std::vector<int> arr) {
int ni = arr.size(), nj = 0, i = 0, j = 0, res = 0;
std::sort(arr.begin(), arr.end());
std::vector<int> temp;
auto get = [&]()->int {
if (j == nj || (i < ni && arr[i] < temp[j])) return arr[i++];
return temp[j++];
};
while ((ni-i)+(nj-j)>1) {
int a = get(), b = get();
res += a+b;
temp.push_back(a + b); nj++;
}
return res;
}
int main() {
fprintf(stderr, "%i\n", hsum(std::vector<int>{1,4,2,3}));
return 0;
}
Very nice idea!
Another improvement is noting that the cumulative length of the two arrays being processed (the original one and the temporary one holding the sums) will decrease at every step. Since the first step will use two input elements, the fact that the temporary array grows one element at each step will still not be enough for a "walking queue" allocated in the array itself to reach the reading pointer. This means that there is no need of a temporary array and the space for the sums can be found in the array itself...
int hsum(std::vector<int> arr) {
int ni = arr.size(), nj = 0, i = 0, j = 0, res = 0;
std::sort(arr.begin(), arr.end());
auto get = [&]()->int {
if (j == nj || (i < ni && arr[i] < arr[j])) return arr[i++];
return arr[j++];
};
while ((ni-i)+(nj-j)>1) {
int a = get(), b = get();
res += a+b;
arr[nj++] = a + b;
}
return res;
}
About the error on SPOJ... I tried briefly to search for the problem but I didn't succeed. I tried however generating random arrays of random lengths and checking this solution with what finds a "brute-force" one implemented directly from the specs and I'm reasonably confident that the algorithm is correct.
I know at least one programming arena (Topcoder) where sometimes the problems are carefully crafted so that the computation gives correct results if using unsigned but not if using int (or if using unsigned long long but not if using long long) because of integer overflow.
I don't know if SPOJ also does this kind of nonsense(1)... may be that is the reason some hidden test case fails...
Checking with SPOJ the algorithm passes if using long long values... this is the entry I used:
#include <stdio.h>
#include <algorithm>
#include <vector>
int main(int argc, const char *argv[]) {
int n;
scanf("%i", &n);
for (int testcase=0; testcase<n; testcase++) {
int sz; scanf("%i", &sz);
std::vector<long long> arr(sz);
for (int i=0; i<sz; i++) scanf("%lli", &arr[i]);
int ni = arr.size(), nj = 0, i = 0, j = 0;
long long res = 0;
std::sort(arr.begin(), arr.end());
auto get = [&]() -> long long {
if (j == nj || (i < ni && arr[i] < arr[j])) return arr[i++];
return arr[j++];
};
while ((ni-i)+(nj-j)>1) {
long long a = get(), b = get();
res += a+b;
arr[nj++] = a + b;
}
printf("%lli\n", res);
}
return 0;
}
PS: This very kind of computation is also what is needed to build an Huffman tree for entropy coding given the symbols frequency table and thus it's not a mere random exercise but it has practical applications.
(1) I'm saying "nonsense" because in Topcoder they never give problems that require 65 bits; thus it's not a genuine care about overflows, but just setting traps for novices. Another that I think is a bad practice I saw on TC is that some problems are carefully designed so that the correct algorithm if using C++ will barely fit in the timeout limit: just use another language (and get e.g. a 2× slowdown) and you cannot solve the problem.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With