I have a 1-indexed array of positive integers, and I want to make several queries to it, all in the form, 'what is the sum of the largest x integers in the subarray 1 to y inclusive?' This array is not sorted and has no particular order.
x and y are not constant and can change between queries. Note: Input to all of these is guaranteed and I don't need to check for input. Time limit is 1 second (about 100 million operations is fine) Note 2: Queries can be offline.
Right now, I'm able to solve this in O(N log N) per query or O(QNlogN) overall just by sorting the whole array while keeping the indices, then taking the first x integers that have indices <= y. However, since I have a lot of queries, this is unfortunately too slow. (Bounds are up to 100,000 for x, y, N and Q, where N is the number of elements in the subarray and Q is the number of queries.)
Code for the too-slow priority_queue solution:
int n; scanf("%d", &n);
priority_queue<pair<int, int>> pq;
for (int i = 1; i <= n; i++) { // scan in the array and init priority queue
int a; scanf("%d", &a);
pq.push({a, i});
}
for (int i = 0; i < q; i++) {
int x, y; scanf("%d %d", &x, &y);
priority_queue<pair<int, int>> temp = pq;
int c = 0, ans = 0;
while (c < x) {
// if the element is not in the desired subarray, remove it
while (temp.top().second > y) temp.pop();
// element has been found
c++;
ans += temp.top().first*2;
temp.pop();
}
printf("%d\n", ans);
}
Is there an efficient way to do this, possibly in O(Q) or O(Q log x) time?
Googling has turned up no answer. I've searched SO (and CS too), and have found nothing similar. Everything I have found has either been on maximum contiguous subarray sum, or maximum sum in a sorted array. I have also looked in an algorithmics book and found nothing.
Partial-sort (see here) does not work, as it is too slow (using it would be O(QNlogN) in worst-case.
We can solve the more general problem of finding the sum of the largest x elements in any subarray (not just a prefix of the array) with a persistent segment tree. The time complexity to build the segment tree at the start is O(N log (MAX_VALUE - MIN_VALUE)) and the time complexity of each query is O(log(MAX_VALUE - MIN_VALUE)) (i.e. the log of the range of the array values). The log of the range of values will generally be quite small (at most 32 for the majority of problems), but it is also simple to reduce the log (MAX_VALUE - MIN_VALUE) factor to log N by applying coordinate compression.
#include <iostream>
#include <vector>
#include <algorithm>
#include <numeric>
#include <climits>
struct Node {
int count;
long long sum;
int left, right;
};
std::vector<Node> tree{{}};
std::vector<int> roots{0};
int minVal = INT_MAX, maxVal = INT_MIN;
int update(int n, int l, int r, int val) {
if (l == r) {
tree.push_back({tree[n].count + 1, tree[n].sum + val});
} else {
int m = std::midpoint(l, r), lc = tree[n].left, rc = tree[n].right;
if (val <= m) lc = update(lc, l, m, val);
else rc = update(rc, m + 1, r, val);
tree.push_back({tree[lc].count + tree[rc].count, tree[lc].sum + tree[rc].sum, lc, rc});
}
return tree.size() - 1;
}
long long query(int nl, int nr, int l, int r, int x) {
if (l == r) return (long long) l * x;
int m = std::midpoint(l, r), rightCount = tree[tree[nr].right].count - tree[tree[nl].right].count;
if (rightCount < x) return tree[tree[nr].right].sum - tree[tree[nl].right].sum + query(tree[nl].left, tree[nr].left, l, m, x - rightCount);
else return query(tree[nl].right, tree[nr].right, m + 1, r, x);
}
// Sum of the largest x elements in the subarray [ql, qr] (1-indexed)
long long querySumOfLargestX(int ql, int qr, int x) {
return query(roots[ql - 1], roots[qr], minVal, maxVal, x);
}
int main() {
std::cin.tie(nullptr);
std::ios_base::sync_with_stdio(false);
int n;
std::cin >> n;
std::vector<int> nums;
nums.reserve(n);
for (int i = 0, x; i < n; i++) {
std::cin >> x;
nums.push_back(x);
minVal = std::min(minVal, x);
maxVal = std::max(maxVal, x);
}
for (const int& num : nums)
roots.push_back(update(roots.back(), minVal, maxVal, num));
int q;
std::cin >> q;
for (int x, y; q--;) {
std::cin >> x >> y;
std::cout << querySumOfLargestX(1, y, x) << '\n';
}
return 0;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With