
Idiom for data aggregation and post processing in C++

Tags:

c++

A common task in programming is to process data on the fly and, once all the data has been collected, do some post-processing. A simple example is computing the average (and other statistics), where you can have a class like this:

class Statistic {
public:
  Statistic() : nr(0), sum(0.0), avg(0.0) {}

  void add(double x) { sum += x; ++nr; }
  void process() { avg = sum / nr; }

private:
  int nr;
  double sum;
  double avg;
};

A disadvantage of this approach is that we always have to remember to call the process() function after adding all the data. Since C++ has things like RAII, this seems like a less than ideal solution.

In Ruby, for example, we can write code like this

class Avg
  attr_reader :avg

  def initialize
    @nr = 0
    @sum = 0.0
    @avg = nil
    if block_given?
      yield self
      process
    end
  end

  def add(x)
    @nr += 1
    @sum += x.to_f
  end

  def process
    @avg = @sum / @nr
  end
end

which we then can call like this

avg = Avg.new do |a|
  data.each {|x| a.add(x)}
end

and the process method is automatically called when exiting the block.

Is there an idiom in C++ that can provide something similar?

For clarification: this question is not about computing the average. It is about the following pattern: feeding data to an object and then, when all the data is fed, triggering a processing step. I am interested in context-based ways to automatically trigger the processing step - or reasons why this would not be a good idea in C++.

Eigentime asked Feb 28 '26 06:02


1 Answer

"Idiomatic average"

I don't know Ruby, but you can't translate idioms directly anyhow. I know that calculating the average is just an example, so let's see what we can get from that example...

The idiomatic way to calculate the sum and average of the elements in a container is std::accumulate:

std::vector<double> data;
// ... fill data ...
auto sum = std::accumulate( data.begin(), data.end(), 0.0);
auto avg = sum / data.size();

The building blocks are container, iterator and algorithms.

If the elements to be processed are not readily available in a container, you can still use the same algorithms, because algorithms only care about iterators. Writing your own iterators requires a bit of boilerplate. The following is just a toy example that calculates the average of the results of calling the same function a certain number of times:

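#include <cstddef>
#include <iostream>
#include <iterator>
#include <numeric>
#include <utility>

// Toy input iterator: dereferencing calls f(), incrementing counts down to 0.
template <typename F>
struct my_iter {
    // Member types that standard algorithms expect from an iterator.
    using iterator_category = std::input_iterator_tag;
    using value_type = decltype(std::declval<F&>()());
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = value_type;

    F f;
    std::size_t count;
    my_iter(std::size_t count, F f) : f(f), count(count) {}
    my_iter& operator++() {
        --count;
        return *this;
    }
    auto operator*() { return f(); }
    bool operator==(const my_iter& other) const { return count == other.count;}
    // std::accumulate compares with !=, which is only synthesized from == since C++20.
    bool operator!=(const my_iter& other) const { return !(*this == other); }
};


int main()
{
    auto f = [](){return 1.;};
    auto begin = my_iter{5,f};  // class template argument deduction (C++17)
    auto end = my_iter{0,f};
    auto sum = std::accumulate( begin, end, 0.0);
    auto avg = sum / 5;
    std::cout << sum << " " << avg;
}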

Output is:

5 1

Suppose you have a vector of parameters for a function to be called; then calling std::accumulate is straightforward:

#include <iostream>
#include <vector>
#include <numeric>

int main()
{
    auto f = [](int x){return x;};
    std::vector<int> v = {1,2,5,10};
    
    // accu is a double to match the 0.0 initial value
    auto sum = std::accumulate( v.begin(), v.end(), 0.0, [f](double accu,int add) {
        return accu + f(add);
    });
    auto avg = sum / v.size();
    std::cout << sum << " " << avg;
}

The last argument to std::accumulate specifies how the elements are added up. Instead of adding them up directly, I add up the result of calling the function. Output is:

18 4.5

For your actual question

To take your question more literally and also address the RAII part, here is one way you can make use of RAII with your Statistic class:

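struct StatisticCollector {
private:
    Statistic& s;
public:
    explicit StatisticCollector(Statistic& s) : s(s) {}
    // Non-copyable, so process() runs exactly once per guarded scope.
    StatisticCollector(const StatisticCollector&) = delete;
    StatisticCollector& operator=(const StatisticCollector&) = delete;
    ~StatisticCollector() { s.process(); }
};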

int main()
{
    Statistic stat;
    {
        StatisticCollector sc{stat};
        //for (...)
        // stat.add( x );
    } // <- destructor is called here

}
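
A closer analogue of the Ruby block is to pass the feeding step as a callable to a small helper that runs it and then triggers the processing step. The following is only a sketch and not part of the approach above: the names collect and average_of are hypothetical, and Statistic is the class from the question with an assumed average() getter added so the result can be read. It also assumes that empty input should yield 0.

```cpp
#include <utility>
#include <vector>

// Statistic as in the question, plus an assumed getter for the result.
class Statistic {
public:
    void add(double x) { sum += x; ++nr; }
    // Assumption: empty input yields 0 instead of dividing by zero.
    void process() { avg = (nr > 0) ? sum / nr : 0.0; }
    double average() const { return avg; }

private:
    int nr = 0;
    double sum = 0.0;
    double avg = 0.0;
};

// Hypothetical helper: runs the feeding step, then calls process().
template <typename Feed>
Statistic collect(Feed&& feed) {
    Statistic s;
    std::forward<Feed>(feed)(s);  // the caller adds its data here
    s.process();                  // post-processing cannot be forgotten
    return s;
}

// Usage: mirrors `Avg.new do |a| ... end` from the Ruby example.
double average_of(const std::vector<double>& data) {
    Statistic stat = collect([&](Statistic& s) {
        for (double x : data) s.add(x);
    });
    return stat.average();
}
```

One difference from the destructor-based collector: if the feeding step throws, collect never reaches process(), whereas a destructor would still run it during stack unwinding on partially collected data.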

PS: Last but not least there is the alternative to just keep it simple. Your class definition is kinda broken, because all results are private. Once you fix that, it is kinda obvious that you need no RAII to make sure process gets called:

class Statistic {
public:
  Statistic() : nr(0), sum(0.0) {}

  void add(double x) { sum += x; ++nr; }
  double process() { return sum / nr; }
private:
  int nr;
  double sum;
};

This is the right interface in my opinion. The user cannot forget to call process because they need to call it to get the result. If the only purpose of the class is to accumulate numbers and process the result, it should not encapsulate the result. The result is for the user of the class to store.
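
As a rough sketch of how that interface reads at the call site: the class below is the simplified Statistic, with process() additionally marked [[nodiscard]] (C++17) so that a compiler can warn when the result is thrown away. That attribute and the function mean_of are assumptions added for illustration, not part of the class above.

```cpp
#include <vector>

class Statistic {
public:
    void add(double x) { sum += x; ++nr; }
    // [[nodiscard]] is an illustrative addition: ignoring the result draws a warning.
    [[nodiscard]] double process() const { return sum / nr; }

private:
    int nr = 0;
    double sum = 0.0;
};

// Hypothetical caller: the only way to obtain the average is to call process().
double mean_of(const std::vector<double>& data) {
    Statistic stat;
    for (double x : data) stat.add(x);
    return stat.process();  // the caller stores or returns the result
}
```

As in the class above, an empty input is not guarded against, so the caller is expected to add at least one value before calling process().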

463035818_is_not_a_number answered Mar 02 '26 20:03


