 

Declaring array data inside a class C++

I am creating a class that needs to store several array-like data sets. The arrays can change size at run time, but all of the arrays inside the class will always have the same size. The arrays will later be used for number crunching in methods provided by the class.

What is the best/standard way of declaring that kind of data inside the class?

Solution 1 – Raw arrays

class Example {
    double *Array_1;
    double *Array_2;
    double *Array_3;
    int size; //used to store size of all arrays
};

Solution 2 – std::vector for each array

class Example {
    vector<double> Array_1;
    vector<double> Array_2;
    vector<double> Array_3;
};

Solution 3 – A struct that stores each vertex, plus a std::vector of that struct

struct Vertex{
    double Var_1;
    double Var_2;
    double Var_3;
};
class Example {
    vector<Vertex> data;
};

My conclusion as a beginner would be:

Solution 1 would have the best performance but would be the hardest to implement.

Solution 3 would be elegant and easier to implement, but I would run into problems when performing some calculations because each variable would not be stored as its own array. This means ordinary numeric functions that take arrays/vectors would not work directly (I would need to create temporary vectors in order to do the number crunching).

Solution 2 might be a middle ground.

Any ideas for a 4th solution would be greatly appreciated.

Glayson Patricio asked Jan 20 '26 18:01



2 Answers

Don't use raw arrays. Options 2 and 3 are both reasonable; which one to choose depends on how you'll be traversing the data. If you'll frequently be going through the arrays individually, store them as in solution #2, because each vector is stored contiguously in memory. If you'll be going through them as sets of points, then solution #3 is probably better.

If you go with solution #2 and it's critical that the arrays always stay synchronized (same size, etc.), then I would make them private and control access to them through member functions. Example:

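#include <stdexcept>
#include <vector>

using std::vector;

class Example
{
private:
    // Private, so the three vectors can only change size together.
    vector<double> Array_1;
    vector<double> Array_2;
    vector<double> Array_3;

public:
    // Appends one value to every array, keeping the sizes equal.
    void Push_data(double val1, double val2, double val3) {
        Array_1.push_back(val1);
        Array_2.push_back(val2);
        Array_3.push_back(val3);
    }

    // Returns the three values stored at the same index, as a copy.
    vector<double> Get_all_points_at_index(size_t index) const {
        if (index < Array_1.size())
            return {Array_1[index], Array_2[index], Array_3[index]};
        else
            throw std::runtime_error("Error: index out of bounds");
    }

    // Read-only access: callers cannot resize the array.
    const vector<double>& Get_array1() const {
        return Array_1;
    }

    // Empties every array at once.
    void Clear_all() {
        Array_1.clear();
        Array_2.clear();
        Array_3.clear();
    }
};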

This way, users of the class aren't burdened with the responsibility of making sure they add/remove values from all the vectors evenly. You do that in your class's member functions, where you have complete control over the underlying data. The accessor functions should be written such that it's impossible for a user (including you) to desynchronize the data.

Carlton answered Jan 22 '26 11:01



If you are going to process large amounts of data, then solutions 1 and 2 perform pretty much the same. The only meaningful difference is that solution 1 is hard to protect against memory leaks, while solution 2 deallocates your data automatically when it is no longer needed.
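
To illustrate the memory-management point, here is a minimal sketch comparing the two approaches with a single array each. The class names `RawExample` and `VectorExample` and their members `at` and `count` are made up for this illustration and do not come from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Solution 1 style: the class owns raw memory, so it has to manage it by hand.
class RawExample {
    double* Array_1;
    std::size_t size;

public:
    explicit RawExample(std::size_t n) : Array_1(new double[n]()), size(n) {}

    // Without this destructor the array would leak.
    ~RawExample() { delete[] Array_1; }

    // Without this copy constructor two objects would share (and both delete) one array.
    RawExample(const RawExample& other)
        : Array_1(new double[other.size]), size(other.size) {
        std::copy(other.Array_1, other.Array_1 + size, Array_1);
    }

    // Copy assignment has to release the old array and duplicate the new one.
    RawExample& operator=(const RawExample& other) {
        if (this != &other) {
            double* fresh = new double[other.size];
            std::copy(other.Array_1, other.Array_1 + other.size, fresh);
            delete[] Array_1;
            Array_1 = fresh;
            size = other.size;
        }
        return *this;
    }

    double& at(std::size_t i) { return Array_1[i]; }
    std::size_t count() const { return size; }
};

// Solution 2 style: std::vector already handles destruction and copying.
class VectorExample {
    std::vector<double> Array_1;

public:
    explicit VectorExample(std::size_t n) : Array_1(n) {}

    double& at(std::size_t i) { return Array_1[i]; }
    std::size_t count() const { return Array_1.size(); }
};
```

Both classes store their elements contiguously, so loops over the data should look the same to the processor; the extra code in `RawExample` exists only to avoid leaks and double deletion.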

The difference between solutions 2 and 3 is what people often call "structure of arrays" vs "array of structures". The runtime efficiency of these solutions depends on what your code does with them. The general principle is locality of reference: data that is used together should sit together in memory. If your code frequently does number crunching only on the first component of your vertex data, then use structure of arrays (solution 2). However, most complex code ends up working on all of the data, so I guess solution 3 (array of structures) is the best.
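
As a rough sketch of the two layouts, the code below reuses the names `Vertex`, `Var_1` and `Array_1` from the question. The type names `StructureOfArrays` and `ArrayOfStructures` and the three functions are assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

struct Vertex {
    double Var_1;
    double Var_2;
    double Var_3;
};

// Solution 2: one contiguous vector per component.
struct StructureOfArrays {
    std::vector<double> Array_1;
    std::vector<double> Array_2;
    std::vector<double> Array_3;
};

// Solution 3: one contiguous vector of whole vertices.
using ArrayOfStructures = std::vector<Vertex>;

// Touches only Array_1, so every value read from memory is one that is needed.
double sum_first_soa(const StructureOfArrays& d) {
    return std::accumulate(d.Array_1.begin(), d.Array_1.end(), 0.0);
}

// Same result, but each step skips over the unused Var_2 and Var_3 stored next to Var_1.
double sum_first_aos(const ArrayOfStructures& d) {
    double total = 0.0;
    for (const Vertex& v : d) total += v.Var_1;
    return total;
}

// Uses all three components of each vertex, which sit next to each other in this layout.
double total_length_aos(const ArrayOfStructures& d) {
    double total = 0.0;
    for (const Vertex& v : d)
        total += std::sqrt(v.Var_1 * v.Var_1 + v.Var_2 * v.Var_2 + v.Var_3 * v.Var_3);
    return total;
}
```

Which layout is actually faster for a given workload is something to measure rather than assume; the sketch only shows which memory each access pattern touches.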

Note that this example is rather idealized. If your data contains elements that are sometimes used in number crunching and sometimes not (e.g. the code transforms two coordinates of the vertices while leaving the third untouched), then you might need to implement some kind of in-between solution: copy only the needed data to some place, transform it, and copy the results back.
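
One possible shape for that in-between solution is sketched below, assuming the data is stored as in solution 3. The functions `scale_in_place` and `scale_first_two` are hypothetical; `scale_in_place` stands in for whatever existing numeric routine expects a contiguous array.

```cpp
#include <cstddef>
#include <vector>

struct Vertex {
    double Var_1;
    double Var_2;
    double Var_3;
};

// Stand-in for an existing numeric routine that works on a plain contiguous array.
void scale_in_place(double* values, std::size_t count, double factor) {
    for (std::size_t i = 0; i < count; ++i) values[i] *= factor;
}

// Transforms Var_1 and Var_2 of every vertex and leaves Var_3 untouched.
void scale_first_two(std::vector<Vertex>& data, double factor) {
    const std::size_t n = data.size();

    // Copy only the needed components into a temporary contiguous buffer:
    // all Var_1 values first, then all Var_2 values.
    std::vector<double> packed(2 * n);
    for (std::size_t i = 0; i < n; ++i) {
        packed[i] = data[i].Var_1;
        packed[n + i] = data[i].Var_2;
    }

    // Run the array-based routine on the buffer.
    scale_in_place(packed.data(), packed.size(), factor);

    // Copy the results back into the vertices.
    for (std::size_t i = 0; i < n; ++i) {
        data[i].Var_1 = packed[i];
        data[i].Var_2 = packed[n + i];
    }
}
```

The two copies cost time and memory, so this is only likely to pay off when the transformation in the middle is expensive compared with the copying.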

anatolyg answered Jan 22 '26 09:01
