
Initializing a vector from multiple threads

I have a large vector, std::vector<some_class> some_vector, whose elements I need to construct with different constructor arguments, some_class(constructor_parameters). The normal way to do it would be something like:

std::vector<some_class> some_vector;
some_vector.reserve(length);

for (...) some_vector.push_back(some_class(constructor_parameters));

But because this vector is large, I want to do this in parallel. Is there any way to split the vector and push_back at different positions of the vector, so that each thread can start initializing a different part of it?

I read some answers about splitting / joining vectors and haven't found anything useful. As my vector is really large, I have to avoid anything like creating a new vector for each thread and then copying them into the original one. I can use only one big chunk of memory.

I tried to use some_vector.at(some_loc) = some_class(constructor_parameters), but this doesn't work with an uninitialized vector.
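For reference, the failure described above can be reproduced in a few lines. This is only a minimal sketch: the function name at_throws_after_reserve and the use of int elements are illustrative assumptions, not part of the original code. It shows that reserve() changes capacity() but leaves size() at zero, and that at() checks the index against size().

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if at(index) throws on a vector
// that has only been reserved, never resized.
bool at_throws_after_reserve(std::size_t length, std::size_t index) {
    std::vector<int> values;
    values.reserve(length);  // allocates storage, constructs no elements

    // Capacity grew, but the vector still holds zero elements.
    assert(values.size() == 0 && values.capacity() >= length);

    try {
        values.at(index) = 42;  // bounds check is against size(), not capacity()
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Using operator[] instead of at() would skip the check, but accessing elements beyond size() is undefined behavior, so that is not a fix either.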

I can initialize the vector with some dummy values and then use at to set the proper values, but that is not efficient.

So my question is: how can I efficiently (in terms of memory consumption and computing time) initialize a large vector?

EDIT: to answer comments:

Size - the container doesn't change its size while the program runs, but the size is not known at compile time. The size is huge because that's just the scope of the problem - I'm performing a cosmological N-body simulation where the number of particles / mesh cells can easily be 1024^3 and more.

Ctors - right now they just assign values to class members (3 ~ 7 assignments), but I was planning to add some computation.

Members - are easily copyable, typically 2 std::vector(3)

Why vectors - I was originally using only arrays of basic types with new / delete. I wanted to use vectors because of their various features: automatic memory (de)allocation, easier loops using iterators, etc. I just assumed that, with all their other good properties, they would also be easy to use from multiple threads...

Michal asked Dec 03 '25 10:12


1 Answer

For general types T, the problem with what you describe is that it takes a fair amount of state to track which of the T have been constructed, and which have not.

If you compress the "is this a valid value" data into a bitfield, checking for validity becomes very cache-unfriendly, since the flag no longer sits next to the element it describes.

One easy approach is a std::vector<std::optional<T>>, using C++17 or boost::optional. Pre-size it (every element starts as nullopt), then use emplace to construct the elements in whatever thread you want.
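A minimal sketch of the optional-based approach, assuming C++17 and std::thread. The Particle type, the make_particles function and the even split into index ranges are illustrative assumptions standing in for some_class and the real work distribution.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <thread>
#include <vector>

// Hypothetical stand-in for some_class; note it has no default constructor.
struct Particle {
    double position;
    double velocity;
    Particle(double p, double v) : position(p), velocity(v) {}
};

// Pre-size a vector of optionals, then construct each element in place
// from one of `thread_count` worker threads.
std::vector<std::optional<Particle>> make_particles(std::size_t length,
                                                    unsigned thread_count) {
    assert(thread_count > 0);

    // Single allocation; every slot starts disengaged (nullopt).
    std::vector<std::optional<Particle>> particles(length);

    const std::size_t chunk = (length + thread_count - 1) / thread_count;
    std::vector<std::thread> workers;
    workers.reserve(thread_count);

    for (unsigned t = 0; t < thread_count; ++t) {
        const std::size_t begin = std::min(length, t * chunk);
        const std::size_t end = std::min(length, begin + chunk);
        // Each thread touches a disjoint index range, so no locking is needed.
        workers.emplace_back([&particles, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                particles[i].emplace(static_cast<double>(i), 0.0);
            }
        });
    }
    for (auto& w : workers) w.join();
    return particles;
}
```

The vector itself is never resized while the threads run, and different threads write to different elements, which is what makes this safe without synchronization. The cost is one engaged flag per element (plus any padding), and the pre-sizing step still writes that flag for every slot.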

Finally, consider not using a single vector. Write a wrapper that splices multiple vectors together into one visible container.
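One way such a wrapper could look is sketched below. The class name ChunkedVector, the one-thread-per-chunk policy and the make(i) callback are all assumptions made for illustration; the answer does not prescribe a particular design. Each thread fills its own inner vector with reserve and push_back, so nothing is default-constructed and nothing is copied into a final container afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical wrapper: several vectors presented as one indexable container.
template <class T>
class ChunkedVector {
public:
    // Builds `length` elements in `chunk_count` chunks, one thread per chunk.
    // make(i) must return element i and be safe to call concurrently.
    template <class Make>
    ChunkedVector(std::size_t length, std::size_t chunk_count, Make make)
        : size_(length) {
        assert(chunk_count > 0);
        chunk_size_ = (length + chunk_count - 1) / chunk_count;
        if (chunk_size_ == 0) chunk_size_ = 1;  // keeps operator[] well defined for length == 0
        chunks_.resize(chunk_count);

        std::vector<std::thread> workers;
        workers.reserve(chunk_count);
        for (std::size_t c = 0; c < chunk_count; ++c) {
            workers.emplace_back([this, c, &make] {
                const std::size_t begin = std::min(size_, c * chunk_size_);
                const std::size_t end = std::min(size_, begin + chunk_size_);
                auto& chunk = chunks_[c];  // owned by this thread only
                chunk.reserve(end - begin);
                for (std::size_t i = begin; i < end; ++i) chunk.push_back(make(i));
            });
        }
        for (auto& w : workers) w.join();
    }

    std::size_t size() const { return size_; }

    // Every chunk before the last non-empty one holds exactly chunk_size_ elements.
    T& operator[](std::size_t i) { return chunks_[i / chunk_size_][i % chunk_size_]; }
    const T& operator[](std::size_t i) const { return chunks_[i / chunk_size_][i % chunk_size_]; }

private:
    std::size_t size_ = 0;
    std::size_t chunk_size_ = 1;
    std::vector<std::vector<T>> chunks_;
};
```

A call such as ChunkedVector<int> v(10, 4, [](std::size_t i) { return static_cast<int>(i); }); would build ten elements in four chunks. The trade-off is that storage is no longer contiguous, so code that needs a raw pointer to the whole range or standard iterators would need more work than this sketch provides.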

Yakk - Adam Nevraumont answered Dec 05 '25 00:12


