Here is my structure:
struct Node
{
    int chrN;
    long int pos;
    int nbp;
    string Ref;
    string Alt;
};
To fill the structure, I read through a file, parse the fields I am interested in into the structure, and push it back onto a vector. The problem is that there are around 200 million items and I need to keep all of them in memory for the later steps. The program terminates with a bad_alloc error after pushing back about 50 million nodes:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Searching around suggests that I am out of memory, but the output of top showed memory usage at only 48% when the termination happened.
Additional information that may be useful: I set the stack limit to unlimited, and I am using Ubuntu (x86_64 GNU/Linux) with 4 GB of RAM.
Any help would be most welcome.
Update:
First, switch from vector to list; then store each ~500 MB chunk in a file and index the chunks for further analysis.
Vector storage is contiguous, so in this case 200 million * sizeof(Node) bytes are required in a single block. For each of the strings in the struct, another small allocation may be needed to hold the characters. Altogether, this is not going to fit in your available memory, and no (non-compressing) data structure is going to solve this.
Vectors usually grow their backing capacity exponentially (which amortizes the cost of push_back). So when your program was already using about half the available memory, the vector probably attempted to double its capacity (or grow it by 50%). That allocation failed with bad_alloc while the previous buffer was still in place, which is why memory usage appeared to be only 48% at termination.
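The growth behaviour described above can be observed directly. The following is a minimal sketch, not taken from the question: the helper name capacity_steps is hypothetical, and the exact growth factor is implementation-defined (commonly 2x or 1.5x).

```cpp
#include <cstddef>
#include <vector>

// Push n elements one at a time and record every distinct capacity the
// vector passes through. Each change means a new, larger buffer was
// allocated and filled before the old buffer was released.
std::vector<std::size_t> capacity_steps(std::size_t n) {
    std::vector<int> v;
    std::vector<std::size_t> steps;
    std::size_t last = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last) {
            last = v.capacity();
            steps.push_back(last);
        }
    }
    return steps;
}
```

Printing the result of capacity_steps(1000) shows how few reallocations happen and how large each jump is. Calling reserve() with the final element count up front avoids the repeated reallocation, but it does not help when the total simply does not fit in memory.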
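That node structure takes roughly 40 bytes or more on x86_64 (the exact size depends on the std::string implementation), plus the actual string buffers. At 40 bytes each, 200 million of them need about 8 GB, so there's no way they will fit in 4 GB.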
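To check the size on a particular compiler, sizeof gives the per-node footprint. This is a rough sketch: the struct is copied from the question with std:: added, and min_bytes_for is a hypothetical helper. It only gives a lower bound, because it ignores any heap buffers the strings allocate for their characters.

```cpp
#include <cstddef>
#include <string>

struct Node {
    int chrN;
    long int pos;
    int nbp;
    std::string Ref;
    std::string Alt;
};

// Lower bound on the bytes needed to store `count` nodes contiguously.
// Separately allocated string buffers are not included.
unsigned long long min_bytes_for(unsigned long long count) {
    return count * sizeof(Node);
}
```

On a 64-bit build, min_bytes_for(200000000ULL) comes out well above 4 GiB before a single character of Ref or Alt is stored.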
You need to avoid holding your entire dataset in memory at once.
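One way to do that is to process records as they are read, so only one Node lives in memory at a time. The sketch below is an illustration under assumptions: the input format (one whitespace-separated "chrN pos nbp Ref Alt" record per line) is hypothetical, since the question does not show the file, and parse_node and for_each_node are made-up names.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

struct Node {
    int chrN;
    long int pos;
    int nbp;
    std::string Ref;
    std::string Alt;
};

// Assumed (hypothetical) line format: "chrN pos nbp Ref Alt".
// Returns false if the line does not contain all five fields.
bool parse_node(const std::string& line, Node& out) {
    std::istringstream fields(line);
    return static_cast<bool>(
        fields >> out.chrN >> out.pos >> out.nbp >> out.Ref >> out.Alt);
}

// Visit every well-formed record exactly once, reusing a single Node.
// Returns the number of records visited; malformed lines are skipped.
template <typename Visitor>
std::size_t for_each_node(std::istream& in, Visitor visit) {
    std::string line;
    Node node;
    std::size_t count = 0;
    while (std::getline(in, line)) {
        if (parse_node(line, node)) {
            visit(node);
            ++count;
        }
    }
    return count;
}
```

The visitor can accumulate statistics, filter records, or write selected nodes to smaller per-chromosome files. If later steps need random access, those smaller files can be loaded one at a time instead of all 200 million records at once.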