I have a text file (66GB) where I would like to replace some characters. I can't load the whole thing into memory.
This is the basic idea of what I was hoping to do:
std::ifstream i(infile.c_str()); // ifstream
std::string line;
while (std::getline(i, line)) {
    for (std::size_t c = 0; c < line.length(); c++) {
        if (line[c] == 'Q') {
            // *** REPLACE Q WITH X HERE
        }
    }
}
My question is: how do I write the new character so that it actually replaces the Q in the file?
Subquestion: is there a better / faster way to do this?
I am working on a virtual Ubuntu server with 2 cores and 4 GB of memory.
You could use something like this, which I would expect to be faster because it reads and writes fixed-size binary blocks instead of splitting the input into lines:
#include <algorithm> // std::replace
#include <fstream>

std::ifstream ifs("input_file_name", std::ios::binary);
std::ofstream ofs("output_file_name", std::ios::binary);
char buf[4096]; // larger = faster (within limits)
while(ifs.read(buf, sizeof(buf)) || ifs.gcount())
{
    // replace the characters (gcount() is short on the final block)
    std::replace(buf, buf + ifs.gcount(), 'Q', 'X');
    // write to a new file
    ofs.write(buf, ifs.gcount());
}
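For completeness, the copy approach above could be wrapped into a function along these lines. This is only a sketch: the name `replace_copy`, its parameters, and the 1 MiB default buffer are assumptions made for illustration and are not part of the original answer. The buffer lives on the heap so it can be made much larger than 4096 bytes without stressing the stack.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: copies in_path to out_path, replacing every `from` byte with `to`.
// Returns false if a file cannot be opened or a write fails.
bool replace_copy(const std::string& in_path, const std::string& out_path,
                  char from, char to, std::size_t buf_size = 1 << 20)
{
    std::ifstream ifs(in_path, std::ios::binary);
    if (!ifs)
        return false; // do not create the output file if the input is missing

    std::ofstream ofs(out_path, std::ios::binary);
    if (!ofs)
        return false;

    std::vector<char> buf(buf_size); // heap buffer; size is an assumption, tune as needed

    while (ifs.read(buf.data(), static_cast<std::streamsize>(buf.size())) || ifs.gcount() > 0)
    {
        const std::streamsize n = ifs.gcount(); // bytes actually read (short on the last block)
        std::replace(buf.begin(), buf.begin() + n, from, to);
        if (!ofs.write(buf.data(), n))
            return false;
    }

    ofs.flush();
    return static_cast<bool>(ofs);
}
```

A call such as `replace_copy("input_file_name", "output_file_name", 'Q', 'X')` would then process the file block by block. Memory use is bounded by the buffer size, so the 66GB input never has to fit in RAM, but the disk needs room for a second copy of the file.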
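If you don't want to produce a separate file, you could modify the original file in place. This is more dangerous, because an interruption leaves the file partially modified with no untouched copy to fall back on. Something along these lines (untested code):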
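#include <algorithm> // std::replace
#include <fstream>

std::fstream fs("input_file_name", std::ios::in|std::ios::out|std::ios::binary);
char buf[4096]; // larger = faster (within limits)
auto beg = fs.tellg();
while(fs.read(buf, sizeof(buf)) || fs.gcount())
{
    // remember how many bytes were read (short on the final block)
    auto n = fs.gcount();
    // the final short read sets eofbit and failbit; clear them,
    // otherwise tellg/seekp/write below would silently do nothing
    fs.clear();
    auto end = fs.tellg();
    // replace the characters
    std::replace(buf, buf + n, 'Q', 'X');
    // return to start of block
    fs.seekp(beg);
    // overwrite this block
    fs.write(buf, n);
    // shift old beginning to the end
    beg = end;
    // go to new beginning to start reading the next block
    fs.seekg(beg);
}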
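As a variation on the in-place idea, the sketch below only writes a block back when it actually contains the character, which may save some disk writes if `Q` is rare. The function name `replace_in_place`, its return convention, and the default buffer size are assumptions for illustration, not something from the original answer. It tracks the block offset itself instead of calling `tellg`, and it clears the stream state after every read so that the final, shorter block is still written.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: replaces `from` with `to` inside the file itself.
// Returns the number of bytes changed, or -1 if the file cannot be opened or a write fails.
long long replace_in_place(const std::string& path, char from, char to,
                           std::size_t buf_size = 1 << 20)
{
    std::fstream fs(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!fs)
        return -1;

    std::vector<char> buf(buf_size);
    std::streamoff pos = 0; // offset of the block currently being processed
    long long changed = 0;

    for (;;)
    {
        fs.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        const std::streamsize n = fs.gcount();
        if (n <= 0)
            break; // nothing left to process

        // A short final read sets eofbit and failbit; reset them so seeking and writing work.
        fs.clear();

        const auto hits = std::count(buf.begin(), buf.begin() + n, from);
        if (hits > 0)
        {
            std::replace(buf.begin(), buf.begin() + n, from, to);
            fs.seekp(pos, std::ios::beg); // back to the start of this block
            if (!fs.write(buf.data(), n))
                return -1;
            changed += hits;
        }

        pos += n;
        fs.seekg(pos, std::ios::beg); // a seek is needed when switching from writing to reading
    }
    return changed;
}
```

Because this overwrites the only copy of the data, it seems sensible to try it on a small test file first before running it on the real one.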