Fastest way in C++ to read the contents of stdin into a string or vector

A number of Stack Overflow answers deal with how to slurp in files from disk, where you can preallocate memory based on the file size.

  • What is the best way to read an entire file into a std::string in C++?
  • Fastest way possible to read contents of a file

But what is the fastest way to slurp in stdin (e.g. when a large file is piped into your program)?

I am happy to slurp into a vector (which can always be converted into a std::string later) if that is the fastest solution.

Leo Goodstadt asked Oct 19 '25 23:10


2 Answers

The fastest way to read unformatted data into memory is to use the unformatted read routines, for example fstream::read(). Nothing will beat it.

BEWARE! Some folks claim you will see a performance improvement from OS-level routines such as read(). You will get a tremendous performance degradation if you try this.

EDIT: Some explanation of the above statement. The degradation comes from kernel calls. Every read() is a kernel call, so unless you read in exactly the optimal buffer size, you will incur either more kernel calls or less efficient reads. While you can figure out the best read size experimentally, the C runtime has already done this for you. fread() and the unformatted stream reads have already been optimized, so no matter how big your chunks are, you are guaranteed to call the kernel in an optimal way.

SergeyA answered Oct 21 '25 14:10


Reading into a fixed-sized buffer in a loop

To my surprise, old-fashioned, almost C-like code seems to be the fastest with both Clang and GCC:

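{
    // Assumes <iostream> and <vector>; reads all of stdin into cin_str.
    std::vector<char> cin_str;
    // 64 KiB buffer seems sufficient
    const std::streamsize buffer_sz = 65536;
    std::vector<char> buffer(buffer_sz);
    cin_str.reserve(buffer_sz);

    // Pull bytes straight from the stream buffer, bypassing formatted input
    auto rdbuf = std::cin.rdbuf();
    // sgetn() returns the number of characters copied; 0 means end of input
    while (auto cnt_char = rdbuf->sgetn(buffer.data(), buffer_sz))
        cin_str.insert(cin_str.end(), buffer.data(), buffer.data() + cnt_char);
}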

Using istream::read() and istream::gcount() was as fast but required a little extra code...
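
For reference, the istream::read() / istream::gcount() variant mentioned above might look roughly like the sketch below. This is not the code that was benchmarked: the helper names slurp_with_read and check_slurp_with_read are made up for illustration, and the function takes any std::istream so it can be exercised without a real stdin.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name is an assumption, not from the answer):
// slurp a stream using istream::read() and istream::gcount().
std::vector<char> slurp_with_read(std::istream& in, std::size_t buffer_sz = 65536) {
    assert(buffer_sz > 0);
    std::vector<char> result;
    std::vector<char> buffer(buffer_sz);
    // read() sets failbit on the short final chunk, so gcount() is checked
    // as well; otherwise the last partial buffer would be dropped.
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size())) ||
           in.gcount() > 0) {
        result.insert(result.end(), buffer.data(), buffer.data() + in.gcount());
    }
    return result;
}

// Self-check against an in-memory stream, so no real stdin is needed.
void check_slurp_with_read() {
    const std::string text(200000, 'x');  // deliberately not a multiple of 64 KiB
    std::istringstream in(text);
    const std::vector<char> out = slurp_with_read(in);
    assert(std::string(out.begin(), out.end()) == text);
}
```

To read stdin, call slurp_with_read(std::cin). The "little extra code" is the gcount() bookkeeping: unlike sgetn(), read() does not return the byte count directly.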

C++ iterators

Surprisingly, using istreambuf_iterator (iterator for unformatted input) turned out to be much, much slower: >3x for some test files, even after switching off sync with stdio.

{
    // Assumes <algorithm>, <iostream>, <iterator> and <vector>
    std::ios_base::sync_with_stdio(false);
    std::vector<char> cin_str;
    // Reserve 64 KiB up front
    const std::streamsize buffer_sz = 65536;
    cin_str.reserve(buffer_sz);

    std::istreambuf_iterator<char> iit(std::cin.rdbuf()); // stdin iterator
    std::istreambuf_iterator<char> eos;                   // end-of-range iterator
    // Copies one character at a time, which is where the slowdown comes from
    std::copy(iit, eos, std::back_inserter(cin_str));
    return cin_str;
}

This is true even after reserving space for the vector buffer (rather than just assigning to it).

The other surprise is that I see (near) maximum speed even with a very modest buffer size (64 KiB). vector just has a very efficient reallocation strategy.
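
To illustrate the point about reallocation, here is a rough sketch that counts how often a vector's capacity changes while fixed-size chunks are appended. The helper name count_reallocations is made up for illustration, and the result depends on the standard library: geometric growth is what common implementations such as libstdc++ and libc++ do, but the exact growth factor is not specified by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: append `chunks` blocks of `chunk_sz` bytes and count
// how many times the destination vector's capacity changes.
std::size_t count_reallocations(std::size_t chunks, std::size_t chunk_sz) {
    assert(chunk_sz > 0);
    const std::vector<char> chunk(chunk_sz, 'x');
    std::vector<char> dest;
    dest.reserve(chunk_sz);  // same initial reservation as the code above
    std::size_t reallocations = 0;
    std::size_t last_capacity = dest.capacity();
    for (std::size_t i = 0; i < chunks; ++i) {
        dest.insert(dest.end(), chunk.begin(), chunk.end());
        if (dest.capacity() != last_capacity) {  // capacity changed: a reallocation happened
            ++reallocations;
            last_capacity = dest.capacity();
        }
    }
    return reallocations;
}
```

With geometric growth the count rises roughly with the logarithm of the total size rather than with the number of chunks, which would explain why a larger read buffer buys so little.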

Addendum:

Googling finds this blog post (http://insanecoding.blogspot.in/2011/11/reading-in-entire-file-at-once-in-c.html) from 2011, which seems to show that this approach is about as fast as you can go in C++ (with GCC/Clang), and that switching to cstdio does not provide further gains (but obviously makes the code even uglier!).
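
For comparison, a cstdio version of the same fixed-buffer loop could be sketched as follows. This is an illustration only, not the code from the blog post, and the helper name slurp_with_fread is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: slurp a C stream (for example stdin) with std::fread().
std::vector<char> slurp_with_fread(std::FILE* fp, std::size_t buffer_sz = 65536) {
    assert(fp != nullptr && buffer_sz > 0);
    std::vector<char> result;
    std::vector<char> buffer(buffer_sz);
    std::size_t n = 0;
    // fread() returns the number of bytes read; 0 means end of file or an error.
    while ((n = std::fread(buffer.data(), 1, buffer.size(), fp)) > 0) {
        result.insert(result.end(), buffer.data(), buffer.data() + n);
    }
    return result;
}
```

Calling slurp_with_fread(stdin) reads piped input. Use std::ferror() afterwards if you need to tell a read error apart from a clean end of file.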

Avoiding copies

@BenVoigt points out that sgetn() / istream::read() can write the data directly into its final location if we judiciously preallocate the requisite space:

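{
    // Assumes <iostream> and <vector>
    std::ios_base::sync_with_stdio(false);
    // Grow the vector in 64 KiB steps
    const std::streamsize buffer_sz = 65536;
    std::vector<char> cin_str(buffer_sz);
    std::streamsize cin_str_data_end = 0;  // number of valid bytes so far

    auto rdbuf = std::cin.rdbuf();
    // Read straight into the tail of the vector; no intermediate buffer
    while (auto cnt_char = rdbuf->sgetn(cin_str.data() + cin_str_data_end, buffer_sz))
    {
        cin_str_data_end += cnt_char;
        // Keep buffer_sz bytes of free space after the valid data
        cin_str.resize(cin_str_data_end + buffer_sz);
    }
    // Trim the unused tail
    cin_str.resize(cin_str_data_end);
    return cin_str;
}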

In testing, this resulted in no further speedup, probably because this code is dominated by (1) I/O, (2) system call overhead, and (3) vector memory allocation.

Is there a faster way to do this? Memory mapped files from boost?

Leo Goodstadt answered Oct 21 '25 15:10