
How do I get a specific byte or bytes from a streambuf?

To receive a raw protocol with custom headers over Ethernet, I am reading the frame bytes into a streambuf buffer. The payload is copied successfully for the most part, but I need to check a specific byte of the frame header in the buffer so I can handle certain corner cases. I can't figure out how to get at that specific byte, or how to get it into an integer. Here is the code:

boost::asio::streambuf read_buffer;

boost::asio::streambuf::mutable_buffers_type buf = read_buffer.prepare(bytesToGet);
bytesRead = d_socket10->receive(boost::asio::buffer(buf, bytesToGet));
read_buffer.commit(bytesRead);

const char *readData = boost::asio::buffer_cast<const char*>( read_buffer.data() + 32 ); 

I need to get the length byte, which is at offset 20. I've tried stringstream, memcpy and casting, but I don't have a handle on any of them: I either get compile errors or the code doesn't do what I expected.

How can I get the byte from the offset I need and cast it to a byte or short? The size is actually 2 bytes, but in this specific case, one of those bytes should be zero, so either getting 1 byte or 2 bytes would be ideal.

Thanks!

asked Oct 20 '25 18:10 by J. Doe


1 Answer

Welcome to parsing.

Welcome to binary data.

Welcome to portable network protocols.

Each of these three subjects is its own thing to get a handle on.

The simplest thing would be to read into a buffer and use that. Use Boost Endian to remove portability concerns.
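If pulling in Boost Endian is not an option, the byte order can also be handled by hand. What follows is a minimal sketch, assuming the 16-bit length field is little-endian on the wire; `load_le16` and `load_be16` are hypothetical helper names, not part of Boost or the standard library.

```cpp
#include <cstdint>

// Hypothetical helper: decode a 16-bit little-endian value from two raw bytes.
// The result does not depend on the byte order of the host.
inline std::uint16_t load_le16(const char* p) {
    // Go through unsigned char so bytes >= 0x80 are not sign-extended.
    const auto lo = static_cast<unsigned char>(p[0]);
    const auto hi = static_cast<unsigned char>(p[1]);
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Same idea for big-endian (network byte order) fields.
inline std::uint16_t load_be16(const char* p) {
    const auto hi = static_cast<unsigned char>(p[0]);
    const auto lo = static_cast<unsigned char>(p[1]);
    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

For a header held in a `char buf[22]`, `load_le16(buf + 20)` would yield the length field. If only the low byte matters, `static_cast<unsigned char>(buf[20])` is enough.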

Here's the simplest thing I can think of using just standard library things (ignoring endianness):

Live On Coliru

#include <boost/asio.hpp>
#include <cstdint>
#include <cstring>
#include <istream>
#include <iostream>
#include <string>

namespace ba = boost::asio;

void fill_testdata(ba::streambuf&);

int main() {
    ba::streambuf sb;
    fill_testdata(sb);

    // parsing starts here
    char buf[1024];
    std::istream is(&sb);
    // read the first 22 bytes, which include the length field at bytes 20..21:
    is.read(buf, 22);
    size_t actual = is.gcount();

    std::cout << "stream ok? " << std::boolalpha << is.good() << "\n";
    std::cout << "actual: " << actual << "\n";
    if (is && actual >= 22) { // stream ok, and not a short read
        // copy the two length bytes out (host byte order); memcpy avoids the
        // aliasing and alignment problems of casting buf + 20 to uint16_t*
        uint16_t length;
        std::memcpy(&length, buf + 20, sizeof(length));
        std::cout << "length: " << length << "\n";

        std::string payload(length, '\0');
        is.read(&payload[0], length);
        actual = is.gcount();

        std::cout << "actual payload bytes: " << actual << "\n";
        std::cout << "stream ok? " << std::boolalpha << is.good() << "\n";
        payload.resize(actual);

        std::cout << "payload: '" << payload << "'\n";
    }
}

// some testdata
void fill_testdata(ba::streambuf& sb) 
{
    char data[] = { 
        '\x00', '\x00', '\x00', '\x00', '\x00', // 0..4
        '\x00', '\x00', '\x00', '\x00', '\x00', // 5..9
        '\x00', '\x00', '\x00', '\x00', '\x00', // 10..14
        '\x00', '\x00', '\x00', '\x00', '\x00', // 15..19
        '\x0b', '\x00', 'H'   , 'e'   , 'l'   , // 20..24
        'l'   , 'o'   , ' '   , 'w'   , 'o'   , // 25..29
        'r'   , 'l'   , 'd'   , '!'   ,         // 30..33
    };
    std::ostream(&sb).write(data, sizeof(data));
}

Prints

stream ok? true
actual: 22
length: 11
actual payload bytes: 11
stream ok? true
payload: 'Hello world'

Increase \x0b to \x0c to get:

stream ok? true
actual: 22
length: 12
actual payload bytes: 12
stream ok? true
payload: 'Hello world!'

Increasing it beyond what is in the buffer, e.g. to '\x0d', gives a failed (partial) read:

stream ok? true
actual: 22
length: 13
actual payload bytes: 12
stream ok? false
payload: 'Hello world!'

Let's Go Pro

To go pro, I'd use a library such as Boost Spirit. It understands endianness, does validation, and really shines when you get branches in your parser, like

 record = compressed_record | uncompressed_record;

Or

 exif_tags = .... >> custom_attrs;

 custom_attr  = attr_key >> attr_value;
 custom_attrs = repeat(_ca_count) [ custom_attr ];

 attr_key = bson_string(64);     // max 64, for security
 attr_value = bson_string(1024); // max 1024, for security

 bson_string %= omit[little_dword[_a=_1]] 
             >> eps(_a<=_r1) // not exceeding maximum
             >> repeat(_a) [byte_];

But that's noodling far ahead. Let's do a much simpler demo:

Live On Coliru ¹

#include <boost/asio.hpp>

#include <istream>
#include <iostream>

namespace ba = boost::asio;

void fill_testdata(ba::streambuf&);

struct FormatData {
    std::string signature, header; // e.g. 4 + 16 = 20 bytes - could be different, of course
    std::string payload;           // 16bit length prefixed
};

FormatData parse(std::istream& is);

int main() {
    ba::streambuf sb;
    fill_testdata(sb);

    try {
        std::istream is(&sb);
        FormatData data = parse(is);

        std::cout << "actual payload bytes: " << data.payload.length() << "\n";
        std::cout << "payload: '" << data.payload << "'\n";
    } catch(std::runtime_error const& e) {
        std::cout << "Error: " << e.what() << "\n";
    }
}

// some testdata
void fill_testdata(ba::streambuf& sb) 
{
    char data[] = { 
        'S'   , 'I'   , 'G'   , 'N'   , '\x00'   , // 0..4
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 5..9
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 10..14
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 15..19
        '\x0b', '\x00', 'H'   , 'e'   , 'l'      , // 20..24
        'l'   , 'o'   , ' '   , 'w'   , 'o'      , // 25..29
        'r'   , 'l'   , 'd'   , '!'   , // 30..33
    };
    std::ostream(&sb).write(data, sizeof(data));
}

//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;

BOOST_FUSION_ADAPT_STRUCT(FormatData, signature, header, payload)

template <typename It>
struct FileFormat : qi::grammar<It, FormatData()> {
    FileFormat() : FileFormat::base_type(start) {
        using namespace qi;

        signature  = string("SIGN");     // 4 byte signature, just for example
        header     = repeat(16) [byte_]; // 16 byte header, same

        payload   %= omit[little_word[_len=_1]] >> repeat(_len) [byte_];
        start      = signature >> header >> payload;

        //BOOST_SPIRIT_DEBUG_NODES((start)(signature)(header)(payload))
    }
  private:
    qi::rule<It, FormatData()> start;
    qi::rule<It, std::string()> signature, header;

    qi::_a_type _len;
    qi::rule<It, std::string(), qi::locals<uint16_t> > payload;
};

FormatData parse(std::istream& is) {
    using it = boost::spirit::istream_iterator;

    FormatData data;
    it f(is >> std::noskipws), l;
    bool ok = qi::parse(f, l, FileFormat<it>{}, data);

    if (!ok)
        throw std::runtime_error("parse failure");

    return data;
}

Prints:

actual payload bytes: 11
payload: 'Hello world'

¹ What a time to be alive! Coliru was swamped and Wandbox was down, simultaneously! I had to remove Boost Asio for the online demo because IdeOne doesn't link Boost System.

answered Oct 23 '25 08:10 by sehe


