How can I split a stream on either carriage return (\r) or CRLF (\r\n) line terminators?

I'm trying to split an unusual serial-port stream that terminates lines with a carriage return (\r) and sometimes with \r\n. BufReader has the lines() method, but it only splits on \n or \r\n. There is also read_until(...), but it only accepts a single delimiter byte.

Based on the standard library's implementation, I've started to cobble together some bits, but I haven't gotten it to compile yet. I'd like to do this the idiomatic "Rust way". Regular expressions seem too expensive for a byte stream.

Example input:

Heading:\r\nLine 1\rLine 2\rLine 3\r\nEnd

When you use lines() on that input, you get three lines because \r is not considered a line terminator:

Heading:
Line 1\rLine 2\rLine 3
End
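
For reference, a minimal sketch that reproduces this behaviour with the standard library. `std_lines` is a hypothetical helper name used only for illustration, and the input is assumed to be valid UTF-8 held in memory:

```rust
use std::io::{BufRead, BufReader};

/// Hypothetical helper: collect what `BufRead::lines()` yields for `input`.
fn std_lines(input: &str) -> Vec<String> {
    BufReader::new(input.as_bytes())
        .lines()
        // `lines()` splits on '\n' and strips one trailing "\r\n" or "\n";
        // a lone '\r' inside a line is left untouched.
        .map(|l| l.expect("in-memory UTF-8 input should not fail"))
        .collect()
}
```

Calling `std_lines("Heading:\r\nLine 1\rLine 2\rLine 3\r\nEnd")` gives three strings, with the two lone `\r` bytes still embedded in the middle one.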
Petrus Theron asked Oct 23 '25 09:10

1 Answer

Adapted from a previous answer of mine on GitHub to fit your needs:

use std::io::{BufRead, BufReader};
use std::str;

#[derive(Debug)]
pub struct MyLines<B> {
    buffer: B,
}

#[derive(Debug)]
pub enum MyError {
    Io(std::io::Error),
    Utf8(std::str::Utf8Error),
}

impl<B> MyLines<B> {
    pub fn new(buffer: B) -> Self {
        Self { buffer }
    }
}

impl<B: BufRead> Iterator for MyLines<B> {
    type Item = Result<String, MyError>;

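    fn next(&mut self) -> Option<Self::Item> {
        // Borrow the buffered bytes in an inner scope so the borrow ends
        // before `consume` is called below.
        let (line, total) = {
            let buffer = match self.buffer.fill_buf() {
                Ok(buffer) => buffer,
                Err(e) => return Some(Err(MyError::Io(e))),
            };
            // An empty buffer means end of stream.
            if buffer.is_empty() {
                return None;
            }
            // Number of bytes before the first '\r' or '\n'.
            let consumed = buffer
                .iter()
                .take_while(|c| **c != b'\n' && **c != b'\r')
                .count();
            // Terminator length: 2 for "\r\n", 1 for a lone '\r' or '\n',
            // 0 if the buffered data ended without a terminator.
            // Note: only the currently buffered bytes are inspected, so a line
            // longer than the buffer is returned in pieces, and a "\r\n" pair
            // split across two fills yields a spurious empty line.
            let total = consumed
                + if consumed < buffer.len() {
                    // we found a delimiter; check whether it is the two-byte "\r\n"
                    if consumed + 1 < buffer.len()
                        && buffer[consumed] == b'\r'
                        && buffer[consumed + 1] == b'\n'
                    {
                        2
                    } else {
                        1
                    }
                } else {
                    0
                };
            // Copy the line out (without its terminator) as an owned String.
            let line = match str::from_utf8(&buffer[..consumed]) {
                Ok(line) => line.to_string(),
                Err(e) => return Some(Err(MyError::Utf8(e))),
            };
            (line, total)
        };
        // Discard the line and its terminator from the reader's buffer.
        self.buffer.consume(total);

        Some(Ok(line))
    }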
}

fn main() {
    let f = BufReader::new("Heading:\r\nLine 1\rLine 2\rLine 3\r\nEnd".as_bytes());

    for line in MyLines::new(f) {
        println!("{:?}", line);
    }
}

Output:

Ok("Heading:")
Ok("Line 1")
Ok("Line 2")
Ok("Line 3")
Ok("End")
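
On a real serial port, a line or a \r\n pair may well be split across two buffer fills. The following is a rough sketch of one way to cope with that, not part of the code above: `CrLines` and `into_string` are hypothetical names, '\n' alone is still assumed to count as a terminator, and UTF-8 errors are folded into `io::Error` for brevity. Instead of peeking at the byte after '\r' (which could block until the device sends more data), it remembers that the last terminator was '\r' and drops a '\n' if one arrives next.

```rust
use std::io::{self, BufRead};

/// Hypothetical iterator over lines terminated by "\r", "\n" or "\r\n".
pub struct CrLines<B> {
    reader: B,
    // True when the previous line ended with '\r', so a '\n' arriving
    // next belongs to that terminator and must be skipped.
    skip_lf: bool,
}

impl<B: BufRead> CrLines<B> {
    pub fn new(reader: B) -> Self {
        Self { reader, skip_lf: false }
    }
}

/// Hypothetical helper: convert collected bytes, mapping invalid UTF-8
/// to an `InvalidData` I/O error.
fn into_string(bytes: Vec<u8>) -> io::Result<String> {
    String::from_utf8(bytes).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

impl<B: BufRead> Iterator for CrLines<B> {
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        // Accumulates the current line across as many fills as needed.
        let mut line: Vec<u8> = Vec::new();
        loop {
            let buf = match self.reader.fill_buf() {
                Ok(buf) => buf,
                Err(e) => return Some(Err(e)),
            };
            if buf.is_empty() {
                // End of stream: emit any unterminated trailing line.
                return if line.is_empty() { None } else { Some(into_string(line)) };
            }
            // Drop the '\n' half of a "\r\n" pair left over from the last call.
            if self.skip_lf {
                self.skip_lf = false;
                if buf[0] == b'\n' {
                    self.reader.consume(1);
                    continue;
                }
            }
            match buf.iter().position(|&b| b == b'\r' || b == b'\n') {
                Some(i) => {
                    // Terminator found: finish the line and consume it.
                    line.extend_from_slice(&buf[..i]);
                    self.skip_lf = buf[i] == b'\r';
                    self.reader.consume(i + 1);
                    return Some(into_string(line));
                }
                None => {
                    // No terminator yet: keep everything and read more.
                    let n = buf.len();
                    line.extend_from_slice(buf);
                    self.reader.consume(n);
                }
            }
        }
    }
}
```

Wrapping the example input in `BufReader::with_capacity(1, ...)` forces every byte into its own fill, which is a convenient way to exercise the boundary cases. The trade-off of the flag approach is that an empty line cannot be told apart from a delayed '\n' until the next byte arrives, so the '\n' is always treated as part of the preceding '\r'.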
Stargateur answered Oct 26 '25 03:10
