
Why does read() in C read input in chunks when there's no user-level buffering?

Tags: c, linux, io, posix

I'm learning how the POSIX read() function works in C. I understand that read() is considered unbuffered, meaning it doesn't manage any internal buffer like fread() does. But I'm confused by the following behavior.

I ran this simple program:

#include <stdio.h>
#include <unistd.h>

int main() {
    char buffer[BUFSIZ];
    int fd = STDIN_FILENO;

    int n = read(fd, buffer, 5);  // Read first 5 bytes from stdin
    buffer[n] = '\0';
    puts(buffer);

    n = read(fd, buffer, BUFSIZ);    // Read the rest
    buffer[n] = '\0';
    puts(buffer);

    return 0;
}

When I type the input: hello12345

The output is: hello 12345

So even though read() is unbuffered, it seems to get its data from somewhere that already contains all the input. This suggests there is some kind of input buffer where "hello12345" was stored before the two read() calls fetched from it in chunks.

My question is:

If read() is unbuffered, how does it still know where to continue reading from?

Is the terminal input buffered by the OS or kernel before read() accesses it?

Where exactly does read() get the data from when reading from stdin?

I'd appreciate a detailed explanation of what role the OS or terminal plays in this behavior.

asked Oct 25 '25 14:10 by Dahy Allam


2 Answers

I understand that read() is considered unbuffered, meaning it doesn't manage any internal buffer like fread() does.

That's not quite right. read() has no buffer in user memory the way stdio does, but buffering still happens on the kernel side: most device drivers have their own buffers.

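In particular, the tty driver used for terminals has multiple buffers. While you're typing (in the default, canonical mode), a temporary buffer holds the line so it can be edited with the Backspace key. Once you press Enter, the line is moved to a longer-lived input buffer, and read() extracts data from that buffer. Each read() removes the bytes it returns, so the next call continues with whatever is left.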

answered Oct 27 '25 03:10 by Barmar


In addition to the buffer issues addressed elsewhere:

Take care with buffer overflow

The code below risks accessing memory outside buffer[].

n = read(fd, buffer, BUFSIZ);
buffer[n] = '\0';

Yes, BUFSIZ is at least 256, yet if BUFSIZ characters were read, n equals BUFSIZ and buffer[n] = '\0' writes one element past the end of buffer[]. If an error occurred, n is -1 and the write lands before the start of buffer[].

Better as

n = read(fd, buffer, sizeof buffer - 1);
if (n >= 0) {
  buffer[n] = '\0';
}
answered Oct 27 '25 02:10 by chux


