Normally, to indicate EOF to a program attached to standard input on a Linux terminal, I need to press Ctrl+D once if I just pressed Enter, or twice otherwise. I noticed that the patch command is different, though. With it, I need to press Ctrl+D twice if I just pressed Enter, or three times otherwise. (Doing cat | patch instead doesn't have this oddity. Also, if I press Ctrl+D before typing any real input at all, it doesn't have this oddity.) Digging into patch's source code, I traced this back to the way it loops on fread. Here's a minimal program that does the same thing:
#include <stdio.h>
int main(void) {
    char buf[4096];
    size_t charsread;
    while((charsread = fread(buf, 1, sizeof(buf), stdin)) != 0) {
        printf("Read %zu bytes. EOF: %d. Error: %d.\n", charsread, feof(stdin), ferror(stdin));
    }
    printf("Read zero bytes. EOF: %d. Error: %d. Exiting.\n", feof(stdin), ferror(stdin));
    return 0;
}
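When I compile and run the above program exactly as-is, here's the timeline of events:
1. My program calls fread.
2. fread calls the read system call.
3. I type four characters (for example, asdf).
4. I press Enter.
5. The read system call returns 5.
6. fread calls the read system call again.
7. I press Ctrl+D.
8. The read system call returns 0.
9. fread returns 5.
10. My program prints Read 5 bytes. EOF: 1. Error: 0.
11. My program calls fread again.
12. fread calls the read system call.
13. I press Ctrl+D again.
14. The read system call returns 0.
15. fread returns 0.
16. My program prints Read zero bytes. EOF: 1. Error: 0. Exiting.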
Why does this means of reading stdin have this behavior, unlike the way that every other program seems to read it? Is this a bug in patch? How should this kind of loop be written to avoid this behavior?
UPDATE: This seems to be related to libc. I originally experienced it on glibc 2.23-0ubuntu3 from Ubuntu 16.04. @Barmar noted in the comments that it doesn't happen on macOS. After hearing this, I tried compiling the same program against musl 1.1.9-1, also from Ubuntu 16.04, and it didn't have this problem. On musl, the sequence of events has steps 12 through 14 removed, which is why it doesn't have the problem, but is otherwise the same (except for the irrelevant detail of readv in place of read).
Now, the question becomes: is glibc wrong in its behavior, or is patch wrong in assuming that its libc won't have this behavior?
I've managed to confirm that this is due to an unambiguous bug in glibc versions prior to 2.28 (commit 2cc7bad). Relevant quotes from the C standard:
The byte input/output functions - those functions described in this subclause that perform input/output: [...], fread
The byte input functions read characters from the stream as if by successive calls to the fgetc function.
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream.
(emphasis on "or" mine)
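To make the "as if by successive calls to fgetc" wording concrete, here's a rough sketch of what fread amounts to under that rule. The name fread_as_if is hypothetical and exists only for illustration. Real libc implementations read in bulk, but the observable behavior is supposed to match this.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of any libc: reads up to `count` bytes by
   calling fgetc repeatedly, mirroring the standard's "as if" wording. */
size_t fread_as_if(unsigned char *dst, size_t count, FILE *stream) {
    assert(dst != NULL && stream != NULL);
    size_t n = 0;
    while (n < count) {
        int c = fgetc(stream);
        if (c == EOF) {
            break;  /* end-of-file or error; fgetc has set the indicator */
        }
        dst[n++] = (unsigned char)c;
    }
    return n;
}
```

If the end-of-file indicator is already set when this is called, a conforming fgetc returns EOF on the first iteration, so the helper returns 0 without touching the underlying file. That's the behavior the standard also requires of fread.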
The following program demonstrates the bug with fgetc:
#include <stdio.h>
int main(void) {
    while(fgetc(stdin) != EOF) {
        puts("Read and discarded a character from stdin");
    }
    puts("fgetc(stdin) returned EOF");
    if(!feof(stdin)) {
        /* Included only for completeness. Doesn't occur in my testing. */
        puts("Standard violation! After fgetc returned EOF, the end-of-file indicator wasn't set");
        return 1;
    }
    if(fgetc(stdin) != EOF) {
        /* This happens with glibc in my testing. */
        puts("Standard violation! When fgetc was called with the end-of-file indicator set, it didn't return EOF");
        return 1;
    }
    /* This happens with musl in my testing. */
    puts("No standard violation detected");
    return 0;
}
To demonstrate the bug, compile and run the program, press Ctrl+D, and then press Enter.
The exact bug is that if the end-of-file stream indicator is set, but the stream is not at end-of-file, glibc's fgetc will return the next character from the stream, rather than EOF as the standard requires.
Since fread is defined in terms of fgetc, this is the cause of what I originally saw. It was previously reported as glibc bug #1190 and was fixed by commit 2cc7bad in February 2018, which landed in glibc 2.28 in August 2018.
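As for how to write the loop so it behaves the same on affected glibc versions: one option is to check feof and ferror after every fread instead of waiting for fread to return 0. The following is a minimal sketch, assuming it's acceptable to stop at the first end-of-file or error. The name drain_stream is hypothetical, and this is not how patch itself is written.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads `in` to the end and returns the total number of
   bytes read. It stops as soon as the end-of-file or error indicator is set,
   so it never calls fread on a stream whose indicator is already set. */
size_t drain_stream(FILE *in) {
    assert(in != NULL);
    char buf[4096];
    size_t total = 0;
    for (;;) {
        size_t n = fread(buf, 1, sizeof buf, in);
        total += n;  /* a short read can still carry data */
        if (feof(in) || ferror(in)) {
            break;   /* do not go back to the stream after EOF or an error */
        }
    }
    return total;
}
```

A short count from fread always means end-of-file or an error, so the loop terminates on any finite input. Because it stops once the indicator is set, it never makes the extra fread call that older glibc turned into another read on the terminal.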