
Why is a newline 2 bytes in Windows?

What is the reason for newline being 2 bytes on Windows? Isn't \n just one byte in ASCII?

talloaktrees asked Sep 13 '25 06:09



1 Answer

Historically, a line break consisted of two characters: U+000D Carriage Return (CR) and U+000A Line Feed (LF). (I'm using the Unicode names here because that's what we use nowadays; back then it would have been ASCII, or probably not even that.) Both were necessary on a teletype: the line feed advanced the paper by one line, while the carriage return moved the print head back to the start of the line. Compare that to a typewriter (a teletypewriter is really nothing else, just connected to a computer), where you turn the paper roll and push the carriage back; the carriage-return lever does both for you. So yes, \n is a single byte in ASCII, but the Windows line break is the two-byte sequence \r\n.
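To make the byte count concrete, here is a minimal sketch. Python is only my choice for illustration, and the variable names are made up for this example.

```python
# CR (U+000D) and LF (U+000A) are each a single byte in ASCII,
# so the CR+LF pair used on Windows takes two bytes.
CR = "\r".encode("ascii")
LF = "\n".encode("ascii")
CRLF = "\r\n".encode("ascii")

print(CR, len(CR))      # b'\r' 1
print(LF, len(LF))      # b'\n' 1
print(CRLF, len(CRLF))  # b'\r\n' 2
```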

Most network protocols retain the CR+LF sequence, by the way, so in a way it's Unix that's the oddball here. Once teletypes disappeared and were replaced by video terminals, and later by terminal emulators, there was no physical need for the two-character sequence anymore. It also makes checking for a line break harder in code, because you always have to compare two bytes. Thus the decision was made for Multics, and later Unix, to keep just one character (LF), which simplifies many things. C was later specified to convert between U+000A and the platform-native line break sequence when reading or writing streams in text mode.
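The text-mode conversion can be observed from a scripting language as well. The sketch below uses Python rather than C, because Python's `open()` lets you pick the on-disk line break explicitly with its `newline` parameter, so the result does not depend on which OS you run it on. The helper names and the file name `demo.txt` are hypothetical; this only illustrates the idea and is not how any C runtime implements it.

```python
import os
import tempfile


def write_text_lines(path, text, newline):
    # In text mode, newline="\r\n" translates every "\n" written into CR+LF,
    # which is the Windows convention.
    with open(path, "w", encoding="ascii", newline=newline) as f:
        f.write(text)


def read_raw(path):
    # Binary mode: no translation, we see the bytes actually stored.
    with open(path, "rb") as f:
        return f.read()


def read_text(path):
    # Text mode with the default newline=None: CR+LF, CR and LF
    # are all translated back to "\n" on input.
    with open(path, "r", encoding="ascii") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    write_text_lines(path, "one\ntwo\n", newline="\r\n")
    raw = read_raw(path)
    text = read_text(path)

print(raw)                    # b'one\r\ntwo\r\n'
print(text == "one\ntwo\n")   # True
```

The program only ever deals with "\n"; the two extra bytes exist only in the file.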

Windows, on the other hand, inherited CR+LF via CP/M and DOS, and there is no compelling reason to change that default. Backwards compatibility has always been a strong point for Microsoft, and they couldn't simply have broken it at some point in the past (that would have made for some very angry customers, I bet).

Mac OS (the old one) was another oddball, using just CR for a line break.
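If you want to see which of the three conventions a file uses, a rough sketch like the following works on the raw bytes. The function name `count_line_endings` is made up for this example, and it assumes an ASCII-compatible encoding where CR and LF are the single bytes 0x0D and 0x0A.

```python
def count_line_endings(data: bytes) -> dict:
    # Every CR+LF pair contains one CR and one LF, so subtract the pairs
    # to get the number of lone CRs (old Mac OS) and lone LFs (Unix).
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": cr, "LF": lf}


sample = b"windows\r\nunix\nold mac\rend"
print(count_line_endings(sample))  # {'CRLF': 1, 'CR': 1, 'LF': 1}
```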

Joey answered Sep 16 '25 07:09
