Every programmer knows about newline characters, but may not be familiar with the details. In this post, I want to share what I have learned about newline handling in various cases.
Newline characters on different platforms
For historical reasons, different platforms use different characters to signify a new line. On Windows, <CR><LF> (bytes 0x0D 0x0A) is used to represent a newline. On Linux, <LF> (byte 0x0A) is used. On older Mac systems, <CR> (byte 0x0D) is used.
<CR> and <LF> date back to the time when typewriters were used for printing text on paper. <CR> stands for carriage return, which means moving the carriage to its left-most position. <LF> stands for line feed, which means moving the paper up a little so that you can type on a new line. These two actions combined start a new line ready for typing.
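The byte values above can be checked directly in Python. The following is a minimal sketch; `detect_newline_style` is a hypothetical helper written for this post, not a standard library function, and it only counts the three conventions described above.

```python
# Byte values of the two control characters discussed above.
CR = b"\r"  # carriage return, 0x0D
LF = b"\n"  # line feed, 0x0A


def detect_newline_style(data: bytes) -> str:
    """Guess the newline convention of raw bytes (hypothetical helper).

    Returns "windows", "unix", "old-mac", "mixed", or "none".
    """
    crlf = data.count(CR + LF)
    # Count bare CR and bare LF by subtracting those that belong to a CRLF pair.
    lone_cr = data.count(CR) - crlf
    lone_lf = data.count(LF) - crlf
    found = [
        name
        for name, count in (("windows", crlf), ("unix", lone_lf), ("old-mac", lone_cr))
        if count
    ]
    if not found:
        return "none"
    return found[0] if len(found) == 1 else "mixed"


print(detect_newline_style(b"a\r\nb\r\n"))  # windows
print(detect_newline_style(b"a\nb\n"))      # unix
```

The data should be read in binary mode (`"rb"`) before being passed in, so that Python does not translate any line endings first.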
Newline handling in Python
Python 2 and Python 3 handle newlines differently. In Python 2, there is a universal newline mode, which means that no matter what line ending a file uses, it will be translated to \n when the file is read with the U mode specifier.
In Python 3, things have changed. The old U mode specifier has been deprecated (and was removed in Python 3.11) in favor of the newline parameter of the open() function. According to the documentation:
newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:
- When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated.
- When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place.
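The reading behavior described in the quote can be observed with a small experiment. This is only a sketch: the temporary file and its contents are made up for illustration, and the file is written in binary mode so that the bytes on disk are exactly what the example expects.

```python
import os
import tempfile

# Raw bytes with one line ending of each kind: CRLF, CR, and LF.
raw = b"one\r\ntwo\rthree\n"

# Write in binary mode so that Python performs no newline translation.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # newline=None (the default): every line ending is translated to "\n".
    with open(path, "r", newline=None, encoding="ascii") as f:
        translated = f.read()

    # newline="": lines are still recognized, but endings are left untouched.
    with open(path, "r", newline="", encoding="ascii") as f:
        untranslated = f.read()
        f.seek(0)
        lines = f.readlines()
finally:
    os.remove(path)

print(repr(translated))    # 'one\ntwo\nthree\n'
print(repr(untranslated))  # 'one\r\ntwo\rthree\n'
print(lines)               # ['one\r\n', 'two\r', 'three\n']
```

Passing `newline=""` is therefore a reasonable choice when you need to know which line endings a file actually contains.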
When reading text files in Python 3, newline is None by default, which means that whatever line ending the file uses will be quietly replaced by \n. If you are not aware of this behavior, you may get into trouble. For example, suppose you read a file with \r\n line endings on Windows and want to split the text into lines. If you use the following snippet:
    import os

    with open("some_file.txt", "r") as f:
        text = f.read()

    # Pitfall: on Windows, os.linesep is "\r\n", which no longer occurs in text.
    lines = text.split(os.linesep)
you will not be able to split the text into lines. That is because os.linesep is \r\n on Windows, but Python has silently translated each \r\n in the file to \n.
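One way to avoid this pitfall is to split with `str.splitlines()` (or on `"\n"`) rather than on `os.linesep`. The sketch below uses an in-memory string that stands in for the text Python would return, and hard-codes the Windows separator so that it behaves the same on every platform; both are assumptions made for illustration.

```python
# Text as Python returns it after reading a CRLF file in default text mode:
# the "\r\n" sequences have already been translated to "\n".
text = "first\nsecond\nthird\n"

# On Windows, os.linesep is "\r\n"; it is hard-coded here so that the
# example gives the same result on every platform.
windows_linesep = "\r\n"

# Splitting on "\r\n" finds no separator, so the text stays in one piece.
broken = text.split(windows_linesep)

# str.splitlines() splits on "\n", "\r", and "\r\n" (among other line
# boundaries) and drops the line endings.
lines = text.splitlines()

print(broken)  # ['first\nsecond\nthird\n']
print(lines)   # ['first', 'second', 'third']
```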
When writing files, you should also be aware that, by default, \n will be translated to the platform-dependent line ending.
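The effect of the newline argument on writing can be checked by reading the file back in binary mode. This is a minimal sketch; `written_bytes` is a hypothetical helper introduced only for this illustration.

```python
import os
import tempfile


def written_bytes(text: str, newline) -> bytes:
    """Write text with the given newline argument and return the raw bytes.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline=newline, encoding="ascii") as f:
            f.write(text)
        # Read back in binary mode to see exactly what landed on disk.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


text = "a\nb\n"

default = written_bytes(text, None)   # "\n" becomes os.linesep
as_is = written_bytes(text, "")       # no translation
forced = written_bytes(text, "\r\n")  # "\n" becomes "\r\n" on any platform

print(as_is)   # b'a\nb\n'
print(forced)  # b'a\r\nb\r\n'
```

If you need the same bytes on every platform, passing `newline=""` or an explicit line ending is safer than relying on the default.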
Newline handling in different editors
When reading a file into the buffer, Vim will automatically detect the file format [2]. Vim then replaces the platform-dependent newline characters with a special marker for the end of each line. When writing the buffer content back to the file, Vim writes the actual newline characters based on the detected file format.
For example, if you open a file with Windows-style line endings, Vim will replace each <CR><LF> with its own newline marker. If you try to search for these two characters by their byte codes, you will find nothing. Neither can you find <CR> characters using \r in a Windows file in Vim (suppose the file format is detected as dos). When searching in Vim, \n is used to specify the end of a line, no matter what the actual newline character is for this file. So you can search for the line end with \n.
How do I show the <CR> characters in Vim then?
You can open a Windows file in Vim and use :e ++ff=unix [3] to force Vim to treat it as a Unix file. Vim will then treat only the \n characters as newlines and hide them from the buffer text. The \r characters in the file will now be treated as normal characters and shown as ^M, so you can see them.
You can also press <Ctrl-V> and then <Enter> to type a literal carriage return character (displayed as ^M), which you can then use in a search pattern.
A pitfall in searching and replacing newlines
In Vim, \n represents a newline only in the search pattern. To insert a newline in the replacement, use \r instead. This seems counterintuitive, but that is how Vim works.
According to discussions, Sublime Text also converts platform-dependent newlines to \n in memory. When writing to files, it writes newlines based on the detected file type (Windows, Unix or Mac).
Notepad++ is also a popular code editor. It can detect your line endings, but it will not replace the newline with \n. To show the newline characters in a file, go to View --> Show Symbol and toggle Show End of Line.
Conversion between different file formats?
In Vim, you can use :set ff=<Format> to convert the current file to the desired format. <Format> can be unix, dos or mac.
In Sublime Text, just choose the desired format from the bottom right status bar.
In Notepad++, go to Edit --> EOL Conversion and choose the desired file format.
There are also tools such as dos2unix and unix2dos which convert between different file formats.
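The same kind of conversion can also be sketched in a few lines of Python. `convert_newlines` is a hypothetical helper, not how unix2dos or any editor is actually implemented, and it assumes an ASCII-compatible encoding such as UTF-8 (it would not be correct for UTF-16 data).

```python
def convert_newlines(data: bytes, target: bytes = b"\n") -> bytes:
    """Convert every CRLF, CR, or LF in data to target (hypothetical helper)."""
    # Normalize to LF first. CRLF must be handled before a lone CR so that
    # one CRLF pair does not turn into two newlines.
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    if target == b"\n":
        return normalized
    return normalized.replace(b"\n", target)


dos = b"line 1\r\nline 2\r\n"
unix = convert_newlines(dos, b"\n")     # Windows-style to Unix-style
back = convert_newlines(unix, b"\r\n")  # Unix-style back to Windows-style

print(unix)  # b'line 1\nline 2\n'
print(back)  # b'line 1\r\nline 2\r\n'
```

To convert a real file, read it with `open(path, "rb")` and write the result with `open(path, "wb")`, so that Python applies no newline translation of its own.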
[2] In Vim, use :h file-formats for more information about how Vim detects the file format and reads files.
[3] Use :h ++ff to find more information about what this command means.
License CC BY-NC-ND 4.0