Skip to main content
  1. Posts/

print message not shown in nohup.out?

··686 words·4 mins·
Python Linux
Table of Contents

When starting a long-running Python program, I often run it under nohup so I can redirect all the output to nohup.out for easier inspection. I use nohup python test.py& to run the process in the background. Then I can run tail -f nohup.out to monitor the output message from this program.

One strange issue I met is that all messages from the print() function are missing. To reproduce this issue, run the following test.py with command nohup python test.py&:

import time

n = 20
for i in range(n):
    time.sleep(0.5)
    print(f"this is loop {i}")

Then use tail -f nohup.out to check the output. Notice that there is no output for some time. Then, all of a sudden, all the output is printed when the program exits.

However, if we directly run this program (python test.py), to our expectations, the messages are indeed shown one at a time.

Different buffering behaviors of sys.stdout and sys.stderr
#

This has something to do with the stream buffering behavior of Python. print() function in Python will print a message to the sys.stdout stream. When we print directly to the console, in this case, the stdout stream is used interactively, it will be line-buffered, i.e., the message will be shown in the terminal once a newline is met.

If we use nohup, stdout is redirected to a file (nohup.out), it will be block-buffered in this case, i.e., only when the size of the output reaches a certain limit, will they be put into the destination (nohup.out file). So we can not see each print message in a timely manner. When the program exits, all the output is flushed to nohup.out, that is when we see those messages.

This behavior of sys.stdout and sys.stderr is documented here:

When interactive, stdout and stderr streams are line-buffered. Otherwise, they are block-buffered like regular text files.

This explains why the messages is not shown timely if we redirect stdout to nohup.out, since stdout is block buffered in this case. Block-buffered means that Python stores the output message in a buffer, and only when the buffer reaches a certain size and can not accommodate the incoming message, the message in this buffer will be flushed and reach its destination.

If we use sufficiently large message, it will also be shown immediately since the output buffer is full. It seems that buffer size for print is 81921. To verify this, we modify test.py a little bit:

import time

n = 20
size = 2048
sleep_time = 1

for i in range(n):
    time.sleep(sleep_time)
    print(f"{i}" + "a" * size)

If we run the above script with nohup python test.py & and then observe the output via tail -f nohub.out, we will find that it will print 4 lines at a time, i.e., lines starting with 0, 1, 2, and 3 will be printed out first, then lines starting with 4, 5, 6, and 7, etc.

For stderr, users may want to see the error messages immediately when they are produced even if it is redirected. In Python 3.9, the behavior of stderr is changed. The new doc is:

When interactive, the stdout stream is line-buffered. Otherwise, it is block-buffered like regular text files. The stderr stream is line-buffered in both cases.

Thus, after Python 3.9, stderr will always be line-buffered, no matter how the error message is displayed, i.e., on the terminal or redirected to a file.

Show the message without buffering
#

To show the printed message immediately, we may run Python in unbuffered mode with the -u flag:

-u : force the stdout and stderr streams to be unbuffered

nohup python -u test.py

Another way is to add flush=True to print() function so that output will be flushed to its destination forcibly.

print(f"this is loop {i}", flush=True)

References
#

Related

Retry for Google Cloud Client
·197 words·1 min
Python GCP
Make Python logging Work in GCP
·570 words·3 mins
Python Logging GCP
Speed up document indexing in Elasticsearch via bulk indexing
·355 words·2 mins
Python Elasticsearch