In the past, I often use plain print() to display some statistics during the training process and validation process. But it is not convenient to save the statistics in a file for later inspection. Once the program is finished, we do not have an overview of the training process. I am aware that there are packages such as tensorboardX and visdom, which are specifically designed for inspecting the various statistics during running. Right now, I would rather keep the program plain simple and usable. So I decided to give Python logging pacakge a try.

If you want to show messages on your terminal output and save to a file, you can use the following setting:

import logging

logging.basicConfig(
    level=logging.INFO,
    datefmt="%Y-%m-%d %H:%M:%S",
    format="%(asctime)s %(message)s",
    handlers=[
        logging.FileHandler("{}/{}".format(log_path, log_name)),
        logging.StreamHandler()
    ]
)

The datefmt parameter is used to format the time string in the logging message. With the above format, the time is shown as something like the following:

2018-10-15 09:44:49 (…other message…)

The time format string use the standard Python time format. You can find the full list of format directives here.

The format paramter is used to set the format of the output string, i.e., what to show in the logging message. we can show time, current file name, process name etc. A list of predefined attribute are (excerpted from the logging documentation):

%(name)s            Name of the logger (logging channel)
%(levelno)s         Numeric logging level for the message (DEBUG, INFO,
                    WARNING, ERROR, CRITICAL)
%(levelname)s       Text logging level for the message ("DEBUG", "INFO",
                    "WARNING", "ERROR", "CRITICAL")
%(pathname)s        Full pathname of the source file where the logging
                    call was issued (if available)
%(filename)s        Filename portion of pathname
%(module)s          Module (name portion of filename)
%(lineno)d          Source line number where the logging call was issued
                    (if available)
%(funcName)s        Function name
%(created)f         Time when the LogRecord was created (time.time()
                    return value)
%(asctime)s         Textual time when the LogRecord was created
%(msecs)d           Millisecond portion of the creation time
%(relativeCreated)d Time in milliseconds when the LogRecord was created,
                    relative to the time the logging module was loaded
                    (typically at application startup time)
%(thread)d          Thread ID (if available)
%(threadName)s      Thread name (if available)
%(process)d         Process ID (if available)
%(message)s         The result of record.getMessage(), computed just as
                    the record is emitted

The handler parameter is used to set up where logging messages go. In the above example, we use a FileHandler to save the logging message in the disk file. We use Streamhandler to display the messages in the terminal.

After configuring logging, you need a logger:

logger = logging.getLogger(__name__)

In your program, if you need to log something, use logger.info(). A example usage is shown below:

logger.info("Epoch %d, Accuracy: %f, Loss: %f", epoch, accuracy, loss)

The logging module relies on the former Python string formatting style. If you use the new .format() style string formatting. You will see a pylint warning complaining:

pylint: logging-not-lazy / Specify string format arguments as logging function parameters

The above formatting is also slightly different from the old Python formatting style. You do not need to write % between strings and the values. You can just append a list of needed values as the argument for the info method.

References