Logging vs. Printing
When you're hacking away in a Jupyter Notebook, print() is your best friend. But when you move to production scripts, pipelines, or APIs, print() becomes a liability.
The Limitations of Print
- No Context: print("Error") tells you nothing about when or where it happened.
- No Levels: You can't easily turn off "debug" prints while keeping "error" prints.
- No Destinations: By default, print only goes to standard output (the console). You can't easily send it to a file, Slack, or a monitoring tool.
Enter the logging Module
Python's built-in logging module solves all these problems.
Basic Usage
import logging
# Configure the logger
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s'
)
logging.debug("This is for detailed debugging info") # Won't show if level is INFO
logging.info("Training started")
logging.warning("Disk space low")
logging.error("Database connection failed")
logging.critical("System is down!")
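Output (first lines; the DEBUG message is suppressed, and the ERROR and CRITICAL messages follow in the same format):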
2026-01-05 10:00:00,123 - INFO - Training started
2026-01-05 10:00:05,456 - WARNING - Disk space low
Why it's better for Data Science
1. Control Verbosity
You can leave logging.debug("Shape: " + str(df.shape)) in your code. When running in production, set the level to WARNING, and your logs stay clean. When debugging a crash, set the level to DEBUG via an environment variable, and see all the details without changing code.
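One way to wire this up is sketched below. The variable name LOG_LEVEL and the helper resolve_level are assumptions made for illustration, not part of the logging module; the sketch falls back to WARNING when the variable is unset or unrecognized.

```python
import logging
import os


def resolve_level(default: str = "WARNING") -> int:
    """Map the LOG_LEVEL environment variable (a hypothetical name) to a logging level."""
    name = os.environ.get("LOG_LEVEL", default).upper()
    level = logging.getLevelName(name)
    # getLevelName returns an int for known level names and a string otherwise.
    return level if isinstance(level, int) else logging.WARNING


logging.basicConfig(
    level=resolve_level(),
    format="%(asctime)s - %(levelname)s - %(message)s",
)

# Passing arguments separately defers string formatting until the message is emitted.
logging.debug("Shape: %s", (1000, 12))  # shown only when LOG_LEVEL=DEBUG
logging.warning("Disk space low")
```

Running the script as LOG_LEVEL=DEBUG python script.py (script.py being a placeholder name) would then show the debug line without any code change.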
2. File Logging
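You can easily write logs to a file to audit long-running training jobs. Note that basicConfig does nothing if the root logger already has handlers, so call it once at startup (or pass force=True to replace an earlier configuration).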
logging.basicConfig(filename='training.log', level=logging.INFO)
Now you can close your terminal, come back tomorrow, and check training.log to see exactly what happened step-by-step.
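If you also want to keep seeing messages in the terminal while the job runs, one option is to attach both a file handler and a console handler to the same logger. This is a minimal sketch; the logger name "training" and the epoch loop are placeholders.

```python
import logging

logger = logging.getLogger("training")  # hypothetical logger name
logger.setLevel(logging.INFO)

formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")

# Append every record to training.log.
file_handler = logging.FileHandler("training.log")
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)

# Also echo records to the console (stderr by default).
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
logger.addHandler(console_handler)

for epoch in range(1, 4):  # stand-in for a real training loop
    logger.info("Epoch %d finished", epoch)
```

Each handler can also be given its own level with setLevel, for example INFO in the file and WARNING on the console.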
3. Structured Logging
For advanced use cases, you can output logs in JSON format, which can be ingested by tools like Datadog or Splunk for fancy dashboards.
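A rough sketch of the idea, using only the standard library: a custom Formatter that serializes each record as one JSON object per line. The class name JsonFormatter and the chosen field names are assumptions; ingestion tools may expect different keys, and dedicated third-party libraries exist for this.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that renders each record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include the traceback text when the record carries exception info.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("pipeline")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Loaded %d rows", 1000)
```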
Best Practices
- Use __name__: Instantiate loggers with logger = logging.getLogger(__name__). This allows you to filter logs by module.
- Don't print exceptions: Use logger.exception("Message") inside except blocks to automatically include the full stack trace.
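Both practices together might look like the sketch below. The function safe_ratio is a made-up example; the format string adds %(name)s so the module name appears in each line.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)

# One logger per module, named after the module itself.
logger = logging.getLogger(__name__)


def safe_ratio(numerator, denominator):
    """Hypothetical helper: return the ratio, or None if the division fails."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # logger.exception logs at ERROR level and appends the current traceback.
        logger.exception("Could not compute ratio for %s / %s", numerator, denominator)
        return None


safe_ratio(1, 0)
```

Note that logger.exception is meant to be called from inside an exception handler, since it relies on the exception currently being handled.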
Stop printing. Start logging.