PEP 8 & Linting: Keeping Your Code Clean
Code is read much more often than it is written. In a data science team, readability is paramount. If your colleague (or your future self) cannot read your code, the analysis is useless.
PEP 8: The Style Guide
Python has an official style guide called PEP 8. It describes how your code should look.
- Use 4 spaces for indentation.
- snake_case for functions and variables (my_function, not myFunction).
- CamelCase for classes (MyClass).
- Spaces around operators (x = 1, not x=1).
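The conventions above can be seen together in one short sketch. The names `RunningMean` and `add_value` are hypothetical and exist only for illustration.

```python
class RunningMean:  # classes: CamelCase
    """Track the mean of the values seen so far."""

    def __init__(self):
        self.total = 0.0  # 4 spaces per indentation level
        self.count = 0

    def add_value(self, new_value):  # functions and variables: snake_case
        self.total = self.total + new_value  # spaces around operators
        self.count += 1
        return self.total / self.count


running_mean = RunningMean()
for sample in (1, 2, 3):
    latest_mean = running_mean.add_value(sample)

print(latest_mean)
```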
You don't need to memorize the whole document. Tools can help you.
Linters: The Automated Code Reviewers
A linter analyzes your code for programming and stylistic errors without running it.
Common Linters
- Flake8: Checks for PEP 8 compliance and logic errors (like unused imports or undefined variables).
- Pylint: Very strict. It catches more errors but can be noisy.
- Ruff: The new kid on the block. Written in Rust, it is incredibly fast and combines features of Flake8, isort, and others.
Example of what a linter catches:
import pandas as pd  # Linter: imported but unused
x=1  # Linter: missing whitespace around operator
def MyFunc():  # Linter: function name should be lowercase
    print(x)
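To give a feel for how a linter can find problems like the unused import above without running the code, here is a minimal sketch built on Python's standard `ast` module. It is a simplification, not how Flake8, Pylint, or Ruff are actually implemented, and the function name `find_unused_imports` is hypothetical.

```python
import ast

SOURCE = """\
import pandas as pd
import os

print(os.getcwd())
"""


def find_unused_imports(source):
    """Return (line, name) pairs for imported names that are never referenced."""
    tree = ast.parse(source)  # parse only; the source is never executed
    imported = {}
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds the name "a"; "import a as b" binds "b"
                bound_name = alias.asname or alias.name.split(".")[0]
                imported[bound_name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(
        (line, name) for name, line in imported.items() if name not in used
    )


for line, name in find_unused_imports(SOURCE):
    print(f"line {line}: '{name}' imported but unused")
```

Because the source is only parsed, the check works even when pandas is not installed. Real linters handle many cases this sketch ignores, such as star imports and names used only inside strings.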
Formatters: The Auto-Fixers
While linters tell you what's wrong, formatters fix it for you automatically.
Black: The Uncompromising Code Formatter
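Black is one of the most widely used Python formatters. It makes the formatting decisions so you don't have to. Its output follows PEP 8 with a few deliberate choices of its own, such as a default line length of 88 characters.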
Before Black:
def very_long_function_name(parameter_one, parameter_two, parameter_three, parameter_four): print(parameter_one)
After Black:
def very_long_function_name(
    parameter_one, parameter_two, parameter_three, parameter_four
):
    print(parameter_one)
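A formatter changes layout only, never behavior. One way to convince yourself is to compare the syntax trees of the two versions above using the standard `ast` module. This is a sketch of the idea; the helper `same_structure` is hypothetical, although Black does run a comparable equivalence check of its own by default.

```python
import ast

BEFORE = (
    "def very_long_function_name(parameter_one, parameter_two, "
    "parameter_three, parameter_four): print(parameter_one)\n"
)

AFTER = """\
def very_long_function_name(
    parameter_one, parameter_two, parameter_three, parameter_four
):
    print(parameter_one)
"""


def same_structure(first, second):
    """Compare two sources by syntax tree, ignoring layout and line numbers."""
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


print(same_structure(BEFORE, AFTER))
```

`ast.dump` leaves out line and column information by default, so two sources that differ only in whitespace and line breaks produce the same dump.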
How to Set This Up
- Install the tools:
    pip install ruff black
- Run them:
    # Check for errors
    ruff check .
    # Format code
    black .
- VS Code Integration: Install the "Ruff" and "Black Formatter" extensions in VS Code. Enable "Format on Save" in your settings.
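If you would rather run both tools with a single command, for example before committing, a small wrapper script is one option. The sketch below assumes `ruff` and `black` are installed and on your PATH. The function names `build_commands` and `run_checks` are hypothetical, and `black --check` is used so the script reports files that need formatting without rewriting them.

```python
import shutil
import subprocess


def build_commands(target="."):
    """Return the lint and format-check commands for a target path."""
    return [
        ["ruff", "check", target],  # report lint violations
        ["black", "--check", target],  # report files Black would reformat
    ]


def run_checks(target="."):
    """Run each available tool; return True if every one exits with code 0."""
    all_passed = True
    for command in build_commands(target):
        if shutil.which(command[0]) is None:
            print(f"skipping {command[0]}: not installed")
            continue
        result = subprocess.run(command)
        all_passed = all_passed and result.returncode == 0
    return all_passed
```

Calling `run_checks()` from the project root returns `False` as soon as either tool reports a problem, which makes it easy to turn the result into a script exit code.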
Summary
- Style Guide: Follow PEP 8.
- Linter (Ruff/Flake8): Finds bugs and style violations.
- Formatter (Black): Automatically fixes style violations.
Adopting these tools removes the mental load of formatting, letting you focus on the logic and the data.