Documentation and Docstrings: Be Kind to Your Future Self
We've all been there: opening a project we worked on six months ago and having absolutely no idea what x = y * 0.5 means. Documentation isn't just a chore; it's a love letter to your future self and your teammates.
The Hierarchy of Documentation
- Self-Documenting Code: Variable and function names so clear they don't need explanation.
- Docstrings: Structured explanations of what a function/class does, its inputs, and outputs.
- Comments: Explanations of why something is done (avoid stating what is done, the code shows that).
- External Docs (README/Wiki): High-level overview, installation, and usage guides.
Docstrings: The Standard
In Python, a docstring is a string literal that occurs as the first statement in a module, function, class, or method definition.
Bad:
def split(df, r):
# Splits data
n = int(len(df) * r)
return df[:n], df[n:]
Good (Google Style):
def split_dataset(data: pd.DataFrame, ratio: float) -> tuple[pd.DataFrame, pd.DataFrame]:
"""Splits the dataset into training and testing sets.
Args:
data: The input dataframe containing features and labels.
ratio: A float between 0 and 1 indicating the proportion of data
to include in the training set.
Returns:
A tuple containing (train_df, test_df).
Raises:
ValueError: If ratio is not between 0 and 1.
"""
if not 0 < ratio < 1:
raise ValueError("Ratio must be between 0 and 1.")
n = int(len(data) * ratio)
return data[:n], data[n:]
Why standard formats matter?
Tools like Sphinx or VS Code extensions can automatically generate nice HTML documentation or hover-text tooltips from standard docstrings (Google, NumPy, or reST style).
Comments: The Why, Not the What
Useless Comment:
x = x + 1 # Increment x
Useful Comment:
# We add a small epsilon to avoid division by zero errors in the log transform
x = x + 1e-9
Summary
- Name well:
learning_rateis better thanlr. - Docstring always: Every function needs a docstring explaining Args and Returns.
- Comment sparingly: Only explain complex logic or "why" decisions.
Good documentation turns a "script" into a "product."