Design Patterns: Factory and Strategy
Design patterns are standard solutions to common problems in software design. For data scientists, two patterns are particularly useful for managing experiments: Factory and Strategy.
The Problem: The "If-Else" Hell
We've all written code like this:
if model_name == 'rf':
    model = RandomForest()
elif model_name == 'xgb':
    model = XGBoost()
elif model_name == 'svm':
    model = SVC()
# ... 10 more lines ...
This is hard to maintain. Adding a new model requires editing the main training logic.
1. The Strategy Pattern
The Strategy pattern defines a family of algorithms, encapsulates each one, and makes them interchangeable.
In Python this is easy: functions are first-class objects, and duck typing means any object that exposes the agreed interface (e.g., fit and predict) can be swapped in for another. Scikit-Learn estimators already share this interface, so they act as ready-made strategies!
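To make the idea concrete, here is a minimal sketch of two interchangeable strategies. The names Model, MeanBaseline, LastValueBaseline, and run_experiment are hypothetical and exist only for this illustration; real estimators would take their place.

```python
from typing import List, Protocol, Sequence


class Model(Protocol):
    """The shared interface every strategy is assumed to provide."""

    def fit(self, X: Sequence[float], y: Sequence[float]) -> "Model": ...

    def predict(self, X: Sequence[float]) -> List[float]: ...


class MeanBaseline:
    """Hypothetical strategy: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.value_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class LastValueBaseline:
    """Hypothetical strategy: always predicts the last training target."""

    def fit(self, X, y):
        self.value_ = y[-1]
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


def run_experiment(model: Model, X_train, y_train, X_test):
    """Calling code depends only on fit/predict, never on a concrete class."""
    return model.fit(X_train, y_train).predict(X_test)


X_train, y_train, X_test = [1, 2, 3], [10.0, 20.0, 60.0], [4, 5]
print(run_experiment(MeanBaseline(), X_train, y_train, X_test))       # [30.0, 30.0]
print(run_experiment(LastValueBaseline(), X_train, y_train, X_test))  # [60.0, 60.0]
```

Swapping the strategy changes the result, while run_experiment stays exactly the same.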
2. The Factory Pattern
The Factory pattern creates objects without specifying the exact class of object that will be created.
# model_factory.py
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier


def get_model(config):
    """Factory function to instantiate models."""
    model_type = config.get('type')
    params = config.get('params', {})  # keyword arguments for the constructor

    if model_type == 'random_forest':
        return RandomForestClassifier(**params)
    elif model_type == 'xgboost':
        return XGBClassifier(**params)
    else:
        raise ValueError(f"Unknown model type: {model_type}")
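The factory above still contains an if/elif chain, just a smaller one. One possible variation, sketched here under the assumption that a dictionary lookup suits your project, is to register classes by name so that adding a model never edits get_model itself. MODEL_REGISTRY, register_model, and ConstantModel are hypothetical names used only for this example.

```python
MODEL_REGISTRY = {}


def register_model(name):
    """Decorator that records a model class under a config-friendly name."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register_model("constant")
class ConstantModel:
    """Hypothetical stand-in for a real estimator."""

    def __init__(self, value=0.0):
        self.value = value

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [self.value for _ in X]


def get_model(config):
    """Look the class up by name instead of walking an if/elif chain."""
    model_type = config.get("type")
    params = config.get("params", {})
    try:
        model_cls = MODEL_REGISTRY[model_type]
    except KeyError:
        raise ValueError(f"Unknown model type: {model_type}") from None
    return model_cls(**params)


model = get_model({"type": "constant", "params": {"value": 2.5}})
print(model.fit([1], [1]).predict([0, 0]))  # [2.5, 2.5]
```

A new model then only needs the @register_model decorator on its class. The trade-off is that the module defining the class must be imported somewhere before get_model is called, otherwise the registry will not contain it.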
Putting it Together
Now your main script is clean and agnostic to the specific model details.
# main.py
import yaml
from model_factory import get_model

# Load config (Configuration Management!)
with open("config.yaml") as f:
    config = yaml.safe_load(f)

# Get Strategy from Factory
model = get_model(config['model'])

# Execute Strategy (Polymorphism)
# X_train and y_train are assumed to be loaded earlier in the script.
model.fit(X_train, y_train)
Why use this?
- Decoupling: Your training loop doesn't know (or care) if it's training a Neural Net or a Logistic Regression. It just calls .fit().
- Scalability: You can add new models to the factory without touching the training loop code (Open/Closed Principle).
- Configurability: You can drive your entire pipeline from a YAML file.
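As a rough illustration of these three points together, the sketch below runs several experiments from plain dictionaries shaped like the mapping a YAML file might load into. Everything here is an assumption made for the example: the config layout, the toy MeanModel and MedianModel classes, and the run_all helper are hypothetical stand-ins, not part of any library.

```python
import statistics


class MeanModel:
    """Toy model: predicts the mean of the training targets."""

    def fit(self, X, y):
        self.value_ = statistics.mean(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class MedianModel:
    """Toy model: predicts the median of the training targets."""

    def fit(self, X, y):
        self.value_ = statistics.median(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


MODELS = {"mean": MeanModel, "median": MedianModel}


def get_model(config):
    """Minimal factory: build a model from its 'type' and optional 'params'."""
    return MODELS[config["type"]](**config.get("params", {}))


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def run_all(experiments, X_train, y_train, X_test, y_test):
    """The loop never names a concrete model class; the config decides."""
    results = {}
    for exp in experiments:
        model = get_model(exp["model"]).fit(X_train, y_train)
        results[exp["name"]] = mean_absolute_error(y_test, model.predict(X_test))
    return results


# Assumed shape of the loaded config: a list of named experiments.
experiments = [
    {"name": "baseline_mean", "model": {"type": "mean"}},
    {"name": "baseline_median", "model": {"type": "median"}},
]

print(run_all(experiments, [1, 2, 3], [1.0, 2.0, 9.0], [4, 5], [2.0, 3.0]))
# {'baseline_mean': 1.5, 'baseline_median': 0.5}
```

Adding a third experiment is a config change only: append another entry to the list and the loop picks it up.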
Design patterns give you a vocabulary to discuss and structure complex systems.