Sharing Results¶
You've trained a great model! But what happens next? Your teammate needs to use it, or you want to compare today's model with yesterday's. Let's make your results persistent and shareable.
The Disappearing Results Problem¶
After running your pipeline, where are your results?
# Train a model...
pipeline.execute()
# Great! But where is the model now?
# Can your teammate use it?
# Can you compare with yesterday's run?
Problems:
- Models only exist during execution
- No way to track metrics over time
- Can't share trained models with teammates
- Hard to compare different runs
The Solution: Persistent Storage¶
Let's save models and metrics that persist beyond execution:
from runnable import Pipeline, PythonTask, Catalog, metric, pickled
import pickle
def save_model_artifact(model_data):
"""Save trained model to a file."""
with open("trained_model.pkl", "wb") as f:
pickle.dump(model_data, f)
return model_data
PythonTask(
function=save_model_artifact,
name="save_model",
catalog=Catalog(put=["trained_model.pkl"]), # Store in catalog
returns=[pickled("model_data")]
)
Storing Model Artifacts¶
Save your trained models so they can be reused:
def save_model_artifact(model_data):
"""Save model to file for sharing."""
with open("trained_model.pkl", "wb") as f:
pickle.dump(model_data, f)
print(f"Model saved: trained_model.pkl")
return model_data
# Store the model file in catalog
PythonTask(
function=save_model_artifact,
name="save_model",
catalog=Catalog(put=["trained_model.pkl"])
)
What happens:
- The model is saved to trained_model.pkl in your working directory
- Runnable copies it to .catalog/ for permanent storage
- The file is now available for future runs or teammates
Tracking Metrics Over Time¶
Use metric() to track performance metrics:
import json

def save_evaluation_metrics(evaluation_results):
"""Save metrics for tracking."""
accuracy = evaluation_results['accuracy']
report = evaluation_results['classification_report']
# Save detailed report
with open("evaluation_report.json", "w") as f:
json.dump(report, f, indent=2)
# Return metrics for tracking
return {
"accuracy": accuracy,
"precision": report['weighted avg']['precision'],
"recall": report['weighted avg']['recall'],
"f1_score": report['weighted avg']['f1-score']
}
PythonTask(
function=save_evaluation_metrics,
name="save_metrics",
catalog=Catalog(put=["evaluation_report.json"]),
returns=[
metric("accuracy"), # Tracked as metrics
metric("precision"),
metric("recall"),
metric("f1_score")
]
)
Metrics are special:
- Automatically tracked in run logs
- Easy to compare across runs
- Can be visualized over time
- Help identify model improvements
Loading Saved Models¶
Your teammate (or you in a future run) can load the saved model:
def load_and_verify_model():
"""Load a previously saved model."""
with open("trained_model.pkl", "rb") as f:
model_data = pickle.load(f)
print(f"Model loaded successfully!")
return {"model_verified": True}
PythonTask(
function=load_and_verify_model,
name="verify_model",
catalog=Catalog(get=["trained_model.pkl"]) # Get the saved model
)
Complete Pipeline with Persistent Storage¶
Here's the full pipeline that saves and shares results:
def main():
"""Demonstrate persistent storage of models and metrics."""
print("=" * 50)
print("Chapter 6: Sharing Results")
print("=" * 50)
pipeline = Pipeline(steps=[
# Load and preprocess data
PythonTask(
function=load_data,
name="load_data",
returns=[pickled("df")]
),
PythonTask(
function=preprocess_data,
name="preprocess",
returns=[pickled("preprocessed_data")]
),
# Train model
PythonTask(
function=train_model,
name="train",
returns=[pickled("model_data")]
),
# Save model artifact to catalog
PythonTask(
function=save_model_artifact,
name="save_model",
catalog=Catalog(put=["trained_model.pkl"]), # Store model file
returns=[pickled("model_data")]
),
# Evaluate and save metrics
PythonTask(
function=evaluate_model,
name="evaluate",
returns=[pickled("evaluation_results")]
),
PythonTask(
function=save_evaluation_metrics,
name="save_metrics",
catalog=Catalog(put=["evaluation_report.json", "metrics_summary.json"]),
returns=[
metric("accuracy"),
metric("precision"),
metric("recall"),
metric("f1_score")
]
),
# Verify model can be loaded (simulating another run or teammate)
PythonTask(
function=load_and_verify_model,
name="verify_model",
catalog=Catalog(get=["trained_model.pkl"]), # Get the saved model
returns=[pickled("verification")]
)
])
pipeline.execute()
print("\n" + "=" * 50)
print("Persistent storage benefits:")
What You Get with Persistent Storage¶
📦 Shareable Model Artifacts¶
Models are stored in the catalog and can be shared:
# Your trained model is here
ls .catalog/*/trained_model.pkl
# Teammate can copy and use it
cp .catalog/run-id-123/trained_model.pkl shared-models/
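Outside of a pipeline, the artifact is just a pickle file, so a teammate can load it directly (a sketch; run-id-123 is a placeholder for the actual run id):
import pickle

# Load the cataloged model directly; replace run-id-123 with the real run id
with open(".catalog/run-id-123/trained_model.pkl", "rb") as f:
    model_data = pickle.load(f)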
📊 Metrics Tracking¶
All metrics are tracked in the run log:
{
"run_id": "happy-euler-0123",
"metrics": {
"accuracy": 0.9234,
"precision": 0.9156,
"recall": 0.9234,
"f1_score": 0.9189
}
}
📈 Performance History¶
Compare different runs:
# See all your runs
ls .run_log_store/
# Compare metrics from different runs
cat .run_log_store/run-1/run_log.json | grep accuracy
cat .run_log_store/run-2/run_log.json | grep accuracy
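If you'd rather compare runs programmatically, a small script can walk the run log store (a sketch that assumes the simplified run-log structure shown above; the actual log contains more detail):
import json
from pathlib import Path

# Print the tracked accuracy for every run in the run log store
for log_path in Path(".run_log_store").glob("*/run_log.json"):
    run_log = json.loads(log_path.read_text())
    accuracy = run_log.get("metrics", {}).get("accuracy")
    print(f"{log_path.parent.name}: accuracy={accuracy}")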
🤝 Team Collaboration¶
- Share trained models without retraining
- Compare your model with teammates' models
- Reuse preprocessing results (see the sketch below)
- Track the team's overall progress
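For example, a downstream task could pull a previously cataloged preprocessing output instead of recomputing it (a sketch using the same Catalog(get=...) pattern shown earlier; the file name preprocessed_data.pkl is illustrative):
def load_cached_preprocessing():
    """Reuse preprocessing output that a teammate already cataloged."""
    with open("preprocessed_data.pkl", "rb") as f:
        return {"preprocessed_data": pickle.load(f)}

PythonTask(
    function=load_cached_preprocessing,
    name="reuse_preprocessing",
    catalog=Catalog(get=["preprocessed_data.pkl"])  # Fetch instead of recompute
)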
Storage Locations¶
Runnable keeps everything organized:
.catalog/ # Persistent file storage
├── run-id-123/
│ ├── trained_model.pkl # Your model
│ ├── evaluation_report.json # Detailed metrics
│ └── metrics_summary.json # Quick summary
└── run-id-124/
└── trained_model.pkl # Next run's model
.run_log_store/ # Execution metadata
├── run-id-123/
│ └── run_log.json # Includes metrics
└── run-id-124/
└── run_log.json
Compare: Transient vs Persistent¶
Transient Results (Chapters 1-5):
- ❌ Results lost after execution
- ❌ Can't compare different runs
- ❌ Can't share with teammates
- ❌ Must retrain to reuse model
Persistent Results (Chapter 6):
- ✅ Models saved permanently
- ✅ Metrics tracked over time
- ✅ Easy to share with team
- ✅ Reuse without retraining
- ✅ Performance history available
Real-World Use Cases¶
Model Versioning¶
# Save each model with version info
def save_model_v2(model_data):
    with open("model_v2.pkl", "wb") as f:
        pickle.dump(model_data, f)
    return model_data

# In the corresponding PythonTask:
catalog=Catalog(put=["model_v2.pkl"])
A/B Testing¶
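One possible pattern (a sketch, not an official recipe; results_a and results_b would come from upstream evaluation tasks for the two variants): track a metric for each candidate so the run log records which one won.
def compare_variants(results_a, results_b):
    """Report both variants' accuracies so the run log captures the winner."""
    return {
        "accuracy_a": results_a["accuracy"],
        "accuracy_b": results_b["accuracy"],
    }

PythonTask(
    function=compare_variants,
    name="ab_test",
    returns=[metric("accuracy_a"), metric("accuracy_b")]
)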
Experiment Tracking¶
# Track different hyperparameters
returns=[
metric("accuracy"),
metric("n_estimators"), # Track hyperparameter used
metric("max_depth")
]
What's Next?¶
We have reproducible pipelines, flexible configuration, efficient data handling, and persistent results. But everything is running on your laptop. What about production?
Next chapter: We'll show how the same pipeline runs anywhere - your laptop, containers, or Kubernetes - without code changes.
Next: Running Anywhere - Same code, different environments