Getting Started Tutorial
Transform a simple machine learning function into a production-ready pipeline, solving real challenges along the way.
What You'll Build
By the end of this tutorial, you'll have:
- ✅ Reproducible ML pipeline: Automatic tracking of all runs and results
- ✅ Configurable experiments: Change parameters without touching code
- ✅ Multi-step workflow: Data loading → preprocessing → training → evaluation
- ✅ Large dataset handling: Efficient storage and retrieval of data artifacts
- ✅ Shareable results: Model artifacts and metrics that persist between runs
- ✅ Deployment ready: Same pipeline runs on laptop, containers, or Kubernetes
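To make the multi-step workflow above concrete, here is a minimal plain-Python sketch of the four stages (data loading, preprocessing, training, evaluation). It deliberately uses only the standard library and no pipeline framework; every function name, the toy dataset, and the threshold "model" are hypothetical illustrations, not part of runnable or of the tutorial's actual code.

```python
# Plain-Python sketch of a four-step workflow. All names are hypothetical
# and chosen for illustration only; no pipeline framework is involved.
from statistics import mean


def load_data():
    """Return (feature, label) pairs; a toy stand-in for a real dataset."""
    return [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]


def preprocess(rows):
    """Scale the single feature to the [0, 1] range."""
    values = [x for x, _ in rows]
    low, high = min(values), max(values)
    return [((x - low) / (high - low), y) for x, y in rows]


def train(rows):
    """Fit a one-parameter model: the midpoint between the two class means."""
    mean_0 = mean(x for x, y in rows if y == 0)
    mean_1 = mean(x for x, y in rows if y == 1)
    return (mean_0 + mean_1) / 2


def evaluate(threshold, rows):
    """Return the accuracy of the threshold classifier on the given rows.

    For brevity this scores the training rows; a real workflow would
    evaluate on held-out data.
    """
    correct = sum((x > threshold) == bool(y) for x, y in rows)
    return correct / len(rows)


def run_workflow():
    """Chain the four steps and collect the results in a dict."""
    rows = preprocess(load_data())
    threshold = train(rows)
    return {"threshold": threshold, "accuracy": evaluate(threshold, rows)}


if __name__ == "__main__":
    print(run_workflow())
```

Each stage is a separate function with explicit inputs and outputs, which is the shape that makes it possible to later track, configure, and run the steps independently.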
The Journey
Each chapter tackles a real problem you'll face moving from "works on my laptop" to production:
- The Starting Point - A typical ML function with common problems
- Making It Reproducible - Track everything automatically
- Adding Flexibility - Configure without code changes
- Connecting the Workflow - Multi-step ML pipeline
- Handling Large Datasets - Efficient data management
- Sharing Results - Persistent model artifacts and metrics
- Running Anywhere - Same code, different environments
Prerequisites
- Basic Python knowledge
- Familiarity with scikit-learn (we'll use simple examples)
- Python environment with runnable installed:
pip install "runnable[examples]"
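After installing, you may want to confirm that the packages are visible to your Python environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the assumption that scikit-learn comes in through the `examples` extra is ours, so install it separately if it is reported as missing.

```python
# Quick environment check using only the standard library.
# `installed_version` is a hypothetical helper name for this sketch.
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as used on the pip command line.
    for name in ("runnable", "scikit-learn"):
        print(name, installed_version(name) or "not installed")
```

If either line prints "not installed", check that you are using the same interpreter that `pip` installed into, for example by running `python -m pip install` instead of a bare `pip`.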
Time Investment: ~30-45 minutes total, designed for step-by-step learning
Ready to start? → The Starting Point