🔧 Pipeline Parameters & Environment

Configure Pipeline execution with parameters and environment variables, without changing code.

Parameter System

Pipelines share the same flexible parameter system as Jobs, with three layers of configuration precedence:

  1. Individual overrides: RUNNABLE_PRM_key="value" (highest priority)
  2. Parameters file set via the environment: RUNNABLE_PARAMETERS_FILE="config.yaml"
  3. Parameters file specified in code: pipeline.execute(parameters_file="config.yaml") (lowest priority)

For example, a minimal pipeline that specifies its parameters file in code:
from runnable import Pipeline, PythonTask

def process_data(batch_size=100, debug=False):
    print(f"Processing with batch_size={batch_size}, debug={debug}")
    return {"processed": True}

def main():
    pipeline = Pipeline(steps=[
        PythonTask(function=process_data, name="process")
    ])

    # Execute with parameter file
    pipeline.execute(parameters_file="config.yaml")
    return pipeline

if __name__ == "__main__":
    main()

Parameter File (config.yaml):

batch_size: 1000
debug: true

Environment Variable Overrides

Override any parameter at runtime:

# Override specific parameters
export RUNNABLE_PRM_batch_size=500
export RUNNABLE_PRM_debug=false

# Run pipeline - uses overridden values
uv run data_pipeline.py
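
These overrides can also be applied programmatically, for example from a wrapper script or a test. A minimal sketch, assuming the pipeline and config.yaml from above live in a data_pipeline.py module (the import name is illustrative):

import os

# RUNNABLE_PRM_* is the highest-priority layer, so these values win over
# batch_size: 1000 and debug: true from config.yaml.
os.environ["RUNNABLE_PRM_batch_size"] = "500"
os.environ["RUNNABLE_PRM_debug"] = "false"

from data_pipeline import main  # hypothetical module name

main()  # process_data should now see batch_size=500, debug=False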

Custom Run IDs

Control pipeline execution tracking with custom identifiers:

# Set custom run ID for tracking
export RUNNABLE_RUN_ID="data-pipeline-daily-run-2024-11-20"
uv run data_processing_pipeline.py

Benefits:

  • Easy identification in logs and run history
  • Consistent naming across pipeline executions
  • Better debugging when tracking specific pipeline runs
  • Integration with external systems using predictable IDs

Pipeline Run ID Examples

# Daily data processing
export RUNNABLE_RUN_ID="etl-daily-$(date +%Y-%m-%d)"
uv run daily_etl_pipeline.py

# Experiment tracking
export RUNNABLE_RUN_ID="experiment-feature-engineering-v2"
uv run ml_experiment_pipeline.py

# Environment-specific runs
export RUNNABLE_RUN_ID="staging-deployment-$(git rev-parse --short HEAD)"
uv run deployment_pipeline.py
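
The same IDs can be built in Python when the pipeline is launched from a wrapper script rather than a shell, as long as RUNNABLE_RUN_ID is set before execution. A sketch; the module name is illustrative:

import os
from datetime import date

# A date-stamped ID for daily runs, mirroring the shell examples above.
# (A git-based ID could be built the same way via subprocess + git rev-parse.)
os.environ["RUNNABLE_RUN_ID"] = f"etl-daily-{date.today():%Y-%m-%d}"

from daily_etl_pipeline import main  # hypothetical module name

main()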

Default vs Custom Run IDs

Without RUNNABLE_RUN_ID: an auto-generated name like courtly-easley-1719

With RUNNABLE_RUN_ID: your custom identifier, e.g. data-pipeline-daily-run-2024-11-20

Dynamic Parameter Files

Switch configurations without code changes:

# Development environment
export RUNNABLE_PARAMETERS_FILE="configs/dev.yaml"
uv run ml_pipeline.py

# Production environment
export RUNNABLE_PARAMETERS_FILE="configs/prod.yaml"
uv run ml_pipeline.py  # Same code, different config!
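
The file can also be chosen in code. A sketch keyed off a hypothetical ENVIRONMENT variable (an illustrative convention, not a runnable feature); note that a code-specified file is the lowest-priority layer, so RUNNABLE_PARAMETERS_FILE still wins when set:

import os

from runnable import Pipeline, PythonTask

def train_model(learning_rate=0.01, epochs=10):
    print(f"Training: learning_rate={learning_rate}, epochs={epochs}")

def main():
    # "ENVIRONMENT" is an illustrative convention for this sketch.
    env = os.environ.get("ENVIRONMENT", "dev")

    pipeline = Pipeline(steps=[
        PythonTask(function=train_model, name="train"),
    ])
    pipeline.execute(parameters_file=f"configs/{env}.yaml")
    return pipeline

if __name__ == "__main__":
    main()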

Common Pipeline Patterns

Environment-Specific Configurations

Development (dev.yaml):

data_source: "s3://dev-bucket/sample-data/"
batch_size: 100
debug: true
parallel_workers: 1

Production (prod.yaml):

data_source: "s3://prod-bucket/full-data/"
batch_size: 10000
debug: false
parallel_workers: 8
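
Both files can drive the same task unchanged, since parameter values are matched to the function's keyword arguments by name. A sketch with an illustrative task, assuming execute() falls back to RUNNABLE_PARAMETERS_FILE when no file is given in code, per the precedence above:

from runnable import Pipeline, PythonTask

def load_and_process(data_source="s3://dev-bucket/sample-data/",
                     batch_size=100, debug=False, parallel_workers=1):
    # Each argument is filled from the active parameters file, so
    # dev.yaml and prod.yaml run through identical code.
    print(f"Reading {data_source} with {parallel_workers} worker(s), "
          f"batch_size={batch_size}, debug={debug}")

def main():
    pipeline = Pipeline(steps=[
        PythonTask(function=load_and_process, name="load_and_process"),
    ])
    # Parameters come from RUNNABLE_PARAMETERS_FILE (configs/dev.yaml
    # or configs/prod.yaml) when no file is specified in code.
    pipeline.execute()
    return pipeline

if __name__ == "__main__":
    main()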

Multi-Stage Pipeline Configuration

# Configure entire pipeline execution
export RUNNABLE_PARAMETERS_FILE="configs/full-pipeline.yaml"
export RUNNABLE_RUN_ID="daily-ml-pipeline-$(date +%Y%m%d)"

# Override specific stages
export RUNNABLE_PRM_training_epochs=100
export RUNNABLE_PRM_validation_split=0.2

uv run ml_training_pipeline.py
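
To make the destination of those overrides concrete, here is a sketch of a two-step pipeline, in the style of the single-task example above, whose stages pick up validation_split and training_epochs by argument name:

from runnable import Pipeline, PythonTask

def prepare(validation_split=0.1):
    print(f"Holding out {validation_split:.0%} for validation")

def train(training_epochs=10):
    print(f"Training for {training_epochs} epochs")

def main():
    pipeline = Pipeline(steps=[
        PythonTask(function=prepare, name="prepare"),
        PythonTask(function=train, name="train"),
    ])
    # RUNNABLE_PRM_training_epochs and RUNNABLE_PRM_validation_split
    # override values from configs/full-pipeline.yaml at run time.
    pipeline.execute()
    return pipeline

if __name__ == "__main__":
    main()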

Best Practices

✅ Use Run IDs for Pipeline Tracking

# Predictable naming for scheduled runs
export RUNNABLE_RUN_ID="weekly-report-$(date +%Y-week-%U)"

# Git-based versioning for deployments
export RUNNABLE_RUN_ID="deploy-$(git rev-parse --short HEAD)"

# Feature branch testing
export RUNNABLE_RUN_ID="test-$(git branch --show-current)-$(date +%s)"

✅ Environment Variables for Deployment

# Production deployment values
export RUNNABLE_PRM_database_url="postgresql://prod:5432/warehouse"
export RUNNABLE_PRM_s3_bucket="company-prod-data"
export RUNNABLE_PRM_notification_webhook="https://alerts.company.com/pipeline"

✅ Layered Configuration Strategy

from runnable import Pipeline

def main():
    pipeline = Pipeline(steps=[...])

    # 1. Base configuration in code
    pipeline.execute(parameters_file="base_config.yaml")

    # 2. Environment-specific overrides via RUNNABLE_PARAMETERS_FILE
    # 3. Runtime tweaks via RUNNABLE_PRM_* variables
    # 4. Tracking via RUNNABLE_RUN_ID

    return pipeline

Shared Parameter System

Pipelines use the exact same parameter system as Jobs. Once you learn parameters for Jobs, you already know how to configure Pipelines!

What's Next?