# 🎯 Jobs vs Pipelines: When to Use Which?
Both jobs and pipelines run your functions. The difference is intent.
🎯 Jobs: "Run this once"¶
Perfect for standalone tasks:
```python
from runnable import PythonJob


def analyze_sales_data():
    # Load data, run analysis, generate report
    return "Analysis complete!"


def main():
    # Job: Just run it
    job = PythonJob(function=analyze_sales_data)
    job.execute()
    return job  # REQUIRED: Always return the job object


if __name__ == "__main__":
    main()
```
See complete runnable code
"""
You can execute this pipeline by:
python examples/01-tasks/python_tasks.py
The stdout of "Hello World!" would be captured as execution
log and stored in the catalog.
An example of the catalog structure:
.catalog
└── baked-heyrovsky-0602
└── hello.execution.log
2 directories, 1 file
The hello.execution.log has the captured stdout of "Hello World!".
"""
from examples.common.functions import hello
from runnable import PythonJob
def main():
job = PythonJob(function=hello)
job.execute()
return job
if __name__ == "__main__":
main()
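The imported `hello` helper isn't shown above; based on the docstring, it presumably just prints the greeting that ends up in `hello.execution.log`. A minimal sketch, assuming that behaviour:

```python
# Plausible stand-in for examples.common.functions.hello (an assumption, not
# the actual source): it only needs to print, since stdout is what the catalog captures.
def hello():
    print("Hello World!")
```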
Try it now: `python examples/01-tasks/python_tasks.py`
### When to use jobs
- One-off analysis: "Analyze this dataset"
- Testing functions: "Does my code work?"
- Standalone reports: "Generate monthly summary"
- Data exploration: "What's in this file?"
## 🔗 Pipelines: "This is step X of many"
Perfect for multi-step workflows:
```python
from runnable import Pipeline, PythonTask


def load_data():
    return {"users": 1000, "sales": 50000}


def clean_data(raw_data):
    return {"clean_users": raw_data["users"], "clean_sales": raw_data["sales"]}


def train_model(cleaned_data):
    return f"Model trained on {cleaned_data['clean_users']} users"


def main():
    # Pipeline: Chain them together
    pipeline = Pipeline(
        steps=[
            PythonTask(function=load_data, returns=["raw_data"]),
            PythonTask(function=clean_data, returns=["cleaned_data"]),
            PythonTask(function=train_model, returns=["model"]),
        ]
    )
    pipeline.execute()
    return pipeline  # REQUIRED: Always return the pipeline object


if __name__ == "__main__":
    main()
```
See complete runnable code
"""
The below example shows how to set/get parameters in python
tasks of the pipeline.
The function, set_parameter, returns
- JSON serializable types
- pydantic models
- pandas dataframe, any "object" type
pydantic models are implicitly handled by runnable
but "object" types should be marked as "pickled".
Use pickled even for python data types is advised for
reasonably large collections.
Run the below example as:
python examples/03-parameters/passing_parameters_python.py
"""
from examples.common.functions import read_parameter, write_parameter
from runnable import Pipeline, PythonTask, metric, pickled
def main():
write_parameters = PythonTask(
function=write_parameter,
returns=[
pickled("df"),
"integer",
"floater",
"stringer",
"pydantic_param",
metric("score"),
],
name="set_parameter",
)
read_parameters = PythonTask(
function=read_parameter,
terminate_with_success=True,
name="get_parameters",
)
pipeline = Pipeline(
steps=[write_parameters, read_parameters],
)
_ = pipeline.execute()
return pipeline
if __name__ == "__main__":
main()
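The helpers `write_parameter` and `read_parameter` are imported from `examples.common.functions` and not shown here. A minimal sketch of what they might look like, assuming multiple return values are bound, in order, to the names declared in `returns` and handed to downstream tasks by name (the pydantic model and dataframe contents below are illustrative):

```python
import pandas as pd
from pydantic import BaseModel


class EggsModel(BaseModel):
    # Hypothetical pydantic parameter used only for this sketch.
    ham: str


def write_parameter():
    # Values map, in order, to: pickled("df"), "integer", "floater",
    # "stringer", "pydantic_param", metric("score").
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
    return df, 1, 3.14, "hello", EggsModel(ham="eggs"), 0.9


def read_parameter(df, integer, floater, stringer, pydantic_param, score):
    # Downstream tasks receive earlier returns as named arguments.
    assert integer == 1
    assert stringer == "hello"
    print(df.shape, pydantic_param.ham, score)
```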
Try it now: `python examples/03-parameters/passing_parameters_python.py`
### When to use pipelines
- Multi-step workflows: "Load → Clean → Train → Deploy"
- Data pipelines: "Extract → Transform → Load"
- Reproducible processes: "Run the same steps every time"
- Complex dependencies: "Step 3 needs outputs from steps 1 and 2"
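For the "complex dependencies" case, a later step can consume outputs from several earlier steps. A minimal sketch (not from the docs), following the same by-name parameter convention as the pipeline example above:

```python
from runnable import Pipeline, PythonTask


def load_users():
    return {"count": 1000}


def load_sales():
    return {"total": 50000}


def build_report(users, sales):
    # Needs the outputs of both earlier steps, matched by return name.
    return f"{users['count']} users generated {sales['total']} in sales"


def main():
    pipeline = Pipeline(
        steps=[
            PythonTask(function=load_users, returns=["users"]),
            PythonTask(function=load_sales, returns=["sales"]),
            PythonTask(function=build_report, returns=["report"]),
        ]
    )
    pipeline.execute()
    return pipeline


if __name__ == "__main__":
    main()
```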
## 🔄 Same function, different contexts
Here's the same function used both ways:
As a job:
```python
from examples.common.functions import hello
from runnable import PythonJob


def main():
    job = PythonJob(function=hello)
    job.execute()
    return job  # REQUIRED: Always return the job object


if __name__ == "__main__":
    main()
```
As a pipeline task:
```python
from examples.common.functions import hello
from runnable import Pipeline, PythonTask


def main():
    task = PythonTask(function=hello, name="say_hello")
    pipeline = Pipeline(steps=[task])
    pipeline.execute()
    return pipeline  # REQUIRED: Always return the pipeline object


if __name__ == "__main__":
    main()
```
## Quick decision guide
| I want to... | Use |
|---|---|
| Test my function | Job |
| Run analysis once | Job |
| Generate a report | Job |
| Process data in multiple steps | Pipeline |
| Chain different functions | Pipeline |
| Run the same workflow repeatedly | Pipeline |
**You can always switch**
Start with a job to test your function, then move it into a pipeline when you're ready to build a workflow.
**Essential Pattern: Always Return Objects**

Both jobs and pipelines must be returned from your `main()` function. This pattern is critical for:

- 🔍 Execution Tracking: Runnable tracks run status, timing, and metadata through the returned object
- 📊 Result Access: The returned object contains execution results, logs, and run IDs
- 🔗 Integration: External tools and monitoring systems need the object for further processing
- 🐛 Debugging: Error details and execution context are accessible via the returned object
❌ Missing returns break functionality:

```python
def main():
    job = PythonJob(function=my_function)
    job.execute()
    # Missing return - loses execution tracking!


def main():
    pipeline = Pipeline(steps=[...])
    pipeline.execute()
    # Missing return - no access to results!
```
✅ Always use this pattern:
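```python
def main():
    job = PythonJob(function=my_function)
    job.execute()
    return job  # REQUIRED: Always return the job object


def main():
    pipeline = Pipeline(steps=[...])
    pipeline.execute()
    return pipeline  # REQUIRED: Always return the pipeline object
```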
**Custom Execution Models**
Need to run jobs beyond Python, Shell, and Notebooks? Create custom task types and executors for any infrastructure or execution model using Runnable's extensible plugin architecture.
## What's Next?
- Pipeline Parameters - Configure pipelines with parameters and custom run IDs
- Task Types - Different ways to define pipeline steps (Python, notebooks, shell scripts)
- Visualization - Visualize pipeline execution with interactive timelines