Run Log Store Configuration¶
Run logs make reproducibility simple - they capture everything needed to understand, debug, and recreate your pipeline executions.
Why Run Logs Matter for Reproducibility¶
Complete Execution History
Every run is fully documented: Run logs capture the complete context of your pipeline execution
- Parameters and inputs: What data and settings were used
- Execution timeline: When each step ran and how long it took
- Data lineage: Which data artifacts were created and consumed
- Code snapshots: Git commits and code versions used
- Failure details: Exact error messages and stack traces
- Environment metadata: Configuration and infrastructure used
Reproducibility Made Easy
Run logs enable you to:
- Debug production failures by recreating exact conditions locally
- Compare experiments across different parameter sets or code versions
- Audit model training with complete training history and data lineage
- Resume failed pipelines from the exact point of failure
Available Run Log Stores¶
| Store Type | Environment | Best For |
|---|---|---|
| `buffered` | In-memory only | Quick testing and development |
| `file-system` | Any environment with mounted `log_folder` | Sequential execution, simple setup |
| `chunked-fs` | Any environment with mounted `log_folder` | Parallel execution, universal choice |
| `minio` / `chunked-minio` | Object storage | Distributed systems without shared filesystem |
buffered¶
Stores run logs in-memory only. No persistence - data is lost when execution completes.
In-Memory Only
- No persistence: Run logs are lost after execution
- Testing only: Not suitable for production or reproducibility
- No parallel support: Race conditions occur with concurrent execution
Use case: Quick testing and debugging during development.
Configuration¶
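A minimal configuration selects the store by type; the buffered store keeps everything in memory, so no further options are needed:

```yaml
run-log-store:
  type: buffered
```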
file-system¶
Stores run logs as single JSON files in the filesystem - simple and reliable for sequential execution.
Works Everywhere with Mounted Storage
Runs in any environment where log_folder is accessible
- Persistent storage: Run logs saved to mounted filesystem
- Simple structure: One JSON file per pipeline run
- Easy debugging: Human-readable JSON format
- Local development: Direct filesystem access
- Containers: Works with volume mounts
- Kubernetes: Works with persistent volumes
Sequential Only
Not suitable for parallel execution - use chunked-fs for parallel workflows
Configuration¶
```yaml
run-log-store:
  type: file-system
  config:
    log_folder: ".run_log_store"  # Optional: defaults to ".run_log_store"
```
Example¶
Run your pipeline with the file-system configuration above.
Result: the run log is stored as `.run_log_store/{run_id}.json` with complete execution metadata for reproducibility.
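To make that layout concrete, here is a small illustrative sketch of the one-file-per-run pattern the file-system store uses. This is not Runnable's internal implementation, and the payload fields are hypothetical:

```python
import json
from pathlib import Path


def write_run_log(log_folder: str, run_id: str, payload: dict) -> Path:
    """Persist a run log as a single JSON file, one file per run."""
    folder = Path(log_folder)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{run_id}.json"
    target.write_text(json.dumps(payload, indent=2))
    return target


# One file per run: .run_log_store/{run_id}.json
path = write_run_log(".run_log_store", "demo-run-1234", {"status": "SUCCESS", "steps": {}})
print(path)
```

Because the whole run lives in one file, concurrent steps would race on writes, which is why this pattern is restricted to sequential execution.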
chunked-fs¶
Thread-safe run log store - works everywhere with parallel execution support. The recommended choice for most use cases.
Works Everywhere with Mounted Storage
Runs in any environment where log_folder is accessible
- Thread-safe: Supports parallel execution without race conditions
- Local development: Direct filesystem access
- Containers: Works with volume mounts (Docker, local-container executor)
- Kubernetes: Works with persistent volumes (Argo, k8s-job executor)
- Parallel execution: Enable `enable_parallel: true` safely
- Persistent: Full reproducibility with detailed execution history
Recommended Default
Use chunked-fs unless you have specific requirements - it provides parallel safety and works in all execution environments where the log_folder can be mounted.
Configuration¶
```yaml
run-log-store:
  type: chunked-fs
  config:
    log_folder: ".run_log_store"  # Optional: defaults to ".run_log_store"
```
Example¶
```python
from runnable import Pipeline, PythonTask, Parallel

from examples.common.functions import hello


def main():
    # Parallel execution is safe with chunked-fs
    parallel_node = Parallel(
        name="parallel_tasks",
        branches={
            "task_a": PythonTask(function=hello, name="hello_a"),
            "task_b": PythonTask(function=hello, name="hello_b"),
        },
    )

    pipeline = Pipeline(steps=[parallel_node])
    pipeline.execute()
    return pipeline


if __name__ == "__main__":
    main()
```
Run your pipeline with the chunked-fs configuration above.
Result: run logs are stored as separate files in the `.run_log_store/{run_id}/` directory:
- `RunLog.json` - Pipeline metadata and configuration
- `StepLog-{step}-{timestamp}.json` - Individual step execution details
This chunked structure enables thread-safe parallel writes while maintaining complete execution history for reproducibility.
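The reason this is thread-safe can be shown with a short sketch (again, not Runnable's internals; file naming and payload fields are illustrative assumptions): each step writes its own file, so concurrent branches never contend for the same file.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_step_log(run_folder: Path, step: str, payload: dict) -> Path:
    """Each step gets its own file, so parallel writers never collide."""
    run_folder.mkdir(parents=True, exist_ok=True)
    target = run_folder / f"StepLog-{step}-{time.time_ns()}.json"
    target.write_text(json.dumps(payload))
    return target


run_folder = Path(".run_log_store") / "demo-run-5678"

# Two branches writing concurrently, as in the Parallel example above
with ThreadPoolExecutor() as pool:
    results = list(pool.map(
        lambda step: write_step_log(run_folder, step, {"status": "SUCCESS"}),
        ["hello_a", "hello_b"],
    ))

print(sorted(p.name.split("-")[1] for p in results))  # ['hello_a', 'hello_b']
```

Contrast this with the single-file layout of `file-system`, where the same two branches would have to serialize their writes.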
Object Storage (minio / chunked-minio)¶
For distributed systems and cloud deployments, use object storage-based run log stores:
minio¶
```yaml
run-log-store:
  type: minio
  config:
    endpoint: "https://s3.amazonaws.com"
    access_key: "your-access-key"
    secret_key: "your-secret-key"
    bucket_name: "runnable-logs"
```
chunked-minio (Recommended)¶
```yaml
run-log-store:
  type: chunked-minio
  config:
    endpoint: "https://s3.amazonaws.com"
    access_key: "your-access-key"
    secret_key: "your-secret-key"
    bucket_name: "runnable-logs"
```
Cloud Deployment
Use chunked-minio for distributed systems - it provides the same parallel execution safety as chunked-fs but with cloud storage scalability.
Choosing the Right Run Log Store¶
Decision Guide
For most users: Use chunked-fs - works in any environment with mounted storage and supports parallel execution
For development/testing: Use buffered for quick iterations where persistence isn't needed
Sequential workflows: Use file-system - works in any environment with mounted storage but only for sequential execution
Distributed systems without shared filesystem: Use chunked-minio when execution environments can't mount a shared log_folder
Filesystem vs Object Storage
Filesystem stores (file-system, chunked-fs): Work in any execution environment where the log_folder can be mounted
- Local development (direct filesystem access)
- Docker containers (volume mounts)
- Kubernetes (persistent volumes)
- Any containerized environment with volume mounting
Object storage (minio, chunked-minio): Use when shared filesystem mounting isn't available
Remember: Run logs are your key to reproducibility - they capture everything needed to understand, debug, and recreate your pipeline executions.
Custom Run Log Stores¶
Need to integrate with your existing logging infrastructure? Build custom run log stores that send execution data anywhere using Runnable's extensible architecture.
Enterprise Integration
Integrate with your existing systems: Never be limited by built-in storage options
- Enterprise logging: Send to Splunk, ELK Stack, Datadog, New Relic
- Corporate databases: Store in existing data warehouses, time-series databases
- Compliance systems: Meet audit and governance requirements
- Multi-region storage: Distribute logs across geographic regions
Building Custom Run Log Stores¶
Learn how to create production-ready custom run log stores:
Custom Run Log Stores Development Guide
The guide provides:
- Complete stubbed implementation for database and cloud storage integration
- YAML to Pydantic configuration mapping with validation
- Storage system patterns for SQL, NoSQL, and cloud storage
- Performance optimization for high-volume deployments
Quick Example
Create a custom run log store in just 3 steps:
1. Implement key methods by extending `BaseRunLogStore`
2. Register via entry point in your `pyproject.toml`
3. Configure via YAML for seamless integration
Ready to build? See the development guide for implementation patterns and examples.
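As a rough sketch of the shape such a store takes, the example below uses a stand-in abstract base class, since the exact `BaseRunLogStore` interface is defined by Runnable; the method names, the `service_name` attribute, and the dict backend are all illustrative assumptions, not the real API:

```python
from abc import ABC, abstractmethod


class BaseRunLogStore(ABC):
    """Stand-in for Runnable's base class; the real interface may differ."""

    @abstractmethod
    def create_run_log(self, run_id: str, **kwargs) -> dict: ...

    @abstractmethod
    def get_run_log_by_id(self, run_id: str) -> dict: ...


class DictRunLogStore(BaseRunLogStore):
    """Toy store backed by an in-process dict; swap in your database client."""

    service_name = "dict-store"  # hypothetical entry-point name

    def __init__(self):
        self._logs: dict[str, dict] = {}

    def create_run_log(self, run_id: str, **kwargs) -> dict:
        self._logs[run_id] = {"run_id": run_id, **kwargs}
        return self._logs[run_id]

    def get_run_log_by_id(self, run_id: str) -> dict:
        return self._logs[run_id]


store = DictRunLogStore()
store.create_run_log("run-42", status="SUCCESS")
print(store.get_run_log_by_id("run-42")["status"])  # SUCCESS
```

A production store would replace the dict with calls to your database or logging system; the development guide covers the full set of methods to implement and how to wire up the entry point.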