Overview¶
The run log stores extensive information about the execution of the pipeline, along with the metrics captured during it.
Example¶
The highlighted lines in the example below show how to use the API.
Any pydantic model passed as a value is dumped as a dict, respecting any field aliases, before being tracked.
You can run this example with python examples/concepts/experiment_tracking_api.py
The highlighted lines in the example below show how to use environment variables to track metrics.
Only string values are allowed in environment variables. Numeric values sent in as strings are converted to int/float before being stored as metrics.
There is no support for boolean values in environment variables.
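As a sketch of the coercion described above (the exact parsing logic inside runnable is an assumption, not its actual code), a string value from an environment variable can be promoted to int or float like this:

```python
def coerce_metric(value: str):
    """Promote a string env-var value to int or float where possible.

    Booleans are deliberately not handled, mirroring the limitation above.
    """
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value  # left as a plain string


print(coerce_metric("42"))     # 42 (int)
print(coerce_metric("3.14"))   # 3.14 (float)
print(coerce_metric("hello"))  # 'hello'
```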
Any experiment tracking metrics found during the execution of a task are stored in the `user_defined_metrics` field of the step log.
For example, the step log of a shell execution records the captured metrics in this field.
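An illustrative (not verbatim) snippet of such a step log entry, with the surrounding fields abbreviated:

```json
{
  "name": "shell_task",
  "status": "SUCCESS",
  "user_defined_metrics": {
    "spam": "hello",
    "eggs": {"ham": "world"}
  }
}
```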
Incremental tracking¶
It is possible to track metrics over time within a task. To do so, use the `step`
parameter in the API, or postfix the environment variable name with `_STEP_`
followed by the increment.
The step defaults to 0.
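As an illustration of the naming convention (the parsing shown here is an assumption for clarity, not runnable's actual implementation), a tracked key with a step can be recovered from the variable name like this:

```python
def parse_tracked_key(name: str):
    """Split a tracked key of the form 'key_STEP_<n>' into (key, step).

    Keys without the '_STEP_' postfix default to step 0.
    """
    key, sep, step = name.rpartition("_STEP_")
    if sep and step.isdigit():
        return key, int(step)
    return name, 0


print(parse_tracked_key("loss_STEP_1"))  # ('loss', 1)
print(parse_tracked_key("loss"))         # ('loss', 0)
```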
Example¶
The highlighted lines in the example below show how to use the API with the `step` parameter.
You can run this example with python examples/concepts/experiment_tracking_step.py
The highlighted lines in the example below show how to use environment variables to track metrics.
Experiment tracking tools¶
Opt out
Pipelines need not use experiment tracking if the preferred tool of choice is
not implemented in runnable. The default do-nothing configuration
is a no-op by design.
We kindly request that you raise a feature request to make us aware of the ecosystem.
The default experiment tracking tool of runnable is a no-op, as the run log
already captures all the
required details. To make it compatible with other experiment tracking tools like
mlflow or
Weights and Biases, we map attributes of runnable
to the underlying tool.
For example, for mlflow:
- Any numeric (int/float) observation is logged as a metric with a step.
- Any non-numeric observation is logged as a parameter. Since mlflow does not support step-wise logging of parameters, the key name is formatted as `key_step`.
- The tag associated with an execution is used as the experiment name.
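The mapping above can be sketched as a simple dispatch. This is a hypothetical helper, not runnable's actual code, and how a step of 0 affects the parameter name is an assumption:

```python
def map_observation(key: str, value, step: int = 0):
    """Decide how an observation would be sent to mlflow.

    Numeric values become metrics with a step; everything else becomes
    a parameter whose name embeds the step, since mlflow parameters
    cannot be logged step-wise.
    """
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return ("metric", key, float(value), step)
    name = key if step == 0 else f"{key}_{step}"
    return ("param", name, str(value))


print(map_observation("answer", 42, step=1))    # ('metric', 'answer', 42.0, 1)
print(map_observation("mode", "fast", step=1))  # ('param', 'mode_1', 'fast')
```

Note that booleans are routed to parameters rather than metrics, since Python's `bool` is a subclass of `int`.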
Shortcomings
The experiment tracking capabilities of runnable integrate less deeply with popular python frameworks like pytorch and tensorflow than dedicated experiment tracking tools do.
We strongly advise using those tools if you need advanced capabilities.
In the configuration below, the mlflow tracking server is a local instance listening on port 8080.
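Such a configuration might look like the sketch below. The exact schema keys (`experiment_tracker`, `type`, `server_url`) are assumptions for illustration; consult the configuration reference for the real field names:

```yaml
experiment_tracker:
  type: mlflow
  config:
    server_url: http://127.0.0.1:8080
```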
As with other examples, we use the `track_this`
python API to capture metrics. During the pipeline
execution in line #39, we use the mlflow configuration
as the experiment tracking tool.
The tag provided during the execution is used as the experiment name in mlflow.
You can run this example with python examples/concepts/experiment_tracking_integration.py
To provide implementation-specific capabilities, we also provide a python API to obtain the client context. The default client context is a null context manager.
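A sketch of how such a default behaves, using `contextlib.nullcontext` as a stand-in; the actual API for obtaining the client context in runnable is not shown here, and `get_tracker_context` is a hypothetical name:

```python
from contextlib import nullcontext


def get_tracker_context(client=None):
    """Return the tool-specific client context, or a null context
    manager when no experiment tracker is configured (hypothetical helper)."""
    return client if client is not None else nullcontext()


# With no tracker configured, the with-block still runs unchanged.
with get_tracker_context() as ctx:
    print(ctx)  # None
```

The null context manager means user code written against the client context keeps working even when no tracker is configured.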