Mocked¶
Mocked executors provide a way to control the behavior of task node types: each task can either be passed through or execute an alternate command with modified configurations.
- Runs the pipeline only in the local environment.
- Enables unit testing of the pipeline in both yaml and SDK definitions.
- Isolates specific node(s) from the execution for further analysis.
- Not meant to be used for production deployments.
Options¶
By default, all the task steps are passed through without an execution. Providing patches, indexed by the name of the node, gives control over the command to run and the configuration of that command.
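For instance, a mocked configuration with patches might look like the sketch below; the node names and the exact schema here are assumptions, so treat the example configuration files referenced later on this page as the authoritative format.

```yaml
executor:
  type: mocked
  config:
    patches:
      fetch data:                      # hypothetical node name from the pipeline definition
        command: echo "patched"        # alternate command to run for this node
      train model:                     # another hypothetical node
        command: my_module.my_function # modified command for a python task
```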
Command configuration for notebook nodes¶
python and shell based tasks have no configuration options apart from the command. Notebook nodes have additional configuration options, detailed in concepts. Ploomber engine provides rich options for debugging failed notebooks.
Example¶
Mocking nodes¶
The following example shows the simple case of mocking all the steps of the pipeline.
You can execute the mocked pipeline by:
for yaml: runnable execute -f examples/concepts/simple.yaml -c examples/configs/mocked-config-simple.yaml
for python: runnable_CONFIGURATION_FILE=examples/configs/mocked-config-simple.yaml python examples/concepts/simple.py
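The configuration itself does little more than select the mocked executor; the sketch below is what examples/configs/mocked-config-simple.yaml is expected to contain, but verify against the repository.

```yaml
# Minimal mocked executor configuration: with no patches,
# every task node is passed through without execution.
executor:
  type: mocked
```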
The flag mock is set to true for the execution of the node simple, which denotes that the task was mocked.
Patching nodes for unit testing¶
Pipelines are themselves code and should be testable. In the example below, we take a sample pipeline and test the behavior of its traversal. The pipeline is designed to follow step 1 >> step 2 >> step 3 when there are no failures and step 1 >> step 3 in case of failure. The traversal is shown in concepts.
Asserting Run log
The run log is a simple json file that can be parsed and validated against the designed behavior. You can also create the RunLog object by deserializing runnable.datastore.RunLog from the json. This can be handy when validating complex pipelines.
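A minimal sketch of such a validation, assuming the file-system run log store writes the JSON to .run_log_store/<run_id>.json; the path, the run id and the key names here are assumptions and should be adjusted to your setup.

```python
import json

from runnable.datastore import RunLog

# Load the run log JSON of a (hypothetical) run id.
with open(".run_log_store/on-failure-test.json") as handle:
    raw = json.load(handle)

# With step 1 failing, step 2 should never be part of the executed steps.
assert "step 2" not in raw["steps"]
assert raw["steps"]["step 1"]["status"] == "FAIL"

# The same JSON can be deserialized into a RunLog object for richer validation.
run_log = RunLog(**raw)
assert run_log.steps["step 1"].status == "FAIL"
```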
The run log has only step 1 and step 3 as part of the steps (as designed), showing the behavior of the pipeline in case of failure. The status of step 1 is captured as FAIL due to the exit 1 command in the pipeline definition.
We can patch the command of step 1 to be successful to test the behavior of the traversal when there are no failures.
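A sketch of what the patched configuration could contain, assuming the failing command in step 1 is exit 1; check examples/configs/mocked-config-unittest.yaml for the actual schema.

```yaml
executor:
  type: mocked
  config:
    patches:
      step 1:
        command: exit 0   # patched to succeed so the no-failure path is traversed
```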
Running the pipeline with the mocked configuration:
for yaml: runnable execute -f examples/on-failure.yaml -c examples/configs/mocked-config-unittest.yaml
for python: runnable_CONFIGURATION_FILE=examples/configs/mocked-config-unittest.yaml python examples/on_failure.py
As seen in the run log, the steps step 1, step 2 and step 3 are all executed and successful, and the status of step 1 is SUCCESS.
Debugging failed executions¶
Using debuggers
For pipelines defined by the python SDK, you can set breakpoints in the python function being executed and use debuggers. For notebook based tasks, refer to the ploomber engine documentation for rich debuggers. Shell commands can be run in isolation by providing the parameters as environment variables and keeping the catalog artifacts present in the compute_data_folder location.
To debug a failed execution, we can use the mocked executor to mock all the steps except the failed step, and provide the parameters and data exposed to the step during the failure, which are captured by the run log and catalog.
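As a sketch, and assuming the same schema as the earlier examples, such a debugging configuration patches only the failed step while every other step stays passed through; examples/configs/mocked-config-debug.yaml holds the configuration actually used below.

```yaml
executor:
  type: mocked
  config:
    patches:
      <name of the failed step>:       # only this node actually executes
        command: <corrected command>   # all other nodes are passed through
```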
Copy the catalog of the failed execution to the debugging execution and retry the step. We give the debugging execution a run_id of debug-pipeline:
cp -r .catalog/wrong-file-name .catalog/debug-pipeline
and retry with the fix:
runnable execute -f examples/retry-fail.yaml -c examples/configs/mocked-config-debug.yaml --run-id debug-pipeline