Page MenuHomeElementl

add `from_failure` bool arg to rexecute pipeline apis
Changes PlannedPublic

Authored by yuhan on Jul 21 2021, 1:20 AM.

Details

Summary

also renamed methods on ExternalExecutionPlan to pair to the equivalent methods on ExecutionPlan (open to discussion)

Test Plan

bk

Diff Detail

Repository
R1 dagster
Branch
yuhan/reexecution-from-failure
Lint
Lint Passed
Unit
No Test Coverage

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Jul 21 2021, 2:03 AM
Harbormaster failed remote builds in B33960: Diff 41948!
yuhan retitled this revision from add `from_failure` bool arg to rexecute_pipeline apis to add `from_failure` bool arg to rexecute pipeline apis.Jul 21 2021, 6:13 PM
yuhan edited the summary of this revision. (Show Details)
yuhan added reviewers: prha, alangenfeld, max.
yuhan requested review of this revision.Jul 21 2021, 6:54 PM

I would like to discuss alternative ways to expose this. My hesitation towards this approach comes from using a bool argument that conflicts with another argument [1]. I think this represents an ergonomic / feel issue as well as point at a lack of extensibility if we desire other times of re-execution where steps are selected by the system.

one alternative would be to extend step_selection to also take an enum entry or marker type for describing a step selection to be calculated by dagster.

open to other ideas and could also be convinced the bool is right, but i feel like theres a nicer solution to be had.

python_modules/dagster/dagster/core/execution/api.py
449–457

agh looks like this got accidentally out of sync in D7044

979–982

[1]

one alternative would be to extend step_selection to also take an enum entry or marker type for describing a step selection to be calculated by dagster.

we could

  • extend step_selection to take StepSelection.FROM_FAILURE), step_selection=StepSelection.from_selection(["some_solid*"]), and ["some_solid*"] (backcompat)
  • or a more generic arg reexecute_option and deprecate step_selection, it'd take ReexecuteOption.FROM_FAILURE, ReexecuteOption.from_selection(["some_solid*"]))

i could see another advantage of introducing a class (e.g. an enum entry) is that we could consolidate all the parsing/resolving logics into that class - for example, internally, we could pass around StepSelection.step_keys_to_execute as opposed to figure out whether the selection is pre-transformation or post-transformation.

but the cost is to expose a new top-level api and the api being verbose. i'm pretty much on the fence, slightly leaning towards the bool argument as it keeps the apis simple.

but the cost is to expose a new top-level api and the api being verbose

Ya that nails the trade-off here. I have a strong personal dislike of conflicting bool arguments, which is why I at least wanted to talk it out, but agree the cost of adding a new public enum/class isn't a slamdunk.

add some more reviewers for other opinions

Just to riff on options :

# current
reexecute_pipeline((
  my_pipeline,
  previous_run_id,
  from_failure=True # conflicty bool
)

# enum
reexecute_pipeline((
  my_pipeline,
  previous_run_id,
  step_selection=StepSelection.RETRY_FROM_FAILURE # new api surface area
)

# semantic run id arg
reexecute_pipeline((
  my_pipeline,
  retry_from_failed_run=previous_run_id, # still conflicty, have to make parent_run_id optional
)

# stringly
reexecute_pipeline((
  my_pipeline,
  previous_run_id,
  step_selection="retry_from_failure" # list distinguishes step selection syntax from "algorithm choice"
)
python_modules/dagster/dagster/core/execution/api.py
560–561

^ also needs to get re-corrected

i think the enum solution is the cleanest here; it certainly adds API surface but if we retain backcompatibility with the old list of strings, that feels fine -- it adds API surface in lockstep with new functionality

I'm going to defer to others on this

trying enum