Page MenuHomeElementl

[RFC] Add all executable steps to known execution state

Authored by rexledesma on Apr 8 2021, 3:11 AM.




The run above tries to read from the parent run's fs storage, even though all steps were selected
for re-execution. This was because the input manager determines where to read the upstream output
by checking if the execution plan has executed the upstream step (

If the execution plan is delegated to other machines and processes (say, in the multiprocess executor), the execution plan is no longer
the source of truth. Rather, we should rely on the known execution state is determine if the upstream step has been executed.

Here, we add all the executable steps to the known execution state and use that information to determine if a plan has executed a step.

Test Plan

added test cases failing on master, passing after this diff

Diff Detail

R1 dagster
rl/fix-multiprocess-load-input-execution (branched from master)
Lint Passed
No Test Coverage

Event Timeline

all_executable_steps -> step_keys_to_execute

Harbormaster returned this revision to the author for changes because remote builds failed.Apr 8 2021, 3:34 AM
Harbormaster failed remote builds in B28579: Diff 35077!

the diagnosis looks right to me

in terms of testing, you may also need to test the graphql query that does the re-execution for dagit. i.e. adding multiproc tests in
for more context, reexecute_pipeline is just the python api to reexecute a pipeline. the path dagit hits is`instance.submit_run` call directly. see in
the re-execution isn't very blessed atm bc we are duplicating logics in the react/gql and python path. where the paths get merged is the instance level (or maybe not anymore since we've separated submit_run and launch_run).

step_keys_to_execute is already part of PipelineRun right? If that's the case, would we be able to just access it "statically" instead of needing to pass it via KnownState?

+ the changes


delete this method


drop this

This revision now requires changes to proceed.Apr 8 2021, 5:24 PM

Dropping in favor of inspecting the pipeline run's step_keys_to_execute