Page MenuHomePhabricator

Remove reexecution_config, add memoization strategy for reexecution
ClosedPublic

Authored by prha on Oct 30 2019, 11:42 PM.

Details

Summary

This diff restructures the reexecution flow to use previous_run_id and the existing step_key arguments on ExecutionParams instead of a separate ReexecutionConfig. It moves the step_keys_to_execute onto ExecutionPlan to thread through the APIs and engine. Adds a RetryMemoization class that handles the intermediate management across runs.

This contains a lot of the major refactor changes in D1294 (minus the resume/retry), but builds upon the RunConfig => PipelineRun change from D1329.

Test Plan

bk

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

prha created this revision.Oct 30 2019, 11:42 PM
prha updated this revision to Diff 6103.Oct 31 2019, 12:12 AM

updated aws tests, reran graphql type gen

prha edited the summary of this revision. (Show Details)Oct 31 2019, 6:11 AM
prha added a reviewer: Restricted Project.

This is the Lord's work. This looks great. Let me stew on it for a hot but I think this is quite the improvement.

nate added a subscriber: nate.Oct 31 2019, 4:39 PM

Amazing. thanks for doing all this cleanup!

schrockn requested changes to this revision.Oct 31 2019, 6:56 PM

This is a huge improvement. My questions/concerns are mostly around the memoization strategy class.

python_modules/dagster/dagster/core/execution/api.py
258

If there are "missing steps" which is the execution plan ever constructed?

288

so much better

python_modules/dagster/dagster/core/execution/memoization.py
51

manage_intermediates is a bit vague. had to read the function to understand why you named it as such.

I'm actually having a hard time following the function. i think a docblock describing the logic could be very useful.

71โ€“79

a comment or encapsulating this (this being what is in the if statement) in function with a descriptive name could be more clear

88โ€“98

I think we should resolve the TODO before we merge. This could easily become cost-prohibitive.

This revision now requires changes to proceed.Oct 31 2019, 6:56 PM
prha added inline comments.Oct 31 2019, 8:57 PM
python_modules/dagster/dagster/core/execution/api.py
258

Yeah, I'll move this to the PlanBuilder

python_modules/dagster/dagster/core/execution/memoization.py
51

๐Ÿ‘

71โ€“79

๐Ÿ‘

88โ€“98

๐Ÿ‘

prha updated this revision to Diff 6161.Nov 1 2019, 7:05 PM

refactor memoization strategy, add single-step reexecution test

prha planned changes to this revision.Nov 1 2019, 7:11 PM
prha updated this revision to Diff 6162.Nov 1 2019, 7:36 PM

move missing step check to execution plan constructor

prha updated this revision to Diff 6177.Nov 4 2019, 4:24 PM

rebase

alangenfeld added inline comments.
js_modules/dagit/src/schema.graphql
225

my last hang up is if this name sufficiently communicates whats going to happen if you set it. Some options would be a better name, a sibling arg, or a wrapper arg to communicate that this implies that we are retrying based on the results of said previous run.

prha planned changes to this revision.Nov 4 2019, 7:24 PM
prha added inline comments.
js_modules/dagit/src/schema.graphql
225

Yeah, I erred on the side of just making the GraphQL API descriptive (e.g. "here is the previous run id"), allowing for different future behaviors in the memoization strategy (making it pluggable).

I feel comfortable renaming this retryRunId or something like that until we have a better sense of what those other behaviors look like.

prha updated this revision to Diff 6185.Nov 4 2019, 7:40 PM

rename graphql parameter previousRunId to retryRunId

alangenfeld accepted this revision.Nov 4 2019, 9:29 PM

A+ diff would review again

redfordnod

prha updated this revision to Diff 6197.Nov 4 2019, 9:52 PM

handle multiprocess intermediates

schrockn requested changes to this revision.Nov 5 2019, 1:21 AM
schrockn added inline comments.
python_modules/dagster/dagster/core/execution/api.py
270

why is this a separate function?

python_modules/dagster/dagster/core/execution/memoization.py
14

per slack thread let's just make this into a single function copy_required_intermediates_for_execution or something similar

31

can we just put this into init?

This revision now requires changes to proceed.Nov 5 2019, 1:21 AM
prha updated this revision to Diff 6224.Nov 5 2019, 8:06 AM
  • move from class to simple functions
max added a subscriber: max.Nov 6 2019, 5:12 PM
max added inline comments.
js_modules/dagit/src/schema.graphql
225

or indeed parentRunId? but actually i don't think i understand the difference between retryRunId and previous_run_id

python_modules/dagster/dagster/core/execution/memoization.py
23

it might be nice to print the name or class of the intermediates manager here to aid debugging

max resigned from this revision.Nov 6 2019, 5:13 PM
This revision is now accepted and ready to land.Nov 6 2019, 6:54 PM
prha updated this revision to Diff 6299.Nov 6 2019, 7:04 PM

add intermediates manager class name in error message

prha updated this revision to Diff 6306.Nov 6 2019, 9:34 PM

rebase