Page MenuHomePhabricator

[1/4] Reconstruct ExecutionPlan from the ExecutionPlanSnapshot in the run worker if one exists
AbandonedPublic

Authored by dgibson on Jan 23 2021, 10:45 PM.

Details

Reviewers
alangenfeld
Summary

When a persisted ExecutionPlanSnapshot is on the run, re-create it from the snapshot rather than rebuilding it from scratch. This is one step on the journey towards being able to have a run worker that doesn't execute any user code and could run in the Dagster cloud, interfacing with step workers in the user cloud.

Requires moving a whole bunch of user code out of ExecutionStep/StepInput/StepOutput and just putting handles on those classes instead.

(@alangenfeld this is the result of rebasing past the mypy stuff and incorporating your feedback from the previous stack - this worked out simpler, way less new stuff on the snapshots, and no need to use the PipelineSnapshot at all. Made StepInputSource and StepOutput serializable and put them on the executionplansnapshot)

Test Plan

Integration

Diff Detail

Repository
R1 dagster
Branch
rmpipelinedefsquash2
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Jan 23 2021, 11:04 PM
Harbormaster failed remote builds in B24778: Diff 30174!
Harbormaster returned this revision to the author for changes because remote builds failed.Jan 23 2021, 11:31 PM
Harbormaster failed remote builds in B24779: Diff 30175!
Harbormaster returned this revision to the author for changes because remote builds failed.Jan 24 2021, 4:34 AM
Harbormaster failed remote builds in B24788: Diff 30187!
dgibson retitled this revision from (Squashed) Reconstruct ExecutionPlan from the ExecutionPlanSnapshot to Reconstruct ExecutionPlan from the ExecutionPlanSnapshot in the run worker if one exists.Jan 24 2021, 11:26 PM
dgibson edited the summary of this revision. (Show Details)
dgibson added a reviewer: alangenfeld.
dgibson added subscribers: alangenfeld, schrockn, max, johann.
dgibson retitled this revision from Reconstruct ExecutionPlan from the ExecutionPlanSnapshot in the run worker if one exists to [1/4] Reconstruct ExecutionPlan from the ExecutionPlanSnapshot in the run worker if one exists.Jan 25 2021, 6:38 PM

This is a pretty effective test that we are stripping the user code out, but I'm not sure the cost benefit makes sense to land it. From what I can tell there is no real tangible benefit to building from the plan snapshot, but there are some real costs:

  • persisting these data structures that we will have to support moving forward
  • compat around changing the execution plan snapshot at all
    • In a world where we try to load from the plan snap, what happens if i execute from an a 0.10.0 dagit and a 0.10.X (where this is in) user process picks up execution? We likely pay some complexity cost to deal with this even if its swallowing all exceptions and fall back to recreating.

These costs are not enormous, but its enough to make me hesitant. I think its at least worth splitting this diff in to:

  • all the good for user/host code changes

and

  • @whitelist_for_serdes additions, plan snap changes, and the rebuild instead of re-create code path

We can discuss the recreate path more in its own diff and let all these other changes land right away.

python_modules/dagster/dagster/core/execution/plan/inputs.py
325–329

we shouldn't persist the value - lets just fetch it again off of the correct definition instance

python_modules/dagster/dagster/core/execution/plan/outputs.py
20

are we sure we need this persisted / available in the host process?

python_modules/dagster/dagster/core/snap/execution_plan_snapshot.py
116–128

theres info dupe across this wrapper object and StepOutput - feels like we should have one or the other

This revision now requires changes to proceed.Jan 25 2021, 11:46 PM

split into two diffs - https://dagster.phacility.com/D6155 removes the user code, this adds the ability to rebuild the plan from the snapshot.

python_modules/dagster/dagster/core/execution/plan/outputs.py
20

took out io_manager_key, at least for now

python_modules/dagster/dagster/core/snap/execution_plan_snapshot.py
116–128

I only kept name for back-compat. until serdes alembic comes out is there a better way to handle that?

putting in queue for now since host mode execution is not pressing from what I understand

This revision now requires changes to proceed.Mon, Feb 1, 6:24 PM