Page MenuHomeElementl

[Asset Store Versioning 1 of 3] Place computed step output versions on the execution plan.
ClosedPublic

Authored by cdecarolis on Nov 10 2020, 5:12 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, May 15, 12:32 PM
Unknown Object (File)
Sun, May 14, 9:25 PM
Unknown Object (File)
Sat, May 13, 6:17 AM
Unknown Object (File)
Fri, May 12, 11:57 PM
Unknown Object (File)
Apr 27 2023, 3:10 PM
Unknown Object (File)
Apr 22 2023, 3:04 AM
Unknown Object (File)
Apr 9 2023, 2:16 AM
Unknown Object (File)
Apr 7 2023, 3:58 AM
Subscribers

Details

Summary

Adding the step output versions to the execution plan allows them to be accessible at runtime through a SystemStepExecutionContext, which enables them to be used in an asset store's implementation.

Test Plan

Unit tests

Diff Detail

Repository
R1 dagster
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

This might work for the in-process executor, but I suspect that, for other executors, this will encounter difficulties, because the execution plans inside the steps will be constructed via a path that doesn't involve resolve_memoized_execution_plan.

More generally - ExecutionPlans are produced and consumed from a variety of locations in the codebase. If the presence of the step output versions depends on where the execution plan was created, it requires readers to trace back to where it was created in order to understand how it can be used.

I wonder if it instead makes sense to resolve step versions inside the execution plan as part of its construction.

This revision now requires changes to proceed.Nov 10 2020, 9:00 PM

Placed computed fxn to compute step output versions on execution plan using information already provided in step builder

This might work for the in-process executor, but I suspect that, for other executors, this will encounter difficulties, because the execution plans inside the steps will be constructed via a path that doesn't involve resolve_memoized_execution_plan.

More generally - ExecutionPlans are produced and consumed from a variety of locations in the codebase. If the presence of the step output versions depends on where the execution plan was created, it requires readers to trace back to where it was created in order to understand how it can be used.

I wonder if it instead makes sense to resolve step versions inside the execution plan as part of its construction.

This makes a lot of sense. What I opted for was providing all the parameters necessary to construct the execution plan, and then constructing it on-the-fly on the instance. I would like to make it a cached property on the execution plan, although I'm not sure what patterns we use for that.

sandyryza added inline comments.
python_modules/dagster/dagster/core/execution/plan/plan.py
290

@alangenfeld - do you have thoughts about adding environment config and mode definition to the ExecutionPlan?

If that's something we want to avoid, another option could be for resolve_step_output_versions to still accept these arguments?

413

This isn't an arg, is it?

440

Prefer check.invariant to assert

454

plan.py is a fairly huge file - even with resolve_step_output_versions as a method on ExecutionPlan, I think it's still helpful organizationally to keep all the details of its implementation in a separate file.

python_modules/dagster/dagster/core/execution/plan/plan.py
290

the mode_def it self is already held by the pipeline, might make sense to stash away mode_name / mode_key on EnvironmentConfig (its already passed in to the static build method) and have that be the only thing passed and held here

I am happy with these changes! Will defer to @alangenfeld on the execution plan stuff.

This revision is now accepted and ready to land.Nov 16 2020, 7:13 PM

Addressed comments. Moved version resolution logic off of the execution plan. passed mode as a param of environment config instead of mode_definition to execution plan. Got rid of outdated documentation. Assert -> check invariant.

Rebased on top of AssetStoreContext changes