Page MenuHomePhabricator

[Asset Store Versioning 1 of 3] Place computed step output versions on the execution plan.
AcceptedPublic

Authored by cdecarolis on Tue, Nov 10, 5:12 PM.

Details

Reviewers
sandyryza
yuhan
Summary

Adding the step output versions to the execution plan allows them to be accessible at runtime through a SystemStepExecutionContext, which enables them to be used in an asset store's implementation.

Test Plan

Unit tests

Diff Detail

Repository
R1 dagster
Branch
output_versions_on_plan
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

This might work for the in-process executor, but I suspect that, for other executors, this will encounter difficulties, because the execution plans inside the steps will be constructed via a path that doesn't involve resolve_memoized_execution_plan.

More generally - ExecutionPlans are produced and consumed from a variety of locations in the codebase. If the presence of the step output versions depends on where the execution plan was created, it requires readers to trace back to where it was created in order to understand how it can be used.

I wonder if it instead makes sense to resolve step versions inside the execution plan as part of its construction.

This revision now requires changes to proceed.Tue, Nov 10, 9:00 PM

Placed computed fxn to compute step output versions on execution plan using information already provided in step builder

This might work for the in-process executor, but I suspect that, for other executors, this will encounter difficulties, because the execution plans inside the steps will be constructed via a path that doesn't involve resolve_memoized_execution_plan.

More generally - ExecutionPlans are produced and consumed from a variety of locations in the codebase. If the presence of the step output versions depends on where the execution plan was created, it requires readers to trace back to where it was created in order to understand how it can be used.

I wonder if it instead makes sense to resolve step versions inside the execution plan as part of its construction.

This makes a lot of sense. What I opted for was providing all the parameters necessary to construct the execution plan, and then constructing it on-the-fly on the instance. I would like to make it a cached property on the execution plan, although I'm not sure what patterns we use for that.

sandyryza added inline comments.
python_modules/dagster/dagster/core/execution/plan/plan.py
293

@alangenfeld - do you have thoughts about adding environment config and mode definition to the ExecutionPlan?

If that's something we want to avoid, another option could be for resolve_step_output_versions to still accept these arguments?

425

This isn't an arg, is it?

491

Prefer check.invariant to assert

505

plan.py is a fairly huge file - even with resolve_step_output_versions as a method on ExecutionPlan, I think it's still helpful organizationally to keep all the details of its implementation in a separate file.

python_modules/dagster/dagster/core/execution/plan/plan.py
293

the mode_def it self is already held by the pipeline, might make sense to stash away mode_name / mode_key on EnvironmentConfig (its already passed in to the static build method) and have that be the only thing passed and held here

I am happy with these changes! Will defer to @alangenfeld on the execution plan stuff.

This revision is now accepted and ready to land.Mon, Nov 16, 7:13 PM

Addressed comments. Moved version resolution logic off of the execution plan. passed mode as a param of environment config instead of mode_definition to execution plan. Got rid of outdated documentation. Assert -> check invariant.

Rebased on top of AssetStoreContext changes