Page MenuHomeElementl

Allow the set of default executors to use when none are specified to be customized
Changes PlannedPublic

Authored by alangenfeld on Apr 26 2021, 1:29 PM.

Details

Summary

One could imagine a custom instance class / run worker and step worker execution environment that has a different set of default executors. This diff passes down those default executors in a bunch of places rather than assuming that the default exeuctors are always in_process and multiprocess.

Test Plan

BK

Diff Detail

Repository
R1 dagster
Branch
defaultexecsquashed
Lint
Lint Passed
Unit
No Test Coverage

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Apr 26 2021, 1:51 PM
Harbormaster failed remote builds in B29437: Diff 36139!
Harbormaster returned this revision to the author for changes because remote builds failed.Apr 26 2021, 2:13 PM
Harbormaster failed remote builds in B29439: Diff 36141!
dgibson published this revision for review.Apr 26 2021, 2:30 PM

spurious coverall test failure, will investigate separately

nearly all of this diff is passing things down from [2] and [3] so that they can be accessed in [1] (and customized in the near future)

python_modules/dagster/dagster/core/definitions/mode.py
84–93

[1]

python_modules/dagster/dagster/core/instance/__init__.py
1546–1548

[2] (I would rename default_executors if it wasn't a public-facing thing already)

python_modules/dagster/dagster/grpc/server.py
203–205

[3]

Is this sufficient - do you not need to be able to change the default io_manager resource as well [A]?

Having the pipeline snapshot and id change from something defined outside the pipeline gives me the heebie jeebies [B]

python_modules/dagster/dagster/core/definitions/mode.py
63–65

[A]

python_modules/dagster/dagster/core/definitions/pipeline.py
392–398

[B]

python_modules/dagster/dagster/core/instance/__init__.py
1546–1548

would this make more sense as a property of the run launcher or run coordinator? Should this be able to vary independently of those?

python_modules/dagster/dagster/core/definitions/mode.py
63–65

I don't think the default io manager needs to change for my purposes currently, no (although I could imagine wanting that in the future)

python_modules/dagster/dagster/core/instance/__init__.py
1546–1548

They seem like different concepts to me, although I could live with run launcher for my purposes. Run coordinator seems pretty different.

One case i thought of that I think communicates the cost of this change well is if we had a function for just doing run config validation against a target - it would now need to take the instance as an argument

check_run_config(my_pipeline, run_config, instance=DagsterInstance.get())

which gets a little hairy if one wanted to put said function in their test suite to ensure their production runs config was expected to be valid. My assumption here is that generally the test environment will not be setup with the same instance. There are a few different ways to deal with this so it's not a hard blocker - just an awkward case that makes me feel like this isn't the cleanest modeling of this complexity.

Im slowly warming up to this idea - even though its pretty subtle it feels like a big change in the system to me

alangenfeld added a subscriber: sandyryza.

@sandyryza brought up an interesting kernel of an idea about just making the execution block of config Permissive - at least from the pipeline perspective.
So you could drop threading the default executors to the snapshot and env configs and let those just have empty executors and default to a Permissive section. Then only at the "real" execution call sites resolve the instance executors and maybe do a separate config validation/processing pass for the execution subsection

heres a quick prototype to illustrate the idea from previous comment https://dagster.phacility.com/differential/diff/36671/

I can spend some time fleshing this out since I'm being picky about how this works.

the big thing you use there is config playground typeahead support in dagit right?

the big thing you lose there is config playground typeahead support in dagit right?

The information is still available since dagit operates in the context of the instance so depending how the implementation shakes out - the amount of extra BS required to keep that feature will fluctuate yeah. Good thing to call out.

right, so we'd need to also separate out the object that Dagit operates against to generate the config schema to be separate from just a field on the pipeline snapshot (and at that stage we could bring in the instance) - I don't have a great sense into how messy that would be

alangenfeld edited reviewers, added: dgibson; removed: alangenfeld.

ill put this in my queue for now accordingly