Page MenuHomePhabricator

[engine] shared step iteration
ClosedPublic

Authored by alangenfeld on Jan 22 2020, 5:12 PM.

Details

Summary

Adds ActivePlan a stateful object for managing the iteration of execution steps to allow prioritized progression down any unblocked path and migrates all relevant engines to using it (dask submits all jobs up front so didn't bother there).

formalize dagster/priority centered at a default of 0 - higher numbers are higher priority, can use negative to lower priority.

Test Plan

added tests, buildkite

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

alangenfeld created this revision.Jan 22 2020, 5:12 PM
alangenfeld updated this revision to Diff 8798.Jan 22 2020, 5:37 PM

ordered dict

alangenfeld updated this revision to Diff 8799.Jan 22 2020, 6:10 PM

priority test

alangenfeld edited the summary of this revision. (Show Details)Jan 22 2020, 6:12 PM
alangenfeld updated this revision to Diff 8801.Jan 22 2020, 6:37 PM
alangenfeld edited the summary of this revision. (Show Details)

py2 fixes?

alangenfeld retitled this revision from [engine] introduce ActivePlan to [engine] shared step iteration.Jan 22 2020, 6:47 PM
alangenfeld edited the summary of this revision. (Show Details)
alangenfeld added reviewers: max, schrockn.
alangenfeld updated this revision to Diff 8804.Jan 22 2020, 7:10 PM

cant repro locally add some logging for BK

alangenfeld updated this revision to Diff 8805.Jan 22 2020, 7:34 PM

more ordered dict

max accepted this revision.Jan 22 2020, 9:51 PM
This revision is now accepted and ready to land.Jan 22 2020, 9:51 PM
max requested changes to this revision.Jan 22 2020, 10:10 PM

sounds like we are going to not try to paper over the differences in priority handling between celery, multiprocess[, k8s, ...] -- so let's revert the tagging change in the celery tests and reinstitute the sorting step in dagster-celery

This revision now requires changes to proceed.Jan 22 2020, 10:10 PM
alangenfeld updated this revision to Diff 8870.Jan 23 2020, 8:18 PM

move sorting out to individual engines. use config to drive what key to use for prioirty on default engines

schrockn requested changes to this revision.Jan 23 2020, 8:35 PM

So what I was expecting to see here is not a configurable priority key in terms of *configuring* executor on a per-run basis, but instead just having it be part of the executor definition, plus a comparator functions between priorities. Exposing a configurable priority key is actually pretty risky and will result in a lot of replicated config.

python_modules/dagster-celery/dagster_celery_tests/test_priority.py
204 ↗(On Diff #8870)

remove

python_modules/dagster/dagster/core/engine/engine_multiprocess.py
113–117

can you change all variables named active_plan to active_execution across this PR? very confusing rt now

python_modules/dagster/dagster_tests/core_tests/engine_tests/test_priorities.py
30

spelling

41

spelling

python_modules/dagster/dagster_tests/core_tests/execution_plan_tests/test_execution_plan.py
29

active_execution

This revision now requires changes to proceed.Jan 23 2020, 8:35 PM

what should the predefined keys be for the in process and multi process engines? I wasn't happy with what I came up with which motivated me to punt it out to config

priority?

So the crux of it is:

  • we do dagster/priority and it works for the core executors but (probably?) none of the others which leads to an unfortunate expectation mis-match in a not clearly communicated way (keys just ignored)
  • we do separate keys for each executor to align expectations ie dagster/in-process/priority, dagster-in-process/priority, whatever but thats aesthetically unfortunate and leads to having to set all keys to support different execution modes

Exposing a configurable priority key is actually pretty risky and will result in a lot of replicated config.

less in defense of this approach which i agree is not ideal and more for my edification - can you expand on this point? It's not obvious to me what the risk is

"risky" might be the wrong word but more like "too easy to break" meaning that anytime you change that config key it will certainly not behave the way you want to. Any change with the config would have to be coupled with a change to code. Given that it doesn't seem like something that should be in config

alangenfeld planned changes to this revision.Jan 24 2020, 12:15 AM

ill throw up a static key version here

alangenfeld updated this revision to Diff 8888.Jan 24 2020, 1:16 AM

'dagster/priority' version

alangenfeld added inline comments.Jan 24 2020, 1:19 AM
python_modules/dagster-celery/dagster_celery/engine.py
59–61

would a pipeline_context.log.warn about steps that have dagster/priority but not dagster-celery/priority be enough to qualm concerns ?

python_modules/dagster/dagster/core/definitions/executor.py
116–117

will rm

alangenfeld updated this revision to Diff 8890.Jan 24 2020, 1:48 AM

add warning

schrockn added inline comments.Jan 24 2020, 1:53 AM
python_modules/dagster-celery/dagster_celery/engine.py
56

parens unnecessary?

61

that seems reasonable for now.

python_modules/dagster/dagster/core/execution/plan/plan.py
401

abstractly sort_key_fn seems like it should be a property of the ActionExecution rather than a parameter here

alangenfeld added inline comments.Fri, Jan 24, 4:06 PM
python_modules/dagster/dagster/core/execution/plan/plan.py
401

good call

schrockn added inline comments.Fri, Jan 24, 4:37 PM
python_modules/dagster-celery/dagster_celery/engine.py
56

parens?

142

maybe recommend having the user writing a function to generate both tags

schrockn requested changes to this revision.Fri, Jan 24, 4:37 PM

q mgmt

This revision now requires changes to proceed.Fri, Jan 24, 4:37 PM
alangenfeld updated this revision to Diff 8893.Fri, Jan 24, 4:50 PM

move sort_key_fn to start(), fix bug

schrockn accepted this revision.Fri, Jan 24, 5:50 PM

👍🏻

max added inline comments.Mon, Jan 27, 9:04 PM
python_modules/dagster-celery/dagster_celery/engine.py
142

dagster/priority

max accepted this revision.Mon, Jan 27, 9:05 PM

Would love to have some human readable docs on dagster/priority just so we don't forget how it works :)

This revision is now accepted and ready to land.Mon, Jan 27, 9:05 PM
This revision was automatically updated to reflect the committed changes.