Page MenuHomePhabricator

Multiprocessing and Dask execution through Dagit
ClosedPublic

Authored by max on Aug 6 2019, 10:10 PM.

Details

Summary
Test Plan

Unit, manual:

Execute the sleepy_pipeline from dagit with config as follows:

storage:
  filesystem:
execution:
  multiprocess:
    config:
      max_concurrent: 4
solids:
  giver:
    config:
      - 1
      - 5
      - 10
      - 5

Execute the hammer_pipeline from Dagit with config as follows:

storage:
  filesystem:

execution:
  dask:

(logs do not stream, but progress is visible on command line)

Diff Detail

Repository
R1 dagster
Branch
dask-execution
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

max retitled this revision from Wip to Multiprocessing execution through Dagit.Aug 6 2019, 11:59 PM
max edited the test plan for this revision. (Show Details)
max retitled this revision from Multiprocessing execution through Dagit to Multiprocessing and Dask execution through Dagit.Aug 7 2019, 5:39 PM
max edited the test plan for this revision. (Show Details)
max added reviewers: alangenfeld, nate, schrockn, Restricted Project.
max edited the summary of this revision. (Show Details)

this is fantastic, thanks for taking this on! will let @alangenfeld take a look also but lgtm

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_execution_manager.py
334–338

meant to remove these commented lines?

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_run_storage.py
138

print

python_modules/dagster/dagster/core/definitions/executor.py
73

maybe we just (by default) derive the name from the name of the function vs. specifying explicitly?

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_execution_manager.py
334–338

thx

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_run_storage.py
138

thx

alangenfeld added inline comments.
python_modules/dagster/dagster/check/__init__.py
29–47

End previous string with a space so that additional_message doesnt have to prefix with it

python_modules/dagster/dagster/core/definitions/executor.py
14–15

we should try to be consistent with how definitions are set up - i think @sashank changed most to use _ prefixed internal vars and defined @properties to keep them "read only"

34–36

what exactly is the purpose of this class? name is vague and without context it seems like ExecutorConfig could flow through without this wrapper

python_modules/dagster/dagster/core/definitions/mode.py
19–27

hmm - wonder how often some of these we will be the same for all modes

This is excellent. I'll leave it to @alangenfeld to approve once this feedback is addressed.

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_execution_manager.py
139

why not just execute_pipeline here? seems the same

python_modules/dagster/dagster/core/definitions/executor.py
16

yeah what alex said

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_execution_manager.py
139

yeah, it's tricky and i'll add a comment. execute_pipeline will insert spurious pipeline start and pipeline success or pipeline failure event s into the stream.

python_modules/dagster/dagster/core/definitions/executor.py
16

yep

34–36

yes, it certainly could -- i am really cargo culting here

python_modules/dagster/dagster/core/definitions/mode.py
19–27

I am tempted to make this an append-only thing with built-in defaults, as for logging. See the infelicity at, e.g. hammer.py l. 27

to your queue for the existing comments - ill take a final pass after and this should be g2g

This revision now requires changes to proceed.Aug 7 2019, 8:18 PM
max marked 5 inline comments as done.Aug 8 2019, 2:42 PM
schrockn added inline comments.
examples/dagster_examples/toys/resources.py
7

i think we should generally keep these in sorted order as sed/vi/unix would do it

python_modules/dagster-dask/dagster_dask/executor.py
10

some comments could be nice

python_modules/dagster/dagster/core/engine/init.py
13

you can get rid of parens and make this a single string

python_modules/dagster/dagster/core/execution/config.py
114–121

Can you file an issue for this or make the action item otherwise more obvious? Or shall I say, less gnomic? :-)

max marked 2 inline comments as done.

Respond to feedback

examples/dagster_examples/toys/resources.py
7

addressing this separately -- we should add some tooling

python_modules/dagster-dask/dagster_dask/executor.py
10

yep

python_modules/dagster/dagster/core/engine/init.py
13

ugh black

This revision is now accepted and ready to land.Aug 8 2019, 7:00 PM