alangenfeld nate schrockn
- Group Reviewers
- R1:c05c9c658a90: Multiprocessing and Dask execution through Dagit
Execute the sleepy_pipeline from dagit with config as follows:
storage: filesystem: execution: multiprocess: config: max_concurrent: 4 solids: giver: config: - 1 - 5 - 10 - 5
Execute the hammer_pipeline from Dagit with config as follows:
storage: filesystem: execution: dask:
(logs do not stream, but progress is visible on command line)
this is fantastic, thanks for taking this on! will let @alangenfeld take a look also but lgtm
meant to remove these commented lines?
maybe we just (by default) derive the name from the name of the function vs. specifying explicitly?
End previous string with a space so that additional_message doesnt have to prefix with it
we should try to be consistent with how definitions are set up - i think @sashank changed most to use _ prefixed internal vars and defined @properties to keep them "read only"
what exactly is the purpose of this class? name is vague and without context it seems like ExecutorConfig could flow through without this wrapper
hmm - wonder how often some of these we will be the same for all modes
This is excellent. I'll leave it to @alangenfeld to approve once this feedback is addressed.
why not just execute_pipeline here? seems the same
yeah what alex said
yeah, it's tricky and i'll add a comment. execute_pipeline will insert spurious pipeline start and pipeline success or pipeline failure event s into the stream.
yes, it certainly could -- i am really cargo culting here
I am tempted to make this an append-only thing with built-in defaults, as for logging. See the infelicity at, e.g. hammer.py l. 27
i think we should generally keep these in sorted order as sed/vi/unix would do it
some comments could be nice
you can get rid of parens and make this a single string
Can you file an issue for this or make the action item otherwise more obvious? Or shall I say, less gnomic? :-)