Page MenuHomeElementl

initial async solids support
ClosedPublic

Authored by alangenfeld on Jan 14 2021, 10:38 PM.

Details

Summary

This adds bare minimum support for

@solid
async def function_name(...):

by running async generators to completion close to where we execute the user function.

We convert the async generator to a generator by using (the compat equivalent to) asyncio.run to move the iterator forward each tick until it completes.

Test Plan

basic tests added

Diff Detail

Repository
R1 dagster
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

actual interleaving of multiple async solids in the in process executor is going to be quite gnarly

  • how do you decide what to interleave, do you only start a sync solid while waiting for async one?
  • not sure how to make compute log manager work with multiple solids happening in the same process at the same time
Harbormaster returned this revision to the author for changes because remote builds failed.Jan 14 2021, 11:08 PM
Harbormaster failed remote builds in B24462: Diff 29783!

mixing await and yield is a bit odd. is that normative in python?

I haven't followed the async stuff much - what are users able to accomplish with async in this context that they can't accomplish with normal solids?

mixing await and yield is a bit odd. is that normative in python?

https://www.python.org/dev/peps/pep-0525/ not sure how common its use is but its part of the language

I haven't followed the async stuff much - what are users able to accomplish with async in this context that they can't accomplish with normal solids?

This first step just makes it so that if a user annotates an async def function with @solid it works. In master it ends up treating the Coroutine / AsyncGenerator object that gets returned as an output and then failing in bizarre ways.

So this first diff enables cooperatives multi-tasking within the context of a single solid, which likely isn't that useful but is still a nice capability if you are using async based libraries in the body of your solids. If you for example had a single solid downloading multiple items, this would give you an easy way to have all that IO happen simultaneously.

A future diff would create an executor that enables cooperative multitasking between multiple solids. To do that though involves some tricky things:

  • disabling or changing substantially the ComputeLogManager
  • working through the scheduling choices around having multiple solids in flight with cooperative multi-tasking
    • do you disallow sync solids?

@alangenfeld thanks for the explanation - that makes sense.

I don't think I've wrapped my head around enough of the implications here to have an opinion yet. It feels like there are some weird edge cases as we fill out the matrix of executors and combinations of async and non-async solids.

It feels like there are some weird edge cases as we fill out the matrix of executors and combinations of async and non-async solids.

oh absolutely - getting the executors to natively support this will be quite a task. This first diff though just runs the solid to completion as if it were synchronous by driving an event loop for you, which i think is safe and non-controversial.

bump

I think the other aspect of this is moving incrementally towards making it possible for a user to write their own async executor, which today is totally infeasible

I support landing it. Can we add docs in the same diff (or another one concurrently?)

This revision now requires changes to proceed.Mar 17 2021, 6:44 PM

could folks speak up if they have reservations about this, or resign if you don't feel comfortable making a call on it?

sorry for sitting on this forever.

python_modules/dagster/dagster/core/definitions/decorators/solid.py
121–122

somebody might think after reading this that such an executor exists

python_modules/dagster/dagster_tests/core_tests/execution_tests/test_async.py
17–21

is it worth testing multiple outputs / solid failures? or does that go through substantially the same path

python_modules/dagster/dagster_tests/core_tests/execution_tests/test_async.py
17–21

same code path

This revision is now accepted and ready to land.Apr 6 2021, 8:35 PM
This revision was automatically updated to reflect the committed changes.