This drops the bespoke file handling that Dagstermill previously used
for output notebooks and for passing outputs back to solid code from the
Jupyter process, instead using the file_manager and object_manager facilities.
Documentation changes will follow after review of this basic approach.
Details
- Reviewers
sandyryza yuhan schrockn alangenfeld prha
Unit
Diff Detail
- Repository
- R1 dagster
- Branch
- dagstermill-file-handling
- Lint
Lint OK - Unit
No Unit Test Coverage
Event Timeline
examples/airline_demo/airline_demo/solids.py | ||
---|---|---|
542–544 | nit: might be good to capture this input def pattern with a function that just takes the input name | |
python_modules/dagster/dagster/core/execution/plan/execute_step.py | ||
236–238 | clean dis | |
python_modules/libraries/dagstermill/dagstermill/examples/repository.py | ||
202–214 | lol | |
python_modules/libraries/dagstermill/dagstermill/manager.py | ||
192–194 | add a test with fan-in inputs, you may need to unpack a list of these i think | |
320–321 | this is in the context of generating a set of dynamic outputs which would be happening in one notebook, so i don't think collision is an issue. Will need to figure out how to support dynamic outputs from a notebook with this yield_result API or another at somepoint | |
python_modules/libraries/dagstermill/dagstermill/solids.py | ||
233–334 | this is real gnarly, is it feasible to decompose this at all? Some narrative about the different steps happening would be good too. It was particularly confusing to see these core abstractions re-used here without context since at first glance it seems like it might be a custom alternate path to the core execution flow as opposed to just re-using abstractions in a slightly different context |
python_modules/libraries/dagstermill/dagstermill/manager.py | ||
---|---|---|
314 | Nice |
@alangenfeld I've reorganized and broken up the solid compute function logic to be more readable, and added support for fan-in inputs
some complex stuff but i think the comments and cleanup make it parseable at least, thanks for that.
python_modules/dagster/dagster/core/execution/plan/execute_step.py | ||
---|---|---|
260–380 | coordinate with @yuhan on these changes since she has conflicting diff | |
python_modules/libraries/dagstermill/dagstermill/solids.py | ||
229 | whats this about? |
python_modules/libraries/dagstermill/dagstermill/solids.py | ||
---|---|---|
229 | sigh, yes. i don't even know what to put in a comment because this is so weird but. >>> def f(): ... return ... >>> def g(): ... yield from f() ... >>> [x for x in g()] Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 1, in <listcomp> File "<stdin>", line 2, in g TypeError: 'NoneType' object is not iterable but >>> def f(): ... return ... yield ... >>> def g(): ... yield from f() ... >>> [x for x in g()] [] |
python_modules/libraries/dagstermill/dagstermill/solids.py | ||
---|---|---|
229 | but this function has other yield which i assume would make it work, but you found it to be otherwise? |
python_modules/libraries/dagstermill/dagstermill/examples/repository.py | ||
---|---|---|
214 | haha there has to be a term of art for a FIXME to TODO conversion | |
python_modules/libraries/dagstermill/dagstermill/solids.py | ||
229 | i believe you should be able to emit the return entirely. The presence of a yield statement makes the function an iterator, even if the yield is never executed. def f(): return yield returns an empty iterator whereas def g(): return Is just a normal function that returns None |
feels like we have broad agreement on this approach, going to abandon this diff and open a stack