Page MenuHomePhabricator

root input manager
ClosedPublic

Authored by sandyryza on Wed, Jan 6, 5:45 PM.

Details

Summary

This diff does two related things makes it so that object managers (soon to be IO managers) are responsible for all loading of inputs that have upstream outputs.

Outstanding questions: https://elementl.quip.com/kkLXAoLIhqlX/Root-input-manager-type-loader-progress-doc

Test Plan

bk

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Harbormaster returned this revision to the author for changes because remote builds failed.Wed, Jan 6, 9:40 PM
Harbormaster failed remote builds in B23773: Diff 28910!
Harbormaster returned this revision to the author for changes because remote builds failed.Wed, Jan 6, 10:33 PM
Harbormaster failed remote builds in B23782: Diff 28922!
Harbormaster returned this revision to the author for changes because remote builds failed.Wed, Jan 6, 11:31 PM
Harbormaster failed remote builds in B23787: Diff 28928!
Harbormaster returned this revision to the author for changes because remote builds failed.Thu, Jan 7, 12:19 AM
Harbormaster failed remote builds in B23789: Diff 28930!
Harbormaster returned this revision to the author for changes because remote builds failed.Thu, Jan 7, 4:34 AM
Harbormaster failed remote builds in B23800: Diff 28938!
Harbormaster returned this revision to the author for changes because remote builds failed.Thu, Jan 7, 5:02 AM
Harbormaster failed remote builds in B23805: Diff 28944!
Harbormaster returned this revision to the author for changes because remote builds failed.Thu, Jan 7, 5:55 AM
Harbormaster failed remote builds in B23806: Diff 28945!
Harbormaster returned this revision to the author for changes because remote builds failed.Thu, Jan 7, 7:33 PM
Harbormaster failed remote builds in B23834: Diff 28976!
sandyryza added inline comments.
python_modules/dagster/dagster/core/definitions/mode.py
60 ↗(On Diff #28972)

Note: we are introducing a new reserved resource key, "root_input_manager".

python_modules/dagster/dagster/core/definitions/pipeline.py
780

We are now delegating the question "can load the input definition?" to the root input manager.

python_modules/dagster/dagster/core/execution/plan/inputs.py
123

FromStepOutput now always uses the upstream object manager.

python_modules/dagster/dagster/core/storage/input_manager.py
14 ↗(On Diff #28976)

[1] This is kind of a hack to get tests working and merits broader discussion. It relates to two slightly bigger questions:

  • How does the root input manager communicate to the framework that it doesn't know how to handle a particular input type?
  • How does falling back to default values work?
55 ↗(On Diff #28976)

See [1]

96 ↗(On Diff #28976)

TODO: consolidate these into a single argument

278 ↗(On Diff #28976)

See [1]

sandyryza retitled this revision from root input manager to RFC: root input manager.Thu, Jan 7, 8:00 PM
sandyryza edited the summary of this revision. (Show Details)

How fixed are we on the term "root"? I'm trying to recall other words that we have used or would use to describe inputs that have no upstream. I just want to make sure this is the best we have come up with.

SourceInputManager
StandaloneInputManager
UnattachedInputManager
OriginatingInputManager
SeedInputManager

My other comment is here is that we are kind of retreating from notion of being able to use solids you don't control or don't want to alter upstream of a solid you are working on. I was quite fond of that when we got there but it certainly introduced confusion on other dimensions. I just want to be explicit about the tradeoff we are making and to be sure we are comfortable with it. Namely, you cannot specify input loading behavior independent of the IOManager attached to the upstream output.

python_modules/dagster/dagster/core/storage/input_manager.py
17 ↗(On Diff #29006)

minor nit. i don't think one approach is better or worse but most of our sentinels are just classes

just doing q mgmt. any other alternative names?

This revision now requires changes to proceed.Fri, Jan 8, 12:55 PM

How fixed are we on the term "root"? I'm trying to recall other words that we have used or would use to describe inputs that have no upstream. I just want to make sure this is the best we have come up with.

Definitely not fixed. SourceInputManager sounds at least as good to me. Some more ideas:

  • UnconnectedInputManager
  • NoDependencyInputManager

My other comment is here is that we are kind of retreating from notion of being able to use solids you don't control or don't want to alter upstream of a solid you are working on. I was quite fond of that when we got there but it certainly introduced confusion on other dimensions. I just want to be explicit about the tradeoff we are making and to be sure we are comfortable with it. Namely, you cannot specify input loading behavior independent of the IOManager attached to the upstream output.

I think "you cannot specify input loading behavior independent of the IOManager attached to the upstream output" is definitely accurate. I do think there is a difference between "you cannot specify input loading behavior independent of the IOManager attached to the upstream output" and "retreating from notion of being able to use solids you don't control". Solids only specify their manager keys, not their manager implementations. The pipeline definer ultimately decides how intermediates are stored/loaded.

sandyryza retitled this revision from RFC: root input manager to root input manager.Fri, Jan 8, 6:36 PM
sandyryza edited the summary of this revision. (Show Details)

based on our earlier conversation it seems we may punt on this for release due to the complexity around:

  • type based required resource keys
  • interactions with default value loading

leading us to believe we may not have this totally dialed in yet

This revision now requires changes to proceed.Fri, Jan 8, 10:45 PM
This revision was not accepted when it landed; it landed in state Needs Review.Tue, Jan 12, 6:03 PM
Closed by commit R1:ae09fcdb1a03: root input manager (authored by sandyryza). · Explain Why
This revision was automatically updated to reflect the committed changes.