Page MenuHomePhabricator

StepInput refactor
ClosedPublic

Authored by alangenfeld on Fri, Nov 20, 4:19 PM.

Details

Summary

This diff refactors StepInput 's current modeling of input sources since it has gotten pretty messy and isn't able to support fan-in + input mapping. A subsequent diff will add and test the support for fanning in input mappings.

This introduces StepInputSource as a marker class for an input union of sources that express how a step input should be loaded. The current set up forces call sites to instanceof check against these types do decide how to process them. This is a little cumbersome/verbose but i think increases the likely hood that these callsites will handle all the different cases correctly, or at least have a useful error in the final else case.

Test Plan

passes existing test suite

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Fri, Nov 20, 4:33 PM
Harbormaster failed remote builds in B21479: Diff 26061!
cdecarolis added inline comments.
python_modules/dagster/dagster/core/execution/resolve_versions.py
70

I think this code represents a change in the way step versions are computed: a single step output would previously be hashed twice (only once now due to the improved fan in logic). I think this represents a positive change, it will just need to be propogated down the chain.

Harbormaster returned this revision to the author for changes because remote builds failed.Fri, Nov 20, 6:14 PM
Harbormaster failed remote builds in B21487: Diff 26070!
Harbormaster failed remote builds in B21488: Diff 26071!
Harbormaster returned this revision to the author for changes because remote builds failed.Fri, Nov 20, 10:40 PM
Harbormaster failed remote builds in B21517: Diff 26110!
alangenfeld edited the test plan for this revision. (Show Details)

execution plan implementation is a bit out of my wheelhouse currently. someday i hope to be able to competently review this

This seems better than before

python_modules/dagster/dagster/core/definitions/solid_container.py
259

Is there some reasonable way to push this down to the push this down to the dependency definition to avoid the instanceof?

263

Should these be f-strings?

python_modules/dagster/dagster/core/execution/plan/inputs.py
60

We can rip this address stuff out.

73

Maybe this should be "dependency_step_keys" or something? Having "step_key" in there makes it clear that the returned values are step keys.

77

Would be nice to somehow include output in this name so it's obvious that they're step output handles.

python_modules/dagster/dagster/core/execution/plan/plan.py
470

Are we able to be more specific than "updated"? It's difficult for me to figure out what the goal of this is.

python_modules/dagster/dagster/core/definitions/solid_container.py
259

oh good catch is_multi exists for this one

263

๐Ÿ‘ old habits die hard

python_modules/dagster/dagster/core/execution/plan/inputs.py
60

ya? should i just delete all the tests that break?

73โ€“78

sounds good

python_modules/dagster/dagster/core/execution/plan/plan.py
470

this goes away if i can indeed delete all the address code paths and tests

python_modules/dagster/dagster/core/execution/plan/inputs.py
60

go for it

My big question is why is this FromMultipleSources and not FromMultipleStepOutputs?

python_modules/dagster/dagster/core/execution/plan/execute_step.py
519โ€“521

this is worth a comment

537

does the word sigil really make sense here? do you mean Sentinel?

python_modules/dagster/dagster/core/execution/plan/inputs.py
63โ€“70

what are the cases where this isn't just a List[FromStepOutput]? this is my big question in this diff i think

This revision now requires changes to proceed.Wed, Nov 25, 6:00 PM

My big question is why is this FromMultipleSources and not FromMultipleStepOutputs?

see inline comment, also heres a picture

python_modules/dagster/dagster/core/execution/plan/execute_step.py
537

๐Ÿ‘

python_modules/dagster/dagster/core/execution/plan/inputs.py
63โ€“70

the input mapping case is where you can get non-step output entries in the source list, this is in the diff above this one in the stack D5268

@schrockn im going to land this to allow @sandyryza to build on top of it. I'm pretty sure my answer to your question should clear, if you have any other feedback i can incorporate it in to D5268

This revision is now accepted and ready to land.Wed, Nov 25, 8:14 PM
This revision was landed with ongoing or failed builds.Wed, Nov 25, 8:20 PM
Closed by commit R1:905fd981bcf1: StepInput refactor (authored by alangenfeld). ยท Explain Why
This revision was automatically updated to reflect the committed changes.