Page MenuHomePhabricator

introduce StepKey class
AbandonedPublic

Authored by alangenfeld on Dec 1 2020, 10:14 PM.

Details

Summary

To pave the way for more complex step keys, create a StepKey class to use for managing step keys in internal strucutres.

StepKey is intentionally not whiteslitsted for serdes - and we switch to the string form whenever we leave the core execution abstractions.

Test Plan

buildkite

integration

Diff Detail

Repository
R1 dagster
Branch
step-key (branched from master)
Lint
Lint Warnings
SeverityLocationCodeMessage
Warningpython_modules/dagster/dagster/core/execution/context/system.py:18W0611Unused Import
Unit
No Unit Test Coverage

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Dec 1 2020, 10:31 PM
Harbormaster failed remote builds in B21994: Diff 26704!
Harbormaster returned this revision to the author for changes because remote builds failed.Dec 3 2020, 8:32 PM
Harbormaster failed remote builds in B22171: Diff 26932!
Harbormaster returned this revision to the author for changes because remote builds failed.Dec 4 2020, 12:13 AM
Harbormaster failed remote builds in B22181: Diff 26942!
schrockn requested changes to this revision.Dec 4 2020, 5:23 PM

my question re strs:

i'm also wary of having step_key represent strings *and* StepKeys

Idea:

  1. s/StepKey/StepHandle and name new variables step_handle
  2. Optional additional step: come up name for string representation (step_keystr). codemod all step_key to step_keystr. then codemode all step_handle to step_key
python_modules/dagster/dagster/core/events/__init__.py
208

we've done this before (with SolidHandle) but I wasn't really a fan then. I'm wary of doing str casts to do *meaningful* conversions to string. I see a str cast and usually think "ok we are casting to str to spew some arbitrary string to the user". Would rather see explicit method. Feel free to push back

226

should we name variables differently that are strings versus the canonical object representation?

This revision now requires changes to proceed.Dec 4 2020, 5:23 PM

i'm also wary of having step_key represent strings *and* StepKeys

Ya this is definitely the main problem with the diff as it stands. I was scared of the StepHandle approach when i started, but it's a bit less scary now that I have a better sense of where the string / object boundaries lie. There is going to be a lot of code and tests (that can't just be code-modded) that needs to get updated to step_handle. I think I just have to try it to figure out how gnarly it is.

python_modules/dagster/dagster/core/events/__init__.py
208

one problem with that approach is having to update all the error message building callsites, which wont fail outright but will start printing the tuple. I think the only way to find & fix those is grep / manual reading

this likely plays different if we go down the StepHandle path and key returns the string version

Your discretion at how to proceed here. I just wanted to register my concern here. Understand there are nasty tradeoffs here and the need to make progress. We could make an issue to track dealing with this and potentially assign to a new hire or someone can pick up if they want some mindless work :-)

I think its worth the time to try the handle approach - this diff leaves the code base in a pretty confusing place. If we had type hints everywhere step_key: StepKey vs step_key: str would be a bit more manageable, but as it stands you are only protected in the places that happen to use check which is too sparse for how much the step key gets threaded around. This diff passes all the tests, but I am almost certain there are problematic callsites not under test.