Page MenuHomeElementl

[op] permit op config entry
ClosedPublic

Authored by cdecarolis on Jun 25 2021, 11:48 PM.

Details

Summary

Not sure if we want to support both ops and solids config in the same config document, or if we should enforce one or the other to avoid confusion.

Test Plan

expanded unit tests

Diff Detail

Repository
R1 dagster
Branch
config_entry_op
Lint
Lint Passed
Unit
No Test Coverage

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

other cases that would need to get covered

  • config_mapping
  • ops: inside of a composition graph's config section
  • default_config
python_modules/dagster/dagster/core/system_config/objects.py
178 ↗(On Diff #40271)

*

275–292 ↗(On Diff #40271)

*

This revision now requires changes to proceed.Jun 28 2021, 3:01 PM

Handle config mapping and default config case, clean up implementation

i kind of hate it for reasons i cant express - but i think its the right impl if we want to support this

the error paths in the config errors being out of sync is going to feel pretty rough

Going to defer to Alex on this one

add test with permissive edge cases

I am pretty sure this current approach isn't going to work. We may need to add custom support to the config system to handle this flexibility only at the nodes in the config tree where we should

python_modules/dagster/dagster/core/system_config/objects.py
280–281 ↗(On Diff #41482)

i think the existing test cases work because once "ops" is seen as a key we stop the recursion / transform. You should be able to get one of your current tests to fail by getting ops out preventing this early exit. You should also be able to create a failing test around with nested @graph structure where this early term would prevent a needed transformation

This revision now requires changes to proceed.Jul 13 2021, 8:33 PM

Implement an approach that preserves correctness - at great great cost

If we want to go back to the previous direction, we could walk / refer to the pipeline [1] structure when transforming the config.

The benefit this current direction has is providing something that we could be sent over graphql to dagit to allow the frontend to more gracefully handle these field aliases.

for what its worth I think this direction could get to a better feeling place with some iteration

python_modules/dagster/dagster/core/definitions/run_config.py
138

we could change the impl for "storage" fallback to this same scheme (if we go in this direction)

python_modules/dagster/dagster/core/system_config/objects.py
147 ↗(On Diff #41482)

[1]

If we want to go back to the previous direction, we could walk / refer to the pipeline [1] structure when transforming the config.

The benefit this current direction has is providing something that we could be sent over graphql to dagit to allow the frontend to more gracefully handle these field aliases.

for what its worth I think this direction could get to a better feeling place with some iteration

If you think we can get to a better feeling place with it, I'm down to explore that. Hadn't considered the graphql/dagit benefits. Given that I guess an implementation such as this can generally help with backcompat for config + has a real use case, it makes me feel a bit less fatalistic. It definitely is a ton of machinery, though.

Did you have any specific ideas for iteration?

Update yet more snapshot tests

examples/memoized_development/tests/test_memoized_development.py
4–22 ↗(On Diff #41891)

oops- wrong diff

  • I think field_aliases is better than field_substitutions since we are just creating a second name that the same information is available under
    • related - I think you want to just swap the order since you are flipping it in like all the call sites
  • use this approach to handle storage compat
  • add tests at the Shape level in the config package
  • add tests for conflicts (error I assume)
  • take a minute and look at the context around each change to verify the impl feels right. I think the dict get with default trick you used was good to get it working, but there may be cleaner approaches.
python_modules/dagster/dagster/config/validate.py
208–229

for example here defined_field_names should just consist of the full set of possible names instead of threading through the aliases

unless you update this to detect conflicts then it makes more sense to pass it

241–255

here i think a small amount of code redundancy leads to something more readable

this is personal, but i think its worth trying to take a second pass

for field_snap in context.config_type_snap.fields:
    name = field_snap.name
    if name in config_value:
        field_evr = _validate_config(context.for_field_snap(field_snap), config_value[name])
        if field_evr.errors:
            field_errors += field_evr.errors

   # check for backcompat field name 
   elif name in field_aliases and field_aliases[name] in config_value:
        field_evr = _validate_config(context.for_field_snap(field_snap), config_value[field_alias[name]])
        if field_evr.errors:
            field_errors += field_evr.errors
This revision now requires changes to proceed.Jul 20 2021, 5:45 PM
python_modules/dagster/dagster/config/validate.py
208

nit: just use the default-ed empty dict that came from opt_dict_param to avoid having to case against it

Update implementation and addressed most comments

Give implementations another pass. Finish addressing comments

Fix issue with config post-processing

python_modules/dagster/dagster/config/errors.py
23

alias collision probably

python_modules/dagster/dagster/config/field_utils.py
89–98

update docs

python_modules/dagster/dagster/config/validate.py
244–254

nit: i think this is hard to discern whats going on, more simpler if blocks may be clearer. I would say at least add comments, but thinking about what the comments would say makes be believe simpler & potentially redundant is clearer

308–309

is this right? if feel like a missing field without an alias causes this to short circuit the wrong way

expect to see some tests in python_modules/dagster/dagster_tests/core_tests/config_types_tests/ for validation, hashing, etc

Address comments: make implementations more readable, add documentation, add hashing tests, fix errors

python_modules/dagster/dagster/config/errors.py
144

mention alias in here probably "field name and its alias alised_name"

python_modules/dagster/dagster/config/field_utils.py
100–110

[2]

python_modules/dagster/dagster/core/definitions/graph.py
755–770

unrelated: should this not have an else: case to just use the original field ? probably need more tests for this

python_modules/dagster/dagster_tests/core_tests/config_types_tests/test_field_hash.py
32–64 ↗(On Diff #42039)

I dont think its wise to change these existing test cases in place - would be better to add new separate ones (they can re-use the base)

does a shape with aliases have the same identity as one without? should it? It feels odd in [2] that we ignore the aliases

round trip since there are failures any way - I think these last set of test adjustments should be it

This revision now requires changes to proceed.Jul 22 2021, 8:44 PM

Address comments.

Additionally, having trouble getting the snapshot tests to cooperate. pytest --snapshot-update does not seem to be doing the trick.

python_modules/dagster/dagster/core/definitions/graph.py
755–770

I think you're right. Spun that off into its own diff D9045

Additionally, having trouble getting the snapshot tests to cooperate. pytest --snapshot-update does not seem to be doing the trick.

This points at some bug or bad assumption in the hashing or snapshotting code around this change.

Fix snapshots (tests are passing on local)

push_n_pray

python_modules/dagster/dagster/config/snap.py
62–75

nit: get these comments in sync

This revision is now accepted and ready to land.Jul 28 2021, 5:47 PM