Page MenuHomePhabricator

Add strict mode to pipelines and only initialized required resources
Needs RevisionPublic

Authored by schrockn on Tue, Nov 26, 10:23 PM.

Details

Summary

The description for the resources_strict_mode boolean flag
on PipelineDefinition describes this behavior.

resources_strict_mode (Optional[Bool]): Defaults to false. If true this uses to
    required_resource_keys to provide a Resources objects in the context with *only*
    those keys. If the user does not specify required_resource_keys in a solid in a
    pipeline in strict mode, no resources are provided to the context. This mode
    allows us to be more selective about what resources are spun up for different
    pipeline subset executions. For example, without this mode a pipeline with
    an expensive resource (such as a Spark context) will spin up the context
    even if the current pipeline or execution plan subset does not require it. In
    strict mode, all solids are forced to declare their resource dependencies
    and therefore only those resources will be booted up.

    This boolean is meant to be temporary until all tests are migrated. We
    want all solids to have this strict behavior.

I would propose that the goal would be to have this strict mode actually
be the only behavior. This would add friction to the first usage of
resources. However it is really intuitive the moment you start doing
any out-of-process execution or pipeline subselection.

One thing this does not currently cover are types and plugins within
types that require resources. Once overall approach here is vetted
can add that.

Depends on D1501

Resolves https://github.com/dagster-io/dagster/issues/1937

Test Plan

BK

Diff Detail

Repository
R1 dagster
Branch
scope-resource-creation
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

schrockn created this revision.Tue, Nov 26, 10:23 PM
schrockn updated this revision to Diff 6956.Wed, Nov 27, 9:41 PM
schrockn edited the summary of this revision. (Show Details)

upmessage

schrockn updated this revision to Diff 6957.Wed, Nov 27, 9:42 PM

upmessage

Harbormaster failed remote builds in B5615: Diff 6958!
Harbormaster failed remote builds in B5614: Diff 6957!
schrockn updated this revision to Diff 6972.Wed, Nov 27, 11:39 PM
schrockn edited the summary of this revision. (Show Details)

upmessage

schrockn updated this revision to Diff 6984.Thu, Nov 28, 12:47 AM
schrockn retitled this revision from Only boot up required resources to Add strict mod to pipelines and only initialized required resources.
schrockn edited the summary of this revision. (Show Details)
schrockn added reviewers: alangenfeld, max, nate.

upmessage

schrockn updated this revision to Diff 6986.Thu, Nov 28, 12:52 AM
schrockn edited the summary of this revision. (Show Details)

upmessage

nate accepted this revision.Sun, Dec 1, 4:46 PM
nate added inline comments.
python_modules/dagster/dagster/core/definitions/system_storage.py
74

I doubt anyone is using system_storage in the wild, but note this is a small breaking external API change

python_modules/dagster/dagster/core/execution/context_creation_pipeline.py
117

in the short term, it would be nice to be able to set an environment variable like DAGSTER_RESOURCES_STRICT_MODE to override this; that would make it pretty easy to burn down the list of tests and ensure all resource requirements are specified

python_modules/dagster/dagster_tests/core_tests/test_strict_resources.py
274

maybe worth also adding a test raising an error when a solid tries to consume a resource it didn't specify as a requirement?

This revision is now accepted and ready to land.Sun, Dec 1, 4:46 PM

Migrating the test suite is a pretty modest change: https://dagster.phacility.com/D1524

schrockn marked an inline comment as done.Mon, Dec 2, 10:25 PM
schrockn added inline comments.
python_modules/dagster/dagster_tests/core_tests/test_strict_resources.py
274

test_filter_out_resources does not. currently we don't raise a special error for that situation. I could change it so that the generated class has a special method that gives a nicer error message in that case.

schrockn added inline comments.Mon, Dec 2, 10:35 PM
python_modules/dagster/dagster_tests/core_tests/test_strict_resources.py
274

test_filter_out_resources does this test, i meant

schrockn retitled this revision from Add strict mod to pipelines and only initialized required resources to Add strict mode to pipelines and only initialized required resources.Mon, Dec 2, 10:40 PM
schrockn edited the summary of this revision. (Show Details)

should there be a test for multiprocess engine / execute plan APIs that ensures unused resources are not instantiated?

python_modules/dagster/dagster/core/definitions/decorators.py
820

still feels off to me to spec this here - feels like it at least belongs on mode next to the other resource information

python_modules/dagster/dagster/core/definitions/solid.py
390–391

if I add a new solid to a composite and don't specify its requirements (where previously all of them did) what failure case do i hit?

python_modules/dagster/dagster/core/definitions/system_storage.py
74

if were going to land this diff we should note it in changes.md so we dont forget

python_modules/dagster/dagster/core/execution/api.py
194 ↗(On Diff #6986)

this is a little confusing since its just the non-composite solid names [1]

python_modules/dagster/dagster/core/execution/context_creation_pipeline.py
146

this argument feels really weird compared to the others

147

doubly odd since its very specific information that is just getting passed in to this sibling argument

python_modules/dagster/dagster/core/execution/plan/plan.py
319–320 ↗(On Diff #6986)

[1]

Do we have an exact game plan for all this yet? I would push for deferring landing anything and trying to jump right to the end state unless there is a clear user need to satisfy.

👍🏻 @alangenfeld agree let's chat when you are in tmrw

alangenfeld requested changes to this revision.Tue, Dec 3, 9:03 PM

from in person discussion:

  • move strict mode arg to mode (maybe rename it to describe the behavior of selective instantiation it turns on)
  • remove required_resource_keys from composite solid and resolve that change up through the graphql layer
  • explore cleaner options for providing the information currently handled by solid_def_names_in_execution
This revision now requires changes to proceed.Tue, Dec 3, 9:03 PM

@alangenfeld I've been trying to isolate what bothers me about attaching this behavior to mode. I think it is the following scenario.

You have two modes, one strict, one not.

The non-strict mode is used for local testing.

@solid
def something(context):
   context.resources.some_resource.it_works()

If you run this in a the non-strict local testing model it works fine as the resources are unconditionally created. However in strict mode some_resource would not be created and this error would only be caught at runtime. Seems pretty wrong to me. I think having this behavior attached to Mode actually allows for too much flexibility and can get a user into quite the confusing situation.

alangenfeld requested changes to this revision.Tue, Dec 10, 10:25 PM

thats fair, though now that we're on the 0.7.0 train should we just skip the resources_strict_mode flag and just go for it?

thoughts on [A] & [B] from previous comments still hold

python_modules/dagster/dagster/core/definitions/solid.py
386–394

[A]

python_modules/dagster/dagster/core/execution/context_creation_pipeline.py
146

[B]

This revision now requires changes to proceed.Tue, Dec 10, 10:25 PM