Page MenuHomePhabricator

Split DagterGraphQLContext into ProcessContext and RequestContext
ClosedPublic

Authored by sashank on Jan 13 2021, 4:40 PM.

Details

Summary

This diff splits the DagsterGraphQLContext into two separate classes:

  • ProcessContext
  • RequestContext

This split is only meaningful in the context of the Dagit webserver. The ProcessContext is long-lived, and the RequestContext is created for every request. The RequestContext stores (1) a reference to all the repository locations on the ProcessContext at the start of the request and (2) a snapshot of the Workspace state at the start of the request.

This is needed due to the introduction of the repository location watch threads that can modify the global context. The problem was that if a request was accessing a repository location at the same time the watch thread was cleaning it up, bad things would obviously happen.

Instead, the watch threads only run on the ProcessContext and are free to reload the repository locations and workspace as they like. RepositoryLocationHandles are not immediately cleaned up on location reload, and instead are cleaned up whenever all references to them in both the ProcessContext and all RequestContexts are gone.

This diff also brings back the sqlite_instance_deployed_grpc_env test suite.

Resolves https://github.com/dagster-io/dagster/issues/3404

Test Plan

unit

Diff Detail

Repository
R1 dagster
Branch
request-context-2
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

zoom debugging / review

This revision now requires changes to proceed.Jan 14 2021, 3:54 PM
  • Update subscription server

Coming up: Test to verify cleanup

nice

python_modules/dagit/dagit/app.py
50

+ comment

python_modules/dagster-graphql/dagster_graphql/implementation/context.py
16–17

+ comment

152

+ comment

python_modules/dagster-graphql/dagster_graphql/schema/roots.py
535–536

nit: could just return the repo location that was updated? ignore if theres a good reason why not

python_modules/dagster/dagster/cli/workspace/workspace.py
10–12

+ comment

to your queue for mentioned test and additional comments

This revision now requires changes to proceed.Jan 14 2021, 5:28 PM
  • Add garbage collection tests
  • fix lint
python_modules/dagster-graphql/dagster_graphql/schema/roots.py
535–536

we need to recreate a request context because all the getters are only on the request context

does the watch thread need to get turned back on?

makeitso

This revision is now accepted and ready to land.Jan 14 2021, 6:07 PM

does the watch thread need to get turned back on?

yup just need to rebase

  • rebase and turn on watch thread
  • bring back postgres deployed test suite

A problem with the approach here highlighted by a user:

If you push an error to a remote gRPC repository location, then Dagit will automatically reload that location and display the error. However, Dagit will not heal itself once the remote gRPC location is fixed. Instead, the user is required to manually go and reload the location.

Trying out different approaches to fix this:

  • When we have an error-ed location, start a thread attempting to reconnect to the location. This is a bit weird right now because we don't have a handle for error-ed locations, so we need a "pseudo-handle" to contain the API client and watch thread
  • Consolidate errors onto the handle itself, and handles maintain the entire responsibility for surfacing errors and attempting to self-heal

Rebase on new graphql changes

sashank edited the summary of this revision. (Show Details)

hammer bk

make sure tests aren't flakey