Page MenuHomePhabricator

(dagit-reload-2/n) gRPC server watch thread
ClosedPublic

Authored by sashank on Wed, Nov 11, 9:05 PM.

Details

Summary

This diff introduces a utility function to create a thread that tracks the state of a running gRPC server. This utility function will eventually be used by the DagsterGraphqlContext in order to appropriately reload handles and inform the Dagit front-end client of updates.

Test Plan

unit tests to check that the correct callback functions are called in all the possible server lifecycles

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Wed, Nov 11, 9:21 PM
Harbormaster failed remote builds in B21014: Diff 25487!
sashank edited the summary of this revision. (Show Details)

bump

Harbormaster returned this revision to the author for changes because remote builds failed.Thu, Nov 12, 8:34 PM
Harbormaster failed remote builds in B21087: Diff 25573!
Harbormaster returned this revision to the author for changes because remote builds failed.Fri, Nov 13, 7:41 PM
Harbormaster failed remote builds in B21150: Diff 25657!
Harbormaster returned this revision to the author for changes because remote builds failed.Mon, Nov 16, 8:55 AM
Harbormaster failed remote builds in B21214: Diff 25731!

nice - thanks for the thorough testing. the only question I have is if we can get away with not introducing a new API just for the test (see inline)? Maybe there's a sublety there that I'm not seeing

python_modules/dagster/dagster/grpc/server_watcher.py
45–46

i'm confused about the expectation here in the on_error case - if it can't connect what new thread would it be creating?

71

instead of time.sleep use

shutdown_event.wait(watch_interval)? Then it wakes up right away if the shutdown event is set

(but then you still need to check the event)

See https://dagster.phacility.com/source/dagster/browse/master/python_modules/dagster/dagster/grpc/client.py$37-39 for an example

87–88

same as above

110

return can be removed?

python_modules/dagster/dagster_tests/general_tests/grpc_tests/test_ping.py
153

what's the use case for the fixed server ID? I imagine that will become clear later in the stack?

python_modules/dagster/dagster_tests/general_tests/grpc_tests/test_watch_server.py
41

i'm confused why we need a new API just for this test, vs. terminating the process and starting a new one on the same port (as an actual re-deploy would do)? (with a long enough watch interval / number of retries that it won't timeout while the new server is starting up?)

This revision now requires changes to proceed.Tue, Nov 17, 10:41 PM
python_modules/dagster/dagster/grpc/server_watcher.py
45–46

this was a bit confusing, clarified the comment. the thread shuts down completely in the on_error and on_updated cases. in the next diff, the graphql context handles creating a new handle and subscribing to the events on the new handle

71

updated

87–88

updated

110

this is to break out of the while True and end the thread

python_modules/dagster/dagster_tests/general_tests/grpc_tests/test_watch_server.py
41

you're totally right, removed this

  • updated grpc server watch thread comment
  • use wait instead of time.sleep
  • remove UpdateServerId call
python_modules/dagster/dagster_tests/general_tests/grpc_tests/test_ping.py
153

it's to support the tests cases below to simulate intermittent server failures

dgibson added inline comments.
python_modules/dagster/dagster/grpc/server_watcher.py
69

i'm used to the .wait() going before the .is_set(), might make it more clear what's going on? but meh

110

oh duh, not sure how i missed that

python_modules/dagster/dagster_tests/general_tests/grpc_tests/test_watch_server.py
52

could add a assert not called after the first interrupt just to verify that it was specifically called in the updated part

69

could raise an exception here too

This revision is now accepted and ready to land.Wed, Nov 18, 11:06 PM
python_modules/dagster/dagster/grpc/server_watcher.py
69

yeah it usually is like that, but in this case I wanted the query to happen immediately without waiting for a second

This revision was automatically updated to reflect the committed changes.