Page MenuHomePhabricator

(dagit-reload-3) Add subscriber to GraphQL context
ClosedPublic

Authored by sashank on Wed, Nov 11, 9:06 PM.

Details

Summary

This diff adds the previously implemented gRPC server watch thread and runs it for each GrpcServerRepositoryLocationHandle.

The DagsterGraphqlContext subscribes to events from the handle, and appropriately reloads the handle on disconnected and error events, as well as pushing the event onto an rx Subject. This observable will be used to form the graphql subscription for server events.

Test Plan

unit

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Wed, Nov 11, 9:24 PM
Harbormaster failed remote builds in B21016: Diff 25489!
Harbormaster returned this revision to the author for changes because remote builds failed.Thu, Nov 12, 8:36 PM
Harbormaster failed remote builds in B21088: Diff 25574!
Harbormaster returned this revision to the author for changes because remote builds failed.Mon, Nov 16, 8:58 AM
Harbormaster failed remote builds in B21215: Diff 25732!
sashank edited the test plan for this revision. (Show Details)
dgibson requested changes to this revision.Wed, Nov 18, 3:25 PM

nice, this is a lot more separated - I have some small mostly naming suggestions to take it slightly further so that dagster-graphql doesn't have to care about the fact that it's gRPC, and is listening to slightly more generic disconnect/reconnect/updated events?

python_modules/dagster-graphql/dagster_graphql/implementation/context.py
21–24

this is a lot better in terms of reducing grpc-specific logic in dagster-graphql.

In terms of separation of concerns though it still feels like the graphqlcontext is more aware of the specific implementation of some of its handles than it needs to be?

I think this is mostly a naming thing - but I'm wondering if this should be expressed in terms of the context listening to state changes in all of its locations, without any reference to gRPC specifically? There's actually nothing grpc-specific about how it reacts to these events - if the location goes down or indicates that it has updated , we reload it (the fact that it's the gRPC server going down specifically is an implementation detail of that specific handle - you could imagine reloading a docker container in a similar away because something docker-specific caused it to go down or update).

I think if we replaced

39–40

so if we remove the instance check and make it add_state_subscriber (with no reference to grpc) - and then most handles don't actually emit those events, this class would no longer have anything gRPC specific in it?

python_modules/dagster/dagster/core/host_representation/grpc_server_state_subscriber.py
8–11

so if we wanted to make these events less gRPC specific they could look something like

LOCATION_UPDATED

LOCATION_DISCONNECTED (or LOCATION_UNREACHABLE if you want to go even more generic, but meh)

LOCATION_ERROR

python_modules/dagster/dagster/core/host_representation/handle.py
141–149

suggestion is to put these on the base class, makes the names not gRPC specific, but have them be a no-op on most classes

This revision now requires changes to proceed.Wed, Nov 18, 3:25 PM

above suggestion could let us potentially reload dagit for managed servers as well using the same flows, if we had some way of detecting when the underlying code has changed

python_modules/dagster/dagster/core/host_representation/grpc_server_state_subscriber.py
8–11

good suggestion, will update

sashank edited the summary of this revision. (Show Details)

up

@dgibson thanks for the great suggestions, updated the diff accordingly.

above suggestion could let us potentially reload dagit for managed servers as well using the same flows, if we had some way of detecting when the underlying code has changed

you're totally right. I started exploring that here: https://dagster.phacility.com/D4923. we could have that watchdog just call on_updated on the managed handles :)

Move down re-attaching event subscriber

awesome, I think this looks really good

python_modules/dagster-graphql/dagster_graphql/implementation/context.py
4

is it clear enough that Location is referring to RepositoryLocation here?

60–61

now this can be removed rigth?

python_modules/dagster/dagster/core/host_representation/handle.py
129

Maybe drop the "In error state"? As a user I wouldn't be clear what that meant. Maybe something like "You can reload the server once it is reachable again"?

This revision is now accepted and ready to land.Fri, Nov 20, 3:37 PM

Remove add subscriber from event handler

Update error message. Will addresss rename later in the stack