Fixes for errors when a daemon thread goes down

Description

Summary:
There are currently two checks for whether a daemon has gone bad: the first looks for dead threads, and the second looks for missing heartbeats. The heartbeat check has to wait much longer before it can conclude anything, but a dead thread can be detected right away, so in that case we can shut down the daemon process much sooner. This diff runs the two checks on different intervals.
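
As a rough illustration of running the two checks on different intervals, here is a minimal sketch. It is not the code from this diff: `run_checks`, `heartbeats_healthy`, and the interval value are hypothetical names and numbers chosen for the example.

```python
# Hypothetical interval, not taken from the diff.
HEARTBEAT_CHECK_INTERVAL_SECONDS = 60


def run_checks(threads, heartbeats_healthy, now, last_heartbeat_check):
    """Check threads on every call; check heartbeats only once per interval.

    Returns a (healthy, last_heartbeat_check) tuple.
    """
    # A dead thread is visible immediately, so look for one on every pass.
    if any(not t.is_alive() for t in threads):
        return False, last_heartbeat_check

    # Missing heartbeats take longer to show up, so check them less often.
    if now - last_heartbeat_check >= HEARTBEAT_CHECK_INTERVAL_SECONDS:
        last_heartbeat_check = now
        if not heartbeats_healthy():
            return False, last_heartbeat_check

    return True, last_heartbeat_check
```

A controller loop would call this on a short interval (a few seconds, say), passing the current time and carrying `last_heartbeat_check` forward between calls.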

Also adds a guard around the heartbeat add function. Before, a transient heartbeat add failure would bring down the whole thread; now we log an error instead. (The process will still shut down eventually if the heartbeat is permanently down; it will just take longer.)
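
The guard described above might look roughly like the sketch below. The helper name `add_heartbeat_guarded`, the `add_heartbeat` callable, and the logger name are assumptions made for illustration, not the identifiers used in the diff.

```python
import logging

logger = logging.getLogger("daemon")  # hypothetical logger name


def add_heartbeat_guarded(add_heartbeat):
    """Call add_heartbeat, logging any failure instead of raising.

    Returns True if the heartbeat was added, False otherwise.
    """
    try:
        add_heartbeat()
        return True
    except Exception:
        # A transient failure should not kill the daemon thread. If the
        # failure is permanent, the missing-heartbeat check catches it later.
        logger.exception("Failed to add heartbeat; continuing")
        return False
```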

Reviewers: johann, alangenfeld

Test Plan: BK

Reviewed By: alangenfeld

Differential Revision: https://dagster.phacility.com/D7285

Details

Provenance
dgibson authored on Apr 2 2021, 7:34 PM
Reviewer
alangenfeld
Differential Revision
D7285: Fixes for errors when a daemon thread goes down
Parents
R1:9a34a487e6c3: Don't rely on COMPOSE_PROJECT_IMAGE to set the pipeline code image in the…
Branches
Unknown
Tags
Unknown