Return false if describe_tasks isn't consistent

Description

Summary:
Even after 00d73bb346011231f3fb1c43d4f32bfdb63cebe3 and
538c27bcada05674077612eabba7c8566988495f, ECS continues to run into list index
errors:

https://dagster.slack.com/archives/C01U954MEER/p1627421095083100

I haven't been able to reproduce the issue but my best guess is that we're
running into eventual consistency issues with ECS. This is consistent with these
ECS docs:

https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RunTask.html

AWS suggests an exponential backoff of up to 5 minutes. I think that's a little
extreme for our use case - particularly because we don't want to block the
GraphQL query from resolving.
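
For reference, a minimal sketch of the kind of backoff loop the AWS docs
describe, to show how long it could hold up a caller. This is not code from this
diff; describe_with_backoff, its parameters, and the 1 second starting delay are
hypothetical.

```python
import time


def describe_with_backoff(describe, max_wait_seconds=300.0, base_delay=1.0, sleep=time.sleep):
    """Call describe() until it returns a non-empty list or the wait budget runs out.

    `describe` is a hypothetical zero-argument callable standing in for a
    describe_tasks lookup; it returns a (possibly empty) list of tasks.
    """
    waited = 0.0
    delay = base_delay
    while True:
        tasks = describe()
        if tasks:
            return tasks
        # Give up rather than exceed the total budget (5 minutes by default).
        if waited + delay > max_wait_seconds:
            return []
        sleep(delay)
        waited += delay
        delay *= 2  # double the delay after every empty response
```

With these defaults, a task that never becomes visible keeps the caller waiting
for over four minutes before giving up, which is far too long for a GraphQL
resolver.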

Instead, I'm changing the behavior of .can_terminate to return False if we
run into this eventual consistency issue. This means that occasionally, truly
cancellable pipelines will show as unable to cancel. Fortunately, the value of
.can_terminate isn't memoized, so it won't be stuck as uncancellable for the
entire lifetime of the pipeline run.
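
A rough sketch of the idea, not the actual diff: the standalone can_terminate
function, its arguments, and FakeEcsClient are hypothetical. Only the
describe_tasks call shape (cluster and tasks in, a "tasks" list with
"lastStatus" out) follows the boto3 ECS client.

```python
def can_terminate(ecs_client, cluster, task_arn):
    """Return True only if ECS reports the task and it has not stopped."""
    response = ecs_client.describe_tasks(cluster=cluster, tasks=[task_arn])
    tasks = response.get("tasks", [])
    if not tasks:
        # ECS doesn't know about the task yet (eventual consistency).
        # Indexing tasks[0] here would raise the list index error, so
        # report "not cancellable" and let the next call check again.
        return False
    status = tasks[0].get("lastStatus")
    return bool(status) and status != "STOPPED"


class FakeEcsClient:
    """Hypothetical stand-in for a boto3 ECS client, for local experiments."""

    def __init__(self, tasks):
        self._tasks = tasks

    def describe_tasks(self, cluster, tasks):
        return {"tasks": self._tasks, "failures": []}


not_visible_yet = can_terminate(FakeEcsClient([]), "cluster", "task-arn")
running = can_terminate(FakeEcsClient([{"lastStatus": "RUNNING"}]), "cluster", "task-arn")
```

Because nothing is cached, a later call that sees the task in describe_tasks
will return True as usual.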

Test Plan: unit

Reviewers: alangenfeld, dgibson

Reviewed By: dgibson

Differential Revision: https://dagster.phacility.com/D9105

Details

Provenance
jordansanders authored on Jul 27 2021, 10:39 PM
Reviewer
dgibson
Differential Revision
D9105: Return false if describe_tasks isn't consistent
Parents
R1:58983f8a4436: [dagit] TS 4.3.5