
[celery-k8s] recover on celery task restart
Closed, Public

Authored by alangenfeld on Oct 22 2020, 2:58 PM.

Details

Reviewers
catherinewu
Group Reviewers
Restricted Project
Commits
R1:90738e84c984: [celery-k8s] recover on celery task restart
Summary

From an observed log: if the celery workers are, for example, scaled down mid-task, the celery task is retried and we hit an error because the k8s job we intend to create already exists.

Instead of exiting on that error, we can just proceed, and the retried task should recover gracefully by picking up the existing job.

The risk here is that the existing job isn't the one we expect despite having the same name, but I'm not sure under what conditions that could occur.
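
A minimal sketch of the recovery idea, not the actual change in executor.py. It assumes the "already exists" case surfaces as an API exception with HTTP status 409, as the Kubernetes Python client's `ApiException` exposes via `.status`. To stay self-contained, `ApiException` and `FakeBatchApi` below are stand-ins, and `create_job_or_recover` is a hypothetical helper name.

```python
class ApiException(Exception):
    """Stand-in for kubernetes.client.rest.ApiException (assumption: only .status is used)."""

    def __init__(self, status):
        super().__init__("status {}".format(status))
        self.status = status


class FakeBatchApi:
    """Hypothetical in-memory stand-in for a k8s batch API client."""

    def __init__(self):
        self.jobs = set()

    def create_namespaced_job(self, namespace, body):
        key = (namespace, body["metadata"]["name"])
        if key in self.jobs:
            # The real API server answers 409 Conflict when the job name is taken.
            raise ApiException(status=409)
        self.jobs.add(key)


def create_job_or_recover(api, namespace, job_body):
    """Create the job; return False instead of failing if it already exists.

    Returning False signals a retried celery task that it should proceed
    with the job left behind by the earlier attempt.
    """
    try:
        api.create_namespaced_job(namespace=namespace, body=job_body)
        return True
    except ApiException as e:
        if e.status == 409:
            return False
        # Any other API failure is still treated as an error.
        raise
```

With this shape, the first attempt creates the job and a retry of the same task gets `False` and carries on. The sketch does not address the risk noted above: it trusts that a job with the expected name is the expected job.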

Test Plan

bk

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

johann added inline comments.
python_modules/libraries/dagster-celery-k8s/dagster_celery_k8s/executor.py
427

conflicts w/ comment

This revision is now accepted and ready to land.
Mon, Nov 2, 3:36 PM