Page MenuHomePhabricator

POC of retryable exception handling in Celery engine
Needs RevisionPublic

Authored by max on Dec 19 2019, 2:16 AM.

Details

Summary

Solids may be tagged with exception classes on which they should retry.

Test Plan

Unit

Diff Detail

Repository
R1 dagster
Branch
arcpatch-D1707
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

max created this revision.Dec 19 2019, 2:16 AM
max updated this revision to Diff 7989.Dec 19 2019, 3:07 AM

Rebase

max updated this revision to Diff 8003.Dec 19 2019, 3:43 PM

Fix types

max updated this revision to Diff 8004.Dec 19 2019, 4:28 PM

Fix tests

schrockn requested changes to this revision.Dec 19 2019, 4:47 PM
schrockn added inline comments.
python_modules/dagster-celery/dagster_celery/engine.py
103

would be nice to replace this with https://github.com/dagster-io/dagster/issues/2014 at some point

python_modules/dagster-celery/dagster_celery/tasks.py
48

do we want to standardize on using parens in these cases? I don't have a strong opinion but would be nice to be consistent in the codebase

python_modules/dagster-celery/dagster_celery_tests/test_retry.py
94

what is behavior exactly of throwing a non retry_on exception? add test?

This revision now requires changes to proceed.Dec 19 2019, 4:47 PM
max updated this revision to Diff 8044.Dec 19 2019, 9:20 PM

Tests

python_modules/dagster-celery/dagster_celery/tasks.py
48

i'm indifferent, i added them here only because i think it's a little clearer with the _

python_modules/dagster-celery/dagster_celery_tests/test_retry.py
94

yep

max updated this revision to Diff 8056.Dec 19 2019, 10:27 PM

Rebase

schrockn added inline comments.Dec 19 2019, 10:51 PM
python_modules/dagster-celery/dagster_celery_tests/test_retry.py
94

is there a way to specify retry on *any* exception?

nate added a comment.Dec 19 2019, 11:08 PM

Sweet, thanks for starting this! My one request here is that it would be nice for retry solids to specify the number of retries and the retry strategy (e.g. retry sequentially vs. exponential backoff vs. uniform sleep)

Generally, I think it's less likely to want to vary the retry behavior with respect specific exceptions, but I'm up for convincing otherwise if you think that's important?

schrockn requested changes to this revision.Dec 21 2019, 4:35 PM
  • Requesting changes because I think we need to do some first-principles thinking on what message we want to emit during a retry run and what they mean.
python_modules/dagster-celery/dagster_celery/engine.py
96

So is relationship here that we are going to emit this message and our step failure message for a failure on celery?

python_modules/dagster-celery/dagster_celery/tasks.py
51

I'm not sure that we should be creating a message with the ERROR setting on a retry. This puts us in a new position where we can have a step emit an ERROR message but still successfully complete.

This revision now requires changes to proceed.Dec 21 2019, 4:35 PM