Implement task-level retries
AbandonedPublic

Authored by nate on Fri, Nov 22, 5:31 AM.

Details

Reviewers
alangenfeld
Summary

Somewhat of an RFC. Given that Tamas requested this, I wanted to bring this diff back and discuss the overall approach here.

Test Plan

unit

Diff Detail

Repository
R1 dagster
Branch
task_retries
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

nate created this revision. Fri, Nov 22, 5:31 AM
alangenfeld added inline comments. Fri, Nov 22, 4:16 PM
python_modules/dagster/dagster/core/engine/retries.py
12–62

hm - I feel like we won't get the stream of events we want by just trying to wrap. I think we'll need to target https://dagster.phacility.com/source/dagster/browse/master/python_modules/dagster/dagster/core/engine/engine_inprocess.py$480-540 or shuffle that code around.

My approximation of the event sequence we'll want:

  1. start
  2. inputs
  3. (user events)
  4. restart - exception attached
  5. (user events)
  6. outputs
  7. success/fail
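
A rough sketch of that ordering, to make the discussion concrete. This is not the dagster engine code: `run_step_with_retries`, the `(kind, payload)` event tuples, and the `compute_fn(inputs, attempt)` signature are all hypothetical names invented for illustration. It also assumes that outputs from a failed attempt are discarded and only the outputs of the successful attempt are emitted.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

# Hypothetical event shape: (kind, payload). Not dagster's real event type.
Event = Tuple[str, Any]


def run_step_with_retries(
    compute_fn: Callable[[Dict[str, Any], int], Iterator[Event]],
    inputs: Dict[str, Any],
    max_retries: int = 1,
) -> Iterator[Event]:
    """Yield one event stream for a step, restarting on failure."""
    yield ("start", None)
    yield ("inputs", inputs)

    attempt = 0
    while True:
        outputs: List[Any] = []  # buffered so a failed attempt emits no outputs
        try:
            for kind, payload in compute_fn(inputs, attempt):
                if kind == "output":
                    outputs.append(payload)
                else:
                    yield ("user_event", payload)
        except Exception as exc:
            if attempt >= max_retries:
                yield ("fail", exc)
                return
            attempt += 1
            # The exception that caused the retry is attached to the restart event.
            yield ("restart", exc)
            continue

        for value in outputs:
            yield ("output", value)
        yield ("success", None)
        return


def flaky(inputs: Dict[str, Any], attempt: int) -> Iterator[Event]:
    """Hypothetical compute function that fails on its first attempt only."""
    yield ("user_event", "attempt %d" % attempt)
    if attempt == 0:
        raise ValueError("boom")
    yield ("output", inputs["x"] + 1)


events = list(run_step_with_retries(flaky, {"x": 1}))
kinds = [kind for kind, _ in events]
```

Here the retry lives inside the function that produces the event stream rather than in a wrapper around it, which is what lets the restart event sit between the two runs of user events.
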
python_modules/dagster/dagster_tests/core_tests/engine_tests/test_retries.py
28

unrelated: having both metadata and step_metadata_fn on solid is odd - we need to align the names or drop one

nate planned changes to this revision. Fri, Nov 22, 8:58 PM

cool thx, that sounds reasonable to me, will take another pass

nate abandoned this revision. Sat, Nov 30, 6:38 PM