
RFC: rough sketch of compute function retries

Authored by nate on Sep 11 2019, 1:12 AM.



Not for check-in - this is a rough sketch of how I'm thinking of implementing compute function retry logic.

I would need to plumb through the configuration (e.g. max retries, but also the time delay / exponential backoff approach). I wanted to get feedback on doing things this way before proceeding.

I think we probably have to implement the retry semantics at the bottom of the stack here, vs. at the top level where we materialize events from the nested generators, but I'm open to other ideas.
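To make the idea concrete, here is a minimal sketch of retrying a generator-style compute function with a max-retries limit and exponential backoff. Everything named here (`RetryConfig`, `retrying_compute`, the zero-argument `compute_fn`) is hypothetical and for illustration only; it is not the code in this diff and not an existing dagster API. It also assumes that events from a failed attempt should be discarded rather than emitted, which is a design choice still up for discussion.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Iterator


@dataclass(frozen=True)
class RetryConfig:
    """Hypothetical retry settings; not an existing dagster class."""

    max_retries: int = 3
    base_delay: float = 1.0  # seconds to wait before the first retry
    backoff_factor: float = 2.0  # multiplier applied on each later retry

    def delay_for(self, attempt: int) -> float:
        # attempt is the 0-based index of the attempt that just failed
        return self.base_delay * self.backoff_factor ** attempt


def retrying_compute(
    compute_fn: Callable[[], Iterator[Any]],
    config: RetryConfig,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Any]:
    """Yield the events of compute_fn, retrying the whole call on failure."""
    for attempt in range(config.max_retries + 1):
        try:
            # Drain the generator before yielding anything, so a failure
            # part-way through does not leak events from a failed attempt.
            events = list(compute_fn())
        except Exception:
            if attempt == config.max_retries:
                raise  # retries exhausted: surface the last error
            sleep(config.delay_for(attempt))
        else:
            yield from events
            return
```

Buffering each attempt keeps downstream consumers from seeing partial output, at the cost of holding one attempt's events in memory. The `sleep` parameter is injectable only so the backoff schedule can be checked without actually waiting.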

Test Plan


Diff Detail

R1 dagster
Lint OK
No Unit Test Coverage

Event Timeline

nate created this revision. Sep 11 2019, 1:12 AM
nate retitled this revision from RFC: rough implementation of compute function retries to RFC: rough sketch of compute function retries. Sep 11 2019, 1:21 AM
nate edited the summary of this revision.
nate added reviewers: alangenfeld, schrockn.

Hm, did you investigate making the change at the compilation step, i.e. step.compute_fn? I guess the question is whether the engines should be aware of the retries.
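A rough illustration of the compilation-step alternative, assuming a simplified step object: the retry loop is baked into `compute_fn` when the step is built, so the engine just iterates it and never sees the retries. `Step`, `with_retries`, and `run_step` are hypothetical stand-ins, not dagster's actual classes or signatures.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Iterator, List


@dataclass(frozen=True)
class Step:
    """Hypothetical stand-in for a compiled execution step."""

    key: str
    compute_fn: Callable[[], Iterator[Any]]


def with_retries(step: Step, max_retries: int) -> Step:
    """Return a copy of step whose compute_fn retries on any exception."""
    inner = step.compute_fn

    def compute_fn() -> Iterator[Any]:
        for attempt in range(max_retries + 1):
            try:
                # Collect one full attempt before emitting any events.
                events = list(inner())
            except Exception:
                if attempt == max_retries:
                    raise
            else:
                yield from events
                return

    return replace(step, compute_fn=compute_fn)


def run_step(step: Step) -> List[Any]:
    # Stand-in for an engine: it only iterates compute_fn and has no
    # knowledge of whether retries are happening underneath.
    return list(step.compute_fn())
```

The tradeoff is the one raised in the question: wrapping at this level keeps engines simple, but it also means an engine cannot reschedule or report individual attempts.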

schrockn requested changes to this revision. Sep 11 2019, 10:21 PM

Going to do run-based retry first, yes?

This revision now requires changes to proceed. Sep 11 2019, 10:21 PM
nate abandoned this revision. Oct 2 2019, 4:10 PM

gonna abandon this for now, will put up something else when we're ready to implement task-level retry semantics