Page MenuHomePhabricator

fix step restart atempts stats calculation
ClosedPublic

Authored by sashank on Thu, Nov 19, 3:53 PM.

Details

Summary

can't add 1 to None this aint php/js

Test Plan

not sure how to produce a repro of this - open to ideas

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

python_modules/dagster/dagster/core/storage/event_log/sql_event_log.py
312–313

should we assume 1 instead of 0 here ? Im still not sure how we got a restart without a start event

This revision is now accepted and ready to land.Thu, Nov 19, 4:10 PM
python_modules/dagster/dagster/core/storage/event_log/sql_event_log.py
215–336

hmm

python_modules/dagster/dagster/core/storage/event_log/sql_event_log.py
232–233

[1]

291–292

we don't need this... we can just add the count to the by_step_query in line 232, right?

sashank edited reviewers, added: alangenfeld; removed: sashank.

add step keys filter, update test to have one other retry step

requesting review again @alangenfeld

python_modules/dagster/dagster/core/storage/event_log/sql_event_log.py
224

Don't need this here since we don't use it

291–292

unfortunately that query does a group by event type, so all the restart events get grouped. we need to be able to count each restart event

291–292

which is why it's down here with materializations and step expectation results, where you can also have > 1 of the event type per step key

alangenfeld added inline comments.
python_modules/dagster/dagster/core/storage/event_log/sql_event_log.py
312–313

maybe do 1, maybe leave a comment?

python_modules/dagster/dagster_tests/core_tests/storage_tests/test_local_instance.py
206–211

_count should get bumped, no? Did you make sure this test failed without the fix?

This revision is now accepted and ready to land.Thu, Nov 19, 7:33 PM
python_modules/dagster/dagster/core/storage/event_log/sql_event_log.py
291–292

But couldn't you add a count to the selection and just use that?

e.g.

by_step_query = (
    db.select(
        [
            SqlEventLogStorageTable.c.step_key,
            SqlEventLogStorageTable.c.dagster_event_type,
            db.func.max(SqlEventLogStorageTable.c.timestamp).label("timestamp"),
            db.func.count(SqlEventLogStorageTable.c.id).label("count"),
        ]
    )
    .where(SqlEventLogStorageTable.c.run_id == run_id)
    .where(SqlEventLogStorageTable.c.step_key != None)
    .where(SqlEventLogStorageTable.c.dagster_event_type.in_(STEP_STATS_EVENT_TYPES))
)
  • fix test case and set default to 1
python_modules/dagster/dagster/core/storage/event_log/sql_event_log.py
291–292

good call

  • use count instead of iteration

@prha updated to use count

This revision is now accepted and ready to land.Thu, Nov 19, 8:34 PM

Should note that there is a logical difference here in the attempts count, depending on what the right behavior is when there is no start event

This revision was landed with ongoing or failed builds.Thu, Nov 19, 9:48 PM
This revision was automatically updated to reflect the committed changes.