Page MenuHomePhabricator

Refactor PipelineRunStorage
ClosedPublic

Authored by max on Aug 7 2019, 6:53 PM.

Details

Reviewers
alangenfeld
Group Reviewers
Restricted Project
Commits
R1:524361d9a76d: Refactor PipelineRunStorage
Summary

Preliminary to further work on distributed logging

Test Plan

Unit

Diff Detail

Repository
R1 dagster
Branch
logsink
Lint
Lint WarningsExcuse: wip
Unit
No Unit Test Coverage

Event Timeline

max created this revision.Aug 7 2019, 6:53 PM
max updated this revision to Diff 3536.Aug 8 2019, 6:48 PM

Refactor

max retitled this revision from Wip to Refactor PipelineRunStorage.Aug 8 2019, 6:54 PM
max edited the summary of this revision. (Show Details)
max added a reviewer: Restricted Project.
alangenfeld added inline comments.
python_modules/dagit/dagit/cli.py
57–65

for your consideration in follow ups - --log is a bit goofy of a set up for determining storage, we should come up with a more explicit scheme. dagit --storage fs or something like that

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_run_storage.py
36–43

kinda feel like we can drop pipeline here, RunStorage communicates whats going on just as effectively imo.

125

this is used by both dagster-graphql and dagit now so this err message is stale. Also could do a better job communicating that its just moving on instead of failing outright

python_modules/dagster/dagster/core/events/log.py
38

while a small code change, this is a pretty big system change - can you explain what the motivation here is? Just want to make sure we think through this since I reckon theres no going back as something will depend on these ids quickly.

alangenfeld requested changes to this revision.Aug 8 2019, 7:15 PM

to your queue

This revision now requires changes to proceed.Aug 8 2019, 7:15 PM
max added inline comments.Aug 8 2019, 9:13 PM
python_modules/dagit/dagit/cli.py
57–65

yep

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_run_storage.py
36–43

ok

125

ok

python_modules/dagster/dagster/core/events/log.py
38

yep, i am trying to make sure we have a way to tell if we've seen an event record before or not. i think this will simplify the case where we have two dagit processes writing/reading from a shared filesystem run storage

max updated this revision to Diff 3546.Aug 8 2019, 9:29 PM

Fixup tests

max updated this revision to Diff 3548.Aug 8 2019, 9:31 PM

Black

alangenfeld accepted this revision.Aug 9 2019, 4:45 PM
alangenfeld added inline comments.
python_modules/dagster/dagster/core/events/log.py
38

I'm inclined to punt this until we know the exact constraints of what we are trying to solve, for example I can imagine choosing a different scheme than uuid

This revision is now accepted and ready to land.Aug 9 2019, 4:45 PM
max added inline comments.Aug 9 2019, 5:02 PM
python_modules/dagster/dagster/core/events/log.py
38

what do you have in mind? i think hashing is going to suck

max updated this revision to Diff 3554.Aug 9 2019, 5:07 PM

fix dflt

max updated this revision to Diff 3555.Aug 9 2019, 5:35 PM

Omit id for now

max updated this revision to Diff 3556.Aug 9 2019, 5:37 PM

black

max updated this revision to Diff 3558.Aug 9 2019, 6:04 PM

Rebase

max updated this revision to Diff 3559.Aug 9 2019, 6:06 PM

Black

This revision was automatically updated to reflect the committed changes.