Page MenuHomePhabricator

Refactor PipelineRunStorage
ClosedPublic

Authored by max on Aug 7 2019, 6:53 PM.

Details

Reviewers
alangenfeld
Group Reviewers
Restricted Project
Commits
R1:524361d9a76d: Refactor PipelineRunStorage
Summary

Preliminary to further work on distributed logging

Test Plan

Unit

Diff Detail

Repository
R1 dagster
Branch
logsink
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

max created this revision.Aug 7 2019, 6:53 PM
max updated this revision to Diff 3536.Aug 8 2019, 6:48 PM

Refactor

max retitled this revision from Wip to Refactor PipelineRunStorage.Aug 8 2019, 6:54 PM
max edited the summary of this revision. (Show Details)
max added a reviewer: Restricted Project.
alangenfeld added inline comments.
python_modules/dagit/dagit/cli.py
55–63 ↗(On Diff #3538)

for your consideration in follow ups - --log is a bit goofy of a set up for determining storage, we should come up with a more explicit scheme. dagit --storage fs or something like that

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_run_storage.py
32 ↗(On Diff #3538)

kinda feel like we can drop pipeline here, RunStorage communicates whats going on just as effectively imo.

124 ↗(On Diff #3538)

this is used by both dagster-graphql and dagit now so this err message is stale. Also could do a better job communicating that its just moving on instead of failing outright

python_modules/dagster/dagster/core/events/log.py
38 ↗(On Diff #3538)

while a small code change, this is a pretty big system change - can you explain what the motivation here is? Just want to make sure we think through this since I reckon theres no going back as something will depend on these ids quickly.

alangenfeld requested changes to this revision.Aug 8 2019, 7:15 PM

to your queue

This revision now requires changes to proceed.Aug 8 2019, 7:15 PM
max added inline comments.Aug 8 2019, 9:13 PM
python_modules/dagit/dagit/cli.py
55–63 ↗(On Diff #3538)

yep

python_modules/dagster-graphql/dagster_graphql/implementation/pipeline_run_storage.py
32 ↗(On Diff #3538)

ok

124 ↗(On Diff #3538)

ok

python_modules/dagster/dagster/core/events/log.py
38 ↗(On Diff #3538)

yep, i am trying to make sure we have a way to tell if we've seen an event record before or not. i think this will simplify the case where we have two dagit processes writing/reading from a shared filesystem run storage

max updated this revision to Diff 3546.Aug 8 2019, 9:29 PM

Fixup tests

max updated this revision to Diff 3548.Aug 8 2019, 9:31 PM

Black

alangenfeld accepted this revision.Aug 9 2019, 4:45 PM
alangenfeld added inline comments.
python_modules/dagster/dagster/core/events/log.py
38 ↗(On Diff #3538)

I'm inclined to punt this until we know the exact constraints of what we are trying to solve, for example I can imagine choosing a different scheme than uuid

This revision is now accepted and ready to land.Aug 9 2019, 4:45 PM
max added inline comments.Aug 9 2019, 5:02 PM
python_modules/dagster/dagster/core/events/log.py
38 ↗(On Diff #3538)

what do you have in mind? i think hashing is going to suck

max updated this revision to Diff 3554.Aug 9 2019, 5:07 PM

fix dflt

max updated this revision to Diff 3555.Aug 9 2019, 5:35 PM

Omit id for now

max updated this revision to Diff 3556.Aug 9 2019, 5:37 PM

black

max updated this revision to Diff 3558.Aug 9 2019, 6:04 PM

Rebase

max updated this revision to Diff 3559.Aug 9 2019, 6:06 PM

Black

This revision was automatically updated to reflect the committed changes.