Page MenuHomePhabricator

[2] run-scoped file manager resources
Needs RevisionPublic

Authored by sashank on Jul 23 2020, 12:26 AM.

Details

Summary

Stacks on D3953. Recreate the run-id prefixing behavior of system_storage definitions on the file manager resources. If this defeats the purpose of dismantling system_storage as a concept, that could make sense, but I think that this is a reasonable behavior and that going down this route is justified because unlike intermediate_storage, file managers aren't planned to move to the instance level and thus can maintain their relativity to the pipeline run.

Test Plan

tbd

Diff Detail

Repository
R1 dagster
Branch
more-file-manager-resources
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

aj.nadel created this revision.Jul 23 2020, 12:26 AM

add run_scoped_local_file_manager, change keys

Harbormaster returned this revision to the author for changes because remote builds failed.Jul 23 2020, 12:57 AM
Harbormaster failed remote builds in B15822: Diff 19347!

I like this as an in-between step for file managers, tbch

aj.nadel requested review of this revision.Jul 27 2020, 7:06 PM
max added a comment.Jul 30 2020, 4:01 PM

requesting changes for the os-specific path construction, but i like this. i'd like to see tests and fleshed out docstrings with examples

python_modules/dagster/dagster/core/storage/file_manager.py
280

hm, if we want the cwd to be the default base path, let's set this to None and use os.getcwd in the resource function instead. my only worry is these paths don't match the default paths for gcs and s3 -- let's consider {cwd}/dagster/{run_id}/ instead

283

use os.path.join

max requested changes to this revision.Jul 30 2020, 4:01 PM
This revision now requires changes to proceed.Jul 30 2020, 4:01 PM

This LGTM! Maybe we should just replace the non-run-scoped one with this. Thoughts @max?

aj.nadel marked 2 inline comments as done.

up

Unfinished but putting up a couple example docstrings for feedback

python_modules/dagster/dagster/core/storage/file_manager.py
234–296

@max is this approx what you had in mind? I can also do something less kitschy and more practical

python_modules/dagster/dagster_tests/core_tests/storage_tests/test_local_file_manager.py
88–117

Ok to use docstring code examples as tests?

add tests and generate docs from them

max added a comment.Aug 17 2020, 2:24 PM

this looks great, i think i agree with sandy that this is preferable to non-run-scoped.

python_modules/dagster/dagster/core/storage/file_manager.py
174

is there any way to later figure out which uuids are associated with which files?

182

this is nice

204

provide an example for this too

216

typo

231

typo; also, might be nice if the default directory hierarchy made it clearer that these were files, sth like dagster/file_manager/{run_id} ?

274

same typo

python_modules/dagster/dagster_tests/core_tests/storage_tests/test_local_file_manager.py
27

is this true??? if so we should wrap this inside seven.TemporaryDirectory

python_modules/libraries/dagster-aws/dagster_aws/s3/resources.py
109

helpful to give an example

119

fmt

162

dupe?

max added a comment.Aug 17 2020, 2:24 PM

are the test failures real? also, docs for GCS and ADLS2?

sandyryza requested changes to this revision.Aug 18 2020, 4:18 PM

Requesting changes for queue management

This revision now requires changes to proceed.Aug 18 2020, 4:18 PM
sashank commandeered this revision.Tue, Sep 8, 9:45 PM
sashank edited reviewers, added: aj.nadel; removed: sashank.