Page MenuHomeElementl

[memoization 2/n] provide default base dir for versioned fs io manager
ClosedPublic

Authored by cdecarolis on Jul 26 2021, 5:03 PM.

Details

Summary

Retrieve default base dir off instance for default fs io manager

Test Plan

unit tests

Diff Detail

Repository
R1 dagster
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

python_modules/dagster/dagster/core/storage/memoizable_io_manager.py
43

[1]

92โ€“96

is the root instance storage directory right here? Looking at [1] are we worried at all about conflicts between different pipelines/jobs or anything?

Maybe at least put it in a "versioned_outputs" or "versioned" folder inside of "storage"

I just think it would be a bit odd to have all the run_id directories (which is what everything else writing to "storage" does I believe) then also all these solid name directories at the same level.

python_modules/dagster/dagster_tests/core_tests/storage_tests/test_memoizable_io_manager.py
84

nit: this feels like a touch over specific. I doubt it would break since its going to be hard to change these paths, but the exact directory feels less important than that its in the instance directory

python_modules/dagster/dagster/core/storage/memoizable_io_manager.py
84

not directly related to this diff: shall we rename this to versioned_fs_io_manager to pair with fs_io_manager?

92โ€“96

+1 to adding an extra layer like "versioned_outputs"/"versioned"

python_modules/dagster/dagster_tests/core_tests/storage_tests/test_memoizable_io_manager.py
84

should we just call instance.storage_directory() instead of constructing the path here?

cc @yuhan @alangenfeld might abandon this in favor of https://dagster.phacility.com/D9085, but the discussion is still relevant I think. I'm certainly open to having a versioned_outputs directory or something at that layer.

This revision is now accepted and ready to land.Jul 29 2021, 7:25 PM