Page MenuHomePhabricator

(Depends on D4070) Incorporated IntermediateStore features into IntermediateStorage, fixed tests and subclasses accordingly.
ClosedPublic

Authored by cdecarolis on Aug 4 2020, 9:05 PM.

Details

Summary

IntermediateStore acts as a layer of indirection between IntermediateStorage and ObjectStore, but it more obfuscates how they are related than helps. This change aims to deprecate and remove IntermediateStore entirely by moving its functionality a layer up, to the IntermediateStorage layer.

Test Plan

Modified IntermediateStore tests to instead test the new IntermediateStorage functionality. Propagated the removal of IntermediateStore to a ton of tests.

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

cdecarolis created this revision.Aug 4 2020, 9:05 PM
Harbormaster returned this revision to the author for changes because remote builds failed.Aug 4 2020, 9:19 PM
Harbormaster failed remote builds in B16476: Diff 20088!
cdecarolis updated this revision to Diff 20101.Aug 4 2020, 11:41 PM

Improved testing coverage

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 4 2020, 11:56 PM
Harbormaster failed remote builds in B16487: Diff 20101!
sandyryza added inline comments.
python_modules/libraries/dagster-aws/dagster_aws/s3/intermediate_storage.py
39

There's a function build_intermediate_storage_from_object_store. Would we be able to use that in places where we'd use this? It seems like there's a decent bit of duplicated code between the IntermediateStore / IntermediateStorage implementations for S3, GCS, and ADLS2.

python_modules/libraries/dagster-azure/dagster_azure/adls2/intermediate_storage.py
8

Does this need to be kept around? Does it get used anywhere?

python_modules/libraries/dagster-gcp/dagster_gcp_tests/gcs_tests/test_intermediate_storage.py
129

Would it make sense to call this intermediate_storage?

cdecarolis added inline comments.Aug 5 2020, 2:52 PM
python_modules/libraries/dagster-aws/dagster_aws/s3/intermediate_storage.py
39

I'll take a look and see what I can do

python_modules/libraries/dagster-azure/dagster_azure/adls2/intermediate_storage.py
8

To my knowledge it's been entirely deprecated, but I wanted to have a separate diff that handled all the removal of these classes / complete deprecation of IntermediateStore.

python_modules/libraries/dagster-gcp/dagster_gcp_tests/gcs_tests/test_intermediate_storage.py
129

Good point, fits the new naming schema better.

cdecarolis updated this revision to Diff 20185.Aug 5 2020, 8:11 PM

Added support for TypeStoragePlugin and made the necessary testing and api changes to get it to work properly.

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 5 2020, 8:25 PM
Harbormaster failed remote builds in B16561: Diff 20185!
cdecarolis updated this revision to Diff 20188.Aug 5 2020, 8:33 PM

Fixed lint issue

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 5 2020, 8:47 PM
Harbormaster failed remote builds in B16563: Diff 20188!
cdecarolis updated this revision to Diff 20203.Aug 5 2020, 10:13 PM

Completely deprecated and removed IntermediateStore.

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 5 2020, 10:28 PM
Harbormaster failed remote builds in B16576: Diff 20203!
cdecarolis updated this revision to Diff 20211.Aug 5 2020, 10:48 PM

Fixed tests

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 5 2020, 11:03 PM
Harbormaster failed remote builds in B16583: Diff 20211!
cdecarolis updated this revision to Diff 20281.Aug 6 2020, 4:15 PM

Fixed an issue with adls tests

cdecarolis updated this revision to Diff 20286.Aug 6 2020, 4:29 PM

Fixed azure test again

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 6 2020, 4:46 PM
Harbormaster failed remote builds in B16638: Diff 20286!
cdecarolis updated this revision to Diff 20293.Aug 6 2020, 4:47 PM

Updated azure tests for TypeStoragePlugin

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 6 2020, 5:17 PM
Harbormaster failed remote builds in B16644: Diff 20293!
cdecarolis edited the summary of this revision. (Show Details)Aug 6 2020, 5:25 PM
cdecarolis edited the test plan for this revision. (Show Details)
cdecarolis added reviewers: sandyryza, nate.
cdecarolis updated this revision to Diff 20297.Aug 6 2020, 6:15 PM

rebased onto master

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 6 2020, 6:34 PM
Harbormaster failed remote builds in B16648: Diff 20297!
cdecarolis updated this revision to Diff 20350.Aug 7 2020, 3:28 PM

Rebased onto master

cdecarolis requested review of this revision.Aug 7 2020, 3:43 PM
sandyryza accepted this revision.Aug 10 2020, 5:09 PM

This looks great! My one lingering concern is about whether we need to maintain compatibility with the TypeStoragePlugin interface for now.

python_modules/dagster/dagster/core/storage/type_storage.py
21–23

I think this is the right direction. However, this is technically a public API that we would be breaking, so I think it probably makes sense to at least separate this part out to a separate change.

If that's a big pain to do at this point, then best would be to post in slack to find out whether anyone thinks it's important to preserve compatibility.

This revision is now accepted and ready to land.Aug 10 2020, 5:09 PM
cdecarolis added inline comments.Aug 11 2020, 1:17 PM
python_modules/dagster/dagster/core/storage/type_storage.py
21–23

That makes sense. While it would be possible to separate out this part of the change, I do think it would end up being quite an involved one. I will post in slack and find out what people think.