[lakehouse] RFC: bake policies into storage defs
Closed, Public

Authored by sandyryza on Jul 14 2020, 4:25 PM.

Details

Summary

I got frustrated with how much code was required to define a basic lakehouse. This simplifies the storage interface: a storage now just saves and loads objects. For the multi-type case, where Spark DFs and Pandas DFs are saved / loaded in different ways, there's a special storage implementation that dispatches to sub-storages.
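A rough sketch of the shape described above, not the code in this diff. Every name here (`Storage`, `InMemoryStorage`, `MultiTypeStorage`, the `path` / `as_type` parameters) is hypothetical, and plain `list` / `dict` stand in for Spark and Pandas DataFrames so the example runs without either library.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple, Type


class Storage(ABC):
    """Hypothetical minimal interface: a storage only saves and loads."""

    @abstractmethod
    def save(self, path: Tuple[str, ...], obj: Any) -> None:
        ...

    @abstractmethod
    def load(self, path: Tuple[str, ...], as_type: Type) -> Any:
        ...


class InMemoryStorage(Storage):
    """Toy storage that keeps objects in a dict keyed by path."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, ...], Any] = {}

    def save(self, path: Tuple[str, ...], obj: Any) -> None:
        self._data[path] = obj

    def load(self, path: Tuple[str, ...], as_type: Type) -> Any:
        return self._data[path]


class MultiTypeStorage(Storage):
    """Dispatches save/load to a sub-storage chosen by in-memory type."""

    def __init__(self, storages_by_type: Dict[Type, Storage]) -> None:
        self._storages_by_type = storages_by_type

    def _storage_for(self, typ: Type) -> Storage:
        # First registered type that `typ` is a subclass of wins.
        for registered, storage in self._storages_by_type.items():
            if issubclass(typ, registered):
                return storage
        raise TypeError(f"no sub-storage registered for {typ.__name__}")

    def save(self, path: Tuple[str, ...], obj: Any) -> None:
        self._storage_for(type(obj)).save(path, obj)

    def load(self, path: Tuple[str, ...], as_type: Type) -> Any:
        return self._storage_for(as_type).load(path, as_type)


# Stand-ins: `list` plays the Spark DF role, `dict` the Pandas DF role.
list_storage = InMemoryStorage()
dict_storage = InMemoryStorage()
storage = MultiTypeStorage({list: list_storage, dict: dict_storage})

storage.save(("raw", "events"), [1, 2, 3])
storage.save(("raw", "users"), {"id": 1})
events = storage.load(("raw", "events"), list)
```

In this sketch, supporting another in-memory type means registering one more entry in the dispatch dict.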

Test Plan

bk

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

sandyryza retitled this revision from RFC: bake policies into storage defs to [lakehouse] RFC: bake policies into storage defs. Jul 14 2020, 4:31 PM
sandyryza edited the summary of this revision.
Harbormaster returned this revision to the author for changes because remote builds failed. Jul 14 2020, 4:37 PM
Harbormaster failed remote builds in B15298: Diff 18727!
Harbormaster returned this revision to the author for changes because remote builds failed. Jul 14 2020, 4:53 PM
Harbormaster failed remote builds in B15301: Diff 18730!
Harbormaster returned this revision to the author for changes because remote builds failed. Jul 14 2020, 8:32 PM
Harbormaster failed remote builds in B15338: Diff 18776!
Harbormaster returned this revision to the author for changes because remote builds failed. Jul 15 2020, 12:10 AM
Harbormaster failed remote builds in B15347: Diff 18787!

ya, I think this is a bit better. Scaling to lots of types seems like it might have some rough ergonomics, but I'm sure this isn't the last revision of how the lakehouse storage stuff is set up.

This revision is now accepted and ready to land. Jul 20 2020, 3:37 PM