Page MenuHomePhabricator

RFC: composable type loaders and materializers
Needs ReviewPublic

Authored by schrockn on Tue, Nov 17, 1:30 PM.



This is not urgent and not even a real RFC. More of an
exploration. I wanted to put something out there to solve what I deemed
the "composability problem" in the type system. The thesis is that in
reality our types and all the attached artitfacts are actually just
business logic and quite application-specific. Therefore we should
providde APIs to build types rather than to have prefabricated types.

The first step is making a layer and allows one to compose type loaders
and materializers out of individual entries. This is cleaner as you no
longer have to manage the construction of the Selector and a monolithic
config processing function. More interestingly I think this will allow
things like cross-cutting changies to loaders and materializer
behaviors. E.g. having pre-built entries for interacting with asset
stores or filtering out entries based on mode.

This also plays with the idea of having a "builder" api that builds
types by annotating the core typecheck fn.

Test Plan


Diff Detail

R1 dagster
Lint OK
No Unit Test Coverage

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Tue, Nov 17, 1:48 PM
Harbormaster failed remote builds in B21256: Diff 25784!
schrockn edited the summary of this revision. (Show Details)
schrockn edited the test plan for this revision. (Show Details)
schrockn added reviewers: alangenfeld, sandyryza.


I think the thrust of this is purely good, exposing this capability of building up from a list of loader entries.
Open questions

  • TypeBuilder as separate versus just adding loader_entries on DagsterType
  • making it easy to build ScalarUnion[T, Selector] in the case where a direct value also makes sense

I strongly agree with the thinking behind this. Agree with Alex that this is a strict improvement over what we have.

A few thoughts:

  • I think it's worth thinking through whether AssetStores can subsume loaders and materializers. (Not something I have developed an opinion on.)
  • I think a lot of the need for selectors in materializers and loaders comes from the fact that we try to provide out-of-the-box materializers and loaders, which necessarily need to be flexible. I would suspect that, if users were expected to write their own application-specific, those often would not need selectors.
  • Did you consider implementing this via composite materializers / loaders? I.e. something like my_materializer = composite_materializer({"parquet": parquet_materializer, "csv": csv_materializer}). Would avoid introducing new concepts. Misses out on the magic of using the entry function names as the selector keys, but less magic can be a good thing.