
Needs Revision (Public)

Authored by sandyryza on Feb 10 2021, 12:51 AM.



We've gotten a few questions on GitHub and Slack about choosing how inputs and outputs are loaded and stored based on their type. For users who don't care about storing their intermediates in a data warehouse, but who need their unpickleable types to make it across process boundaries, this is a pretty salient need.

I tried to write an example, and it ended up requiring a fair amount of code. This diff proposes adding a Dagster-provided user-space utility for constructing a type-based IO manager, to ease the process.

There are some hairy bits that come from a resource wrapping other resources.
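
To make the idea concrete, here is a minimal sketch of type-based dispatch in plain Python. It is not the code in this diff and it does not use Dagster's API: `TypeBasedIOManager`, `PickleHandler`, `ConnectionHandler`, and `Connection` are hypothetical names, and the `key` and `typ` arguments stand in for information that a real IO manager would presumably read from its context object. The point is only to show one manager wrapping several others and picking one by type.

```python
import pickle
from typing import Any, Dict


class PickleHandler:
    """Hypothetical default handler: keeps pickled bytes in an in-memory dict."""

    def __init__(self) -> None:
        self.storage: Dict[str, bytes] = {}

    def handle_output(self, key: str, obj: Any) -> None:
        self.storage[key] = pickle.dumps(obj)

    def load_input(self, key: str) -> Any:
        return pickle.loads(self.storage[key])


class Connection:
    """Hypothetical stand-in for a type that should not be pickled."""

    def __init__(self, dsn: str) -> None:
        self.dsn = dsn


class ConnectionHandler:
    """Hypothetical handler: stores only the DSN and rebuilds the object on load."""

    def __init__(self) -> None:
        self.storage: Dict[str, str] = {}

    def handle_output(self, key: str, obj: Connection) -> None:
        self.storage[key] = obj.dsn

    def load_input(self, key: str) -> Connection:
        return Connection(self.storage[key])


class TypeBasedIOManager:
    """Delegates to the first wrapped handler whose registered type matches."""

    def __init__(self, handlers_by_type: Dict[type, Any], default: Any) -> None:
        self.handlers_by_type = handlers_by_type
        self.default = default

    def _handler_for(self, typ: type) -> Any:
        # Dicts preserve insertion order, so earlier registrations win.
        for registered_type, handler in self.handlers_by_type.items():
            if issubclass(typ, registered_type):
                return handler
        return self.default

    def handle_output(self, key: str, obj: Any) -> None:
        self._handler_for(type(obj)).handle_output(key, obj)

    def load_input(self, key: str, typ: type) -> Any:
        # The declared type is needed on load, since there is no object to inspect.
        return self._handler_for(typ).load_input(key)


manager = TypeBasedIOManager({Connection: ConnectionHandler()}, PickleHandler())
manager.handle_output("numbers", [1, 2, 3])
manager.handle_output("conn", Connection("db://example"))
restored = manager.load_input("conn", Connection)
```

In this sketch the wrapped handlers are plain objects owned by the manager. In Dagster they would themselves be resources, which is where the hairy bits mentioned above come from.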

Test Plan


Diff Detail

R1 dagster
type-based-io-manager (branched from master)
Lint Passed
No Test Coverage

Event Timeline

sandyryza added reviewers: yuhan, schrockn, cdecarolis.
sandyryza edited the summary of this revision.
Harbormaster returned this revision to the author for changes because remote builds failed. (Feb 10 2021, 1:12 AM)
Harbormaster failed remote builds in B25536: Diff 31166!

Will take a closer look tomorrow, but the idea feels good to me. A good before-after would of course be the Snowflake IO manager on internal.

Queue management: this is very stale. If it is still in a reviewable state, bounce it back and I'll review.

This revision now requires changes to proceed. (Mar 17 2021, 6:57 PM)