This is just an initial commit to see if I can get tests passing. I still need to remove some things here and there to simplify this a bit (e.g. the dynamic download pipeline stuff).
After this goes in, what will the procedure be for making changes to the pipelines? Will there be some intermediate period during which we need to replicate changes across both the public repo and the demo repo? Not the end of the world, but it will ultimately be pretty painful. Do we have a plan for making the demo depend on the public repo version of the pipeline?
1 (On Diff #39224)
Is this needed anymore?
    from typing import NamedTuple

    class ParquetPointer(NamedTuple):
        path: str
        schema: str
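For anyone weighing whether this is still needed, here is a minimal sketch of what the type gives a caller. The class definition mirrors the snippet above; the path and schema values are made up for illustration and are not taken from the pipeline.

```python
from typing import NamedTuple


class ParquetPointer(NamedTuple):
    path: str
    schema: str


# Hypothetical values, for illustration only.
pointer = ParquetPointer(
    path="/tmp/example.parquet",
    schema="id: long, title: string",
)

# Fields can be read by name or by position.
by_name = pointer.path
by_index = pointer[1]

# Like any tuple, it unpacks positionally.
path, schema = pointer

# _asdict() gives a plain mapping, handy for logging or serialization.
as_dict = pointer._asdict()
```

If nothing downstream reads `path` and `schema` together like this, the class is probably safe to drop.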
Re: making changes, I do foresee a period where we'll have to manually replicate changes between the two repos (e.g. once @cdecarolis lands https://dagster.phacility.com/D8236, I'll handle porting that change over here for him). Actually linking the two together will take a bit of work, because I threw out a fair number of things that only made sense in the internal repo, like HNAPISubsampleClient, to make the repo a bit clearer for someone poking around in it.
My hope is that these changes won't be super frequent in the short term, so that the pain will be fairly limited.