Page MenuHomePhabricator

3/ Introduce @repository decorator
Needs RevisionPublic

Authored by max on Fri, May 22, 11:08 PM.

Details

Summary

Introduces @repository as shorthand for constructing RepositoryDefinitions directly.

Test Plan

Unit

Diff Detail

Repository
R1 dagster
Branch
reexecution-buton
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

max created this revision.Fri, May 22, 11:08 PM
max requested review of this revision.Fri, May 22, 11:20 PM
alangenfeld added inline comments.Sun, May 24, 4:29 PM
python_modules/dagster/dagster/core/definitions/decorators.py
852

ima file a starter task to break this big file up in to directory

871–917

I think this is an interesting decision - in my head the way this would work is we would take list of heterogenous types: PipelineDefinition, ScheduleDefinition, PartitionSetDefinition, and tuple[str, Callable[[], PipelineDefinition]]

but i could see merits to being less dynamic / magical. Curious to hear yalls thoughts

schrockn requested changes to this revision.Tue, May 26, 1:53 AM

So building on my inline comment. I would suggest two versions of this API.

One the "starter" version, where the repository returns an array of definitions.

Then an "advanced" version that returns a more structured object that is designed for lazy evaluation at definition granularity. Alternatively this could be a wholly separate top-level decorator. However this would return an object like a DefinitionFactory which has an interface with get_pipeline(), get_schedule(), and so forth, providing maximal flexibility for people who want to do stuff like build definitions on demand that are generated from DSLs (e.g. like yaml-driven pipeline definitions).

python_modules/dagster/dagster/core/definitions/decorators.py
896

I think it's a bit strange for the repository decorator to take partition_set_defs and schedule_defs as arguments, but return pipeline defs. This will almost certainly expand to include standalone solid definitions in the future and the who knows?

IMO, one of the goals of this new API should be potentially 100% deferred evaluation. Imagine a huge monorepo with dozens or hundreds of repos.

I would recommend having the easy version of this API return an array of defs. I think we should have the optionality of making the evaluation of that function lazy, and maybe even do that now.

This revision now requires changes to proceed.Tue, May 26, 1:53 AM
max added a comment.Tue, May 26, 7:12 PM

i'm not sure i understand what you have in mind here - are you looking for lazy evaluation of objects other than pipeline definitions?