Page MenuHomePhabricator

Dagster Type Guide
ClosedPublic

Authored by schrockn on Thu, Feb 6, 7:33 PM.

Details

Summary

Adding Dagster Type Guide to explain the new API

Test Plan

read

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

General comments on first pass through. It's looking good!

docs/sections/learn/guides/dagster_types.rst
45

can we throw all code blocks for documentation into examples and refer to them via code lines. This way whenever somebody changes a call-site tests fail notifying them to fix something rather than hoping that someone will remember to update the docs during review time.

49

might be worth getting rid of these comments. It's noisy.

113

Is this really the right way to do this? I thought the intention was to have the assert predicate in a type check method? This seems sketchy as you would get assertion errors during runtime. This is bad because it is possible to turn off asserts globally in the python interpreter and you loose all the metadata that you get with a type check failure.

Maybe I am missing a lot of context here from a past revision, in which case, this shouldn't block the revision.

152

the general case?

194

I feel like we can be a lot more succinct in this section.

198

Might be worth not mentioning the intermediate store, if a person hasn't read about intermediates, this could be confusing. Metadata is a pretty simple concept to explain on its own without introducing a dagster concept.

max added inline comments.Mon, Feb 10, 9:49 PM
docs/sections/learn/guides/dagster_types.rst
7

somewhere in here, maybe at the bottom, we should have some verbiage like "The dagster type system is independent from the PEP 484 Python type system, although we overload the type annotation syntax on functions to make it easier to specify the input and output types of your solids. Python classes are *not* necessarily Dagster types. If you're using a static type checker like mypy, please...."

9

A solid's metadata describes the conditions -- the state of the external world -- that must hold for the computation to succeed.

16

this could usefully be reworded

17

concise example could be useful. really solids declare a schema for their configuration.

18

hm, interesting -- what are you trying to evoke by this distinction? i presume it's to point out to the reader that not everything on which a solid operates has to flow between solids. but fwiw, even in the case of a database table, the reference to the table is managed by the intermediate store.

27

this is interesting but a little arcane -- not sure the best place to put this tho there is obv a class of readers who will appreciate this pointer

33

the typechecks are flexible, but i don't see what they have to do with being optional

35

from existing (data?)

49

hm, i think this is the best way to communicate the signature of the type_check_fn

78

*static

81

type_check_fn

93

i think you mean, a python type that is directly usable as a dagster type

94

this is potentially misleading, i would qualify it -- "there is a 1:1 relationship between each of these python types"

138

by their inputs

139

for a solid

140

use either the singular or the plural for data but not both in the same para

148

cut this line

185

provide example

190

consolidate para into one sentence

196

if i understand this correctly, you're trying to point to a couple of cross-cutting concerns:

  1. is the data passed from solid to solid, or is something else (metadata, a reference to data, or a sentinal value) passed instead? i.e., what does the intermediates machinery & serialization actually touch, what is the overhead, etc.
  1. how would you stub each of these types of input or output for test?
  1. what is it possible for a type check to usefully assert in each case?
248

this parenthetical note should be expanded into a separate section on fan-in

283

define their own

289

both in type annotations..

293

@; link to the sphinx autogen docs using :py:func:

337

maybe flesh this out

342

therefore.. must not

359

is there an open issue

docs/sections/learn/guides/dagster_types.rst
49

Implementers choice, but I would prefer regular english. The signature of the type_check_fn is explicitly described in english below this. If people really want to know signatures, we should have this in docstrings. Right now, I think it distracts more than it helps.

alangenfeld requested changes to this revision.Mon, Feb 10, 9:57 PM

Q mgmt

This revision now requires changes to proceed.Mon, Feb 10, 9:57 PM
schrockn added inline comments.Tue, Feb 11, 1:04 AM
docs/sections/learn/guides/dagster_types.rst
45

ya

93

sure do!

94

not following this one

113

Well, this contrived example is imagining a business object, however contrived. Imagine this class being used outside of dagster. This is a reasonable thing to do there. The point is that meaningful validation is done at construction time. Once constructed the only think dagster needs to do is check instanceof.

140

google docs failed me!

152

this is more of a note to myself and us. we need to finalize this before wed

schrockn added inline comments.Tue, Feb 11, 1:05 AM
docs/sections/learn/guides/dagster_types.rst
45

early return.

ya that is the plan. should have made that clear in summary. getting feedback on core content before getting down and dirty on the literal includes, which are a pain

schrockn marked 20 inline comments as done.Tue, Feb 11, 10:16 PM
schrockn added inline comments.
docs/sections/learn/guides/dagster_types.rst
17

imo this should be left for different part of the document. (or the config document). Would like to leave this intro section example code free

18

I'm setting up the taxonomy we describe later in terms of different types of inputs. The data dependencies are really the super set of data, metadata, Nothings, etc. Trying to tee that up

27

yeah agree. also not useful to include given that we haven't implemented this nature of tooling ourselves.

33

the default to 'Any', combined with the arbitrary strictness of the typecheck, make the type system qualify as optional

49

we can leave the sig for the docblock

185

I'm going to leave this as a TODO for now

185
196

I'm just trying to run through the use cases and delienate what it gets you in each case

196

I ran this by dwall and he said it was clarifying, if we are looking for a data point that isn't us

198

Generally the tone of these guide are less tutorial and more definitive source in which case we should feel free to refer to other concepts I think

248

Added issue here

248
337

hmmmm document is quite long already

359

in mypy? no idea

schrockn updated this revision to Diff 9544.Tue, Feb 11, 10:20 PM
schrockn marked 6 inline comments as done.

feedback; next step is to switch to code literal includes

alangenfeld accepted this revision.Wed, Feb 12, 1:13 AM

this is within striking distance - accepting so you can proceed at your discretion

do a final pass against previous in line comments before you land

docs/sections/learn/guides/dagster_types.rst
5–41

maybe a lot for one section - consider some subheadings?

46
:py:class:`DagsterType`

dont need to go crazy - but I feel this instance of DagsterType should deep link to API docs

52

update the type check functions now that i landed the context change

90

:py:class: these if they dont already deep link

202

:py:class Nothing

209–210

to express a data dependencies

to express one
to express a data dependency

211–212

are operational concerns only

reads weird

284

^ marked as done but typo still present

examples/dagster_examples_tests/intro_tutorial_tests/test_type_guide.py
25

this file will need updates post rebase

This revision is now accepted and ready to land.Wed, Feb 12, 1:13 AM
max added inline comments.Wed, Feb 12, 7:19 PM
docs/sections/learn/guides/dagster_types.rst
33

well, it's optional whether to impose types at all -- then the type checks can be as strict as they want to be. not worth harping on and i hate every part of my own soul that is tempted to care about type system terminology.

46

i encourage going crazy on this stuff, it feels insane to write but is so helpful for users

78

*systems

94

what i mean is that there isn't a 1:1 relationship between every python type and a corresponding dagster type -- just *these* types -- i stumbled when i first read this sentence

113

"Suppose you have a business object that does all of its meaningful validation when it is constructed..."

185

cf the custom types tutorial

214

Uses of the :py:class:~dagster.Nothing type typically point to a situation in which although there does exist some semantic dependency between two solids, usually to do with external state -- "such and such a table was created in a database by the upstream solid", "such and such upstream solid ran successfully within the last SLA period" -- that dependency isn't yet described and encoded in metadata. In general, this kind of dependency makes a pipeline harder to understand, and solids within that pipeline harder to test and reuse. or sth

291

as deliver

schrockn retitled this revision from Dagster Type Guide Stab at dagster docs to Dagster Type Guide.Wed, Feb 12, 7:51 PM
schrockn edited the summary of this revision. (Show Details)
This revision was automatically updated to reflect the committed changes.