Page MenuHomeElementl

[memoization 13/13] revamp memoization docs
Needs ReviewPublic

Authored by cdecarolis on Aug 10 2021, 10:05 PM.
Referenced Files
F2982825: D9219.id42854.diff
Sun, May 28, 9:20 AM
Unknown Object (File)
Tue, May 16, 11:59 PM
Unknown Object (File)
Thu, May 11, 2:07 AM
Unknown Object (File)
Tue, May 9, 11:44 AM
Unknown Object (File)
Sat, Apr 29, 7:58 PM
Unknown Object (File)
Sat, Apr 29, 5:53 AM
Unknown Object (File)
Apr 24 2023, 8:52 PM
Unknown Object (File)
Apr 10 2023, 9:16 AM



This attempts a complete revamp of the memoization tutorial. Since a lot of the boilerplate around using memoization is gone now, the old format doesn't make as much sense.

I'm thinking it actually makes more sense to have a memoization overview page than a tutorial at this point. I tried to go in that direction by using more self-contained code examples, and revamping into smaller, more targeted sections that people can search for specific information in.

In addition to general cleanup, I think the biggest question for me is how to show people memoization "working". In some sense, it might not even be necessary to do this, but it would probably be useful for initial sanity checking purposes.

Another note: in this current iteration, I have removed mention of the list_versions CLI. Given that we have relatively low CLI traffic, it seemed mis-targeted. I'm open to other opinions around that though cc @sandyryza .

Test Plan

will add unit tests for code examples.

Diff Detail

R1 dagster
Lint Passed
No Test Coverage

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 10 2021, 10:28 PM
Harbormaster failed remote builds in B34640: Diff 42854!
cdecarolis retitled this revision from [memoization 12/12] revamp memoization tutorial to [memoization 13/13] revamp memoization tutorial.Aug 11 2021, 12:20 AM
cdecarolis retitled this revision from [memoization 13/13] revamp memoization tutorial to [memoization 13/13] revamp memoization docs.Aug 11 2021, 2:04 AM
cdecarolis edited the summary of this revision. (Show Details)
cdecarolis edited the test plan for this revision. (Show Details)

cc @alangenfeld , I know we talked about this not being the best approach. I kept it here as a placeholder for now.

I do think our forward example should track code. It might be best to just caveat that it's not gonna catch everything, and suggest an approach that can be a bit more thorough?


Does anyone think it's necessary to show "memoization in action"? I feel like the previous iteration focused too much on that, making it feel more like a demo than a guide, but maybe I've gone too far in the other direction here. One option would be showing logs in dagit before and after memoization (before we see step events, after we see no step events, after version change we see some step events).


To compliment this section, I want to add a "how memoization works" page, that explains in more literal detail how each version is constructed. Where would be the best place for that? cc @yuhan


This sentence is very confusing, and I'm thinking can benefit from a demonstrable example.


I need to actually link to the implementation here.


Need to add a code example for this.


Need to add a code example for this as well.


ya the comment just needs to be expanded to clearly communicate what it captures and what its limitations are. I think i am fine with whatever as long as its very clearly communicated

@cdecarolis would you mind submitting this as a github PR? Submitting docs changes as a github PR lets us preview the build docs using Vercel.


"example" -> "guide"


Suggestion for replacing the second sentence in a way that allows avoiding the verb "to be" and ties directly to the first sentence:

Versions allow users to express to Dagster whether outputs have changed and thus require re-computation.


I find this name a little strange - it's not directly tracking changes, it's just tracking the op's source code and leaving it up to Dagster to determine whether the version has changed. Might be clearer as OpSourceHashStrategy or something.


I don't think so.


Nice on including imports with each code snippet.

cdecarolis marked 5 inline comments as done and 2 inline comments as done.

Address comments. Add tests for guide. Add tags to execute_in_process. Add resource_key to version strategy s.t. you can differentiate between resources if you choose.

cc @sandyryza since this stack is making heavy use of the stacked diffs workflow atm, I think switching to a PR might be a bit rough. i'll definitely put up screenshots though.