Page MenuHomePhabricator

Incrementally fetch data for partition set UI
ClosedPublic

Authored by bengotow on Tue, Oct 6, 11:36 PM.

Details

Summary

Earlier today we talked with a customer who was running 200+ pipeline runs per partition because the pipeline was re-run every 5 minutes during the day, regardless of success or failure.

While this is super unusual and our UI is designed to help you achieve one successful run per day, the UI was completely unusable for him because the query for the last 30 partitions required data from 6800 runs and took 8 MINUTES.

This is a first pass at improving the usability of this page:

  • Fetches two partitions at a time, incrementally loading the charts and the matrix view
  • Removes auto-refresh every 15 seconds
  • Replaces some custom code in the PartitionView with pagination patterns used for Runs, etc.

I think there's a potential for this approach to backfire if the UI gets slammed with updates faster than it can re-render, but even the longitudinal pipeline streams in at a comfortable pace that the browser can handle. For very large pipelines we may see "stickiness" if React renders take 100+ms, but I propose we optimize the UI more to handle that since search, etc interactions require re-renders as well.

Test Plan

Updated snapshots

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

This is amazing! I actually really enjoyed reading your solution here – I learned a ton :)

I added a few nits and questions below, but this generally looks great to me. I'll also let @dish take a look and do a final approval.

js_modules/dagit/src/partitions/PartitionView.tsx
34–36

This conditional would never be true, right? Because partitions is always at least an empty array, which is a truthy value in JS.

34–38

Just for my understanding: It looks like onLoaded calls setShowLoader(true) which shows the loading spinner. What sets setShowLoader back to false once all the partitions are loaded?

65–68

This is a pretty rare edge case, but as a follow on from the question above: if our pipeline actually has an empty partition set do we just display an infinite spinner?

js_modules/dagit/src/partitions/useChunkedPartitionsQuery.tsx
50

I'm not sure if this actually breaks rendering so it might not be a big deal, but one of our pageSize options is 7 and we have a fixed limit of 2, so I think results will end up having one extra element.

I believe the fix is just making this:

limit: Math.min(2, pageSize - accumulated.length)
53–55

Does this mean one of pageSize, cursor, client, partitionSetName, repositoryName, repositoryLocationName has changed by the time this query has finished executing, so we just want to discard the result and stop paginating?

56

[1] Can remove the ! here if you return [] instead of undefined

101

Nice :)

109–114

Nit: I'm not sure if you would actually ever run into either of these two cases since these conditionals seem to be here for refining the type of result, but I think you want to return an empty array in both of these cases, or else you get a runtime error at [1] above.

Fetches two partitions at a time, incrementally loading the charts and the matrix view

I wonder if even this is too much for the amount of data Auster has. It would be worth trying this on with partitions that have at least 200 runs per partition. I can try to test that if you don't already have the data locally.

sashank requested changes to this revision.Mon, Oct 12, 9:36 PM

Back to your queue

This revision now requires changes to proceed.Mon, Oct 12, 9:36 PM

Hey @sashank thanks for the careful review, I appreciate it! Going to address your comments and handle a couple of those edge cases better.

js_modules/dagit/src/partitions/PartitionView.tsx
34–36

Ahh yep it looks like this can be removed!

34–38

Hey hey! Yep this was a bit strange in the old code, we had a stray call to onLoaded (on line 77) and I believe it just removes the spinner, which is visible by default when the page loads. It may be that this isn't as necessary anymore now that the page starts to load faster.

65–68

Yep it looks like that's true... will see if I can fix this!

js_modules/dagit/src/partitions/useChunkedPartitionsQuery.tsx
50

Ahh you're right, will fix this!

53–55

Yep! This is a bit of an awkward pattern but we keep the "live" version number in a ref and then capture the value at the start of loading in a locally scoped variable. If the data we fetched is no longer for the current version, we know that the useEffect has triggered again and this chain of fetch calls shouldn't be running anymore.

56

Good call will fix!

109–114

Yeah, I think this is mostly to make TypeScript happy. I've gone ahead and changed it to [] and that's definitely cleaner

  • Remove old loader, address diff feedback, add new loading state to chunked query

Some minor comments -- the only one that strikes me as a maybe-bug is the pop followed by slice.

js_modules/dagit/src/CursorControls.tsx
52

More and more, I'm wanting to make some core components that allow us to bake in basic flex/alignment/margin/padding properties so that we don't need to do style or one-off styled components. Now that we have Storybook up and running, I might dig into that a bit.

60

"Next" being "Older" is kind of confusing here. (It could also be confusing if "Next" meant "Newer"!) How would you feel about matching the prop names to the behavior?

interface CursorPaginationProps {
  hasOlder: boolean;
  hasNewer: boolean;
  onOlderPage: () => void;
  onNewerPage: () => void;
};
js_modules/dagit/src/partitions/PartitionPageControls.tsx
35

Could this be replaced by having <ButtonGroup> and children inside of a container div, with PartitionPagerContainer then having justify-content: space-between?

js_modules/dagit/src/partitions/PipelinePartitionsRoot.tsx
24

Should be able to do this:

const {pipelineName} = explorerPathFromString(match.params.pipelinePath);
js_modules/dagit/src/partitions/useChunkedPartitionsQuery.tsx
25–29

Given the number of state values tracked here and how often state values are being updated at the same time, this looks like a nice opportunity for useReducer.

29

I'm not a huge fan of using undefined instead of null in situations like this, but that's just me being pedantic. :)

56

It looks like in this case, the result will have been fetched but is then discarded. Should the version check come first instead?

80–81

Since pop is mutative, following that with the slice looks like you'll end up removing two values in total from cursorStack. Is that intentional?

111

return Array(count).fill(null).map((_, ii) => ({
  __typename: 'Partition',
  // etc.
}));
This revision is now accepted and ready to land.Wed, Oct 14, 9:51 PM

Thanks for the feedback! Going to apply everything and then get this merged so I can stack the next round of changes on it.

js_modules/dagit/src/CursorControls.tsx
60

Ahh interesting... I was initially trying to re-use this CursorPaginationProps interface from the generic "usePaginatedQuery" hook, but I agree the sort order of this data in particular makes this super confusing.

Since the component that consumes the pagination props is also custom, I feel like we could just change the props, but I was hoping the Newer/Older and Prev/Next controls could be used interchangeably.

I might try changing these props to reflect how the underlying cursor is being moved, since calling it hasOlder would just move the confusion into the data provider which still has to go to the "next" cursor. Maybe:

{
   "hasPrevCursor":
   "hasNextCursor"
    "onAdvanceCursor"
    "onPopCursor
}
js_modules/dagit/src/partitions/PartitionPageControls.tsx
35

Yep that'll work too, I can swap it out.

I think I've been doing this a lot this year so that the behavior isn't dependent on the number of children but I don't feel strongly. I sort of prefer one flat layer with explicit spacers to nested flexboxes once the content gets more complicated, but that's definitely not the case here. (Though I will still make the "left items" div another flexbox because align-items:center seems like the best way to make items of varying height align nicely)

js_modules/dagit/src/partitions/PipelinePartitionsRoot.tsx
24

Strange it looks like this is still the case in a bunch of places, will swap all these out in a separate diff!

js_modules/dagit/src/partitions/useChunkedPartitionsQuery.tsx
25–29

Ahh you're definitely right, 4 pieces of state is probably a threshold for useReducer. I honestly don't use it much with typescript because defining intermediate types for the actions is a bit annoying, but let me see if I can clean this up! Would probably also be a good place to just throw them in a single state object and update individual members via an object spread.

29

Haha no I think that's correct 👍

56

Yep that's correct - the version check can't come before the fetch because the version can change during the async call, and we need to know that the data we requested is no longer the data we want.

80–81

Ahh good catch 👍

This revision is now accepted and ready to land.Sun, Oct 18, 8:07 PM