Page MenuHomeElementl

[dagit] Repair layout worker, large DAG loading
ClosedPublic

Authored by dish on Jul 28 2021, 5:24 PM.
Tags
None
Referenced Files
Unknown Object (File)
May 1 2023, 11:00 AM
Unknown Object (File)
May 1 2023, 10:23 AM
Unknown Object (File)
Apr 30 2023, 10:57 PM
Unknown Object (File)
Apr 30 2023, 11:42 AM
Unknown Object (File)
Apr 30 2023, 1:15 AM
Unknown Object (File)
Apr 29 2023, 8:11 PM
Unknown Object (File)
Apr 13 2023, 5:55 PM
Unknown Object (File)
Apr 10 2023, 11:23 PM
Subscribers

Details

Summary

Repair a handful of issues related to loading/rendering large DAGs in Dagit.

  • The async layout WebWorker appears to be broken in development. This looks like it is related to a dependency issue in the worker file, which is ironed out here (titleOfIO). See comment below.
  • Stop using fetch-and-network when querying for solids. These should be static anyway. There is an issue where the page gets into a query loop and never stops trying to update especially large DAGs. I'm still not quite sure what's triggering the loop, but we don't need to do more fetches after the first query anyway. (cc @alangenfeld re: our IRL discussion)
  • Refactor PipelineExplorer to make use of React.useMemo and thus avoid some unwanted extra rendering.
  • Some other implementation cleanup/changes that I made while trying to track down bugs.
  • Show a message when attempting to render a very large number of solids, based on whether we have to use the WebWorker to do so.
Test Plan

Run dagit with -f python_modules/dagit/dagit_tests/stress/megadags.py to generate some pipelines with very large DAGs.

  • View both jobs in Dagit. Verify that there are no longer any query loops when loading the page.
  • Perform some solid subselections. Verify that the "many solids" message appears at appropriate times, and that the WebWorker completes its work and renders the DAG properly.

Diff Detail

Repository
R1 dagster
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Don't export _asyncDagrePipelineLayout

Is there a way to test that the layout worker doesn't have imports (direct or transitive) that will break it?

js_modules/dagit/packages/core/src/graph/layout.ts
3

add a lengthy comment here about the dangers of arbitrary imports?

Is there a way to test that the layout worker doesn't have imports (direct or transitive) that will break it?

Investigating this further, I think I was incorrect about it being a circular dependency issue. I believe that the issue is specific to development and the React import, specifically an with Webpack devserver and react-refresh.

Uncaught ReferenceError: $RefreshSig$ is not defined

When I'm able to eliminate the React import, it seems to be fine. The change in this diff effectively does this, so that's why it works.

The prod build seems to work as-is. 😅

Yeah, I think the important piece is not being reliant on webpack, so that we can actually do the imports asynchronously from the webworker.

Is there a way to run a test to make sure that the worker compiles? Seems like an easy thing to break again with a random refactor.

Is there a way to run a test to make sure that the worker compiles? Seems like an easy thing to break again with a random refactor.

I'm not sure there's a good way to do this in an automated way, given that it seems like the issue was related to the behavior of webpack-dev-server and the react-refresh plugin, which is specific to the dev workflow in the browser.

just to confirm then... was this not broken on prod?

I do think we should add a lengthy comment at the top of layout.ts regardless.

This revision is now accepted and ready to land.Jul 29 2021, 4:51 PM

just to confirm then... was this not broken on prod?

I'm not sure how to verify this without a large DAG to test it on, but I think that it most likely is not broken in prod.

I do think we should add a lengthy comment at the top of layout.ts regardless.

Will do.

Comment regarding React dependency