Page MenuHomePhabricator

[RFC] DagsterInstance.optimize_for_dagit
ClosedPublic

Authored by alangenfeld on Thu, Oct 8, 10:21 PM.

Details

Summary

This diff introduces a call on the DagsterInstance to allow it to optimize for being the central copy used in the long running dagit process. This allows us to control two behaviors

  • connection re-use / pooling
  • statement timeouts

that we only really *know* what we want if its the copy used by dagit. We don't want long running connections on random instances of DagsterInstance, and we don't want an aggressive statement_timeout breaking things like instance migrations when used in the CLI.

Since its difficult to thread this option in at construction time due to the existing DagsterInstance / configurable class / instance ref machinery, adding this opt-in function was my best idea. Open to other suggestions.

Test Plan

added tests
loaded dagit runs page with an rds database in us-west from my laptop here in MN, time drops from 24 seconds to 4 seconds since it can re-use connections.

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

the blame rev for null pool was https://dagster.phacility.com/D1694 so dont want to regress in that direction either - but I believe there has to be some way to keep 1 connection open

we might need a concept of long lived vs short lived on DagsterInstance to thread down to these places - this behavior is only really key in dagit's long lived copy

alangenfeld retitled this revision from [dagster-postgres] use default connection pooling to RFC [dagster-postgres] connection pooling.Thu, Oct 8, 11:09 PM
alangenfeld edited the summary of this revision. (Show Details)

This seems straightforward enough. Any idea why we went with NullPool in the first place?

This revision is now accepted and ready to land.Fri, Oct 9, 12:33 AM

Any idea why we went with NullPool in the first place?

ya see the comment above about D1694, I don't think this exact form of the diff is what we want to land

I think we need a test that spins up like, several dozen instances pointed at a single psql

alangenfeld edited the summary of this revision. (Show Details)

big change - will redo summary

alangenfeld retitled this revision from RFC [dagster-postgres] connection pooling to [RFC] DagsterInstance.optimize_for_dagit.Mon, Oct 12, 10:07 PM
alangenfeld edited the summary of this revision. (Show Details)
alangenfeld edited the test plan for this revision. (Show Details)
alangenfeld added reviewers: schrockn, dgibson.
alangenfeld edited the summary of this revision. (Show Details)
alangenfeld edited the test plan for this revision. (Show Details)

up

test assets page - add PythonError

also make psycopg2 gevent friendly

This revision is now accepted and ready to land.Thu, Oct 15, 7:59 PM
schrockn added inline comments.
python_modules/dagit/dagit/cli.py
121–125

why is this not just an argument to the ctor? Strongly prefer that to a mutating method

121–125

whoops accidental comment. have discussed with @alangenfeld in DMs